Re: not ignoring content (leave those files alone!)

From: Linda W. (that's swishey, not squishey!) <swishey(at)not-real.tlinx.org>
Date: Sun Jun 11 2006 - 05:08:10 GMT
> Yep, NoContents should index just the file path.  Sounds like that is
> working since you're seeing "5 words" for those files.
----
	NoContents still indexes part of the _content_ if the file has the default
filetype (HTML*) -- it scans the file for certain HTML fields.

> As for why one file appeared to take 5 minutes to index perhaps the file
> immediately following or preceding it took 5 minutes to index?  Might be
> worth trying to index that one file by itself to verify.
---
	I used Process Explorer to look at the process -- it still had
the "rogue" .mbf file open.  The files before and after it were
fairly short.

The problem was/is that for every file extension one includes in "NoContents",
one must also change the filetype from the default HTML*.  Changing it to
TXT should be sufficient to stop the scanning, since HTML* is currently the only
type that is scanned for indexing keywords.
I'd feel safer being able to assign some "Opaque" filetype to files, rather than
calling a file "TXT" when it really isn't (future wishlist item?)...  How
about a "ReallyNoContent" switch that doesn't look for HTML tags for headers
within a binary? :-)
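
For anyone hitting the same thing, the workaround would look roughly like
this in the config file.  This is a sketch only: .mbf is just the extension
from my case, and the directive syntax is as I understand it from the SWISH-E
configuration docs, so check it against your version.

```
# Index only the path for these files, not their contents.
NoContents .mbf

# Also map the same extension away from the default HTML* type so the
# file is not scanned for HTML fields.  TXT is a stand-in here; the
# files are not really text.
IndexContents TXT .mbf
```

Every extension added to NoContents would need a matching IndexContents line
(or be added to the same one) for this to hold.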

-l
Received on Sat Jun 10 22:08:15 2006