Re: NoContents of files with no extension

From: Bill Moseley <moseley(at)>
Date: Wed Dec 19 2001 - 17:23:39 GMT
At 05:15 PM 12/19/01 -0000, Jonathan Feldman wrote:
>I should say that (somewhat perversely??) I am combining 
>SWISH++ which seems to extract word docs excel and ppt fine 
>but not index filenames, with swish-e which does the latter but not 
>the former. I am happy to be corrected on all points here.

Hum, David, wasn't there some issue with catdoc on Windows?  Did you fix
something there?

>So as my users (apologies for the possessiveness) will search for 
>file names as well as for content of files, the front end will offer 
>them this choice.

Ah, you want to *search* for filenames.

(all in 2.1-dev, of course)

Just add:

  MetaNames swishdocpath

Then you can limit searches to words in the document path:

  ./swish-e -w 'swishdocpath=(gif)'

That will find any docs with "gif" as a word somewhere in the path (assuming
you don't index the period as part of a word).

There's also ExtractPath where you can use a regular expression to extract
parts of a path.  For example (not tested):

   ExtractPath the_extension regex !^.+\.([^/]+)$!$1!


   ./swish-e -w 'foo the_extension=(doc)'

That limits searches for the word "foo" to files that end in .doc.
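
Since the ExtractPath pattern is marked as not tested, here is a rough way to
check what the regular expression itself captures, using Python's re module.
This only exercises the pattern; it says nothing about how swish-e treats a
path the pattern fails to match. The helper name extract_extension and the
sample paths are made up for illustration.

```python
import re

# Same pattern as in the ExtractPath line above. The swish-e config
# refers to the captured group as $1; here it is read with group(1).
EXTENSION_PATTERN = re.compile(r"^.+\.([^/]+)$")


def extract_extension(path):
    """Return what the pattern captures, or None if it does not match."""
    match = EXTENSION_PATTERN.match(path)
    return match.group(1) if match else None


# Hypothetical sample paths, including files with no extension.
for path in (
    "/docs/report.doc",
    "/docs/archive.tar.gz",
    "/docs/README",
    "/site.old/README",
):
    print(path, "->", extract_extension(path))
```

The pattern captures only the text after the last period in the final path
part, so a period in a directory name does not count, and a file with no
extension gives no match at all.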

Bill Moseley
