Re: [swish-e] Searching for a filename, not contents of a file

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Fri Oct 09 2009 - 02:50:05 GMT
Chad Kellerman wrote on 10/8/09 3:06 PM:
> 
> 
> On Thu, Oct 8, 2009 at 3:59 PM, Peter Karman <peter@peknet.com> wrote:
> 
>     Chad Kellerman wrote on 10/8/09 2:57 PM:
>     >
>     >
>     > On Thu, Oct 8, 2009 at 3:53 PM, Peter Karman <peter@peknet.com> wrote:
>     >
>     >     Chad Kellerman wrote on 10/8/09 2:51 PM:
>     >     > Thanks for the link, but I added
>     >     >
>     >     > MetaNames swishdocpath
>     >     >
>     >     > to my config file, ran the reindex job, yet I still don't get the
>     >     > filename in the search?
>     >     >
>     >     > Is there something conflicting in my conf file?
>     >
>     >     I didn't notice anything obvious.
>     >
>     >     How did you search? You must specify the metaname:
>     >
>     >      swish-e -w foo or swishdocpath=foo
>     >
>     > [user@host11]# swish-e -w swishdocpath=sunckell.txt
>     > # SWISH format: 2.4.6
>     > # Search words: swishdocpath=sunckell.txt
>     > # Removed stopwords:
>     > err: no results
>     >
>     >
>     > When I indexed with that config options file here are the results:
> 
>     try indexing a single file, with the various -T options on, to see how
>     your doc is indexed.
> 
> I tried:
> swish-e -T INDEX_WORDS_FULL | grep sunckell
>  Meta:1 /safe/Production/sunckell.txt Freq:1 Pos/Struct:1/1
>  Meta:10 /safe/Production/sunckell.txt Freq:1 Pos/Struct:2/1
>  Meta:10 /safe/Production/sunckell.txt Freq:1 Pos/Struct:1/1
> sunckell
>  Meta:10 /safe/Production/sunckell.txt Freq:1 Pos/Struct:3/1
>  Meta:10 /safe/Production/sunckell.txt Freq:1 Pos/Struct:4/1
>  Meta:1 /safe/Production/sunckell.txt Freq:1 Pos/Struct:2/1
> 
> 
> But now I think it has something to do with the extension?
> 
> swish-e -w swishdocpath=sunckell
> # SWISH format: 2.4.6
> # Search words: swishdocpath=sunckell
> # Removed stopwords:
> # Number of hits: 1
> # Search time: 0.001 seconds
> # Run time: 0.020 seconds
> 1000 /safe/Production/sunckell.txt "sunckell.txt" 12
> 

The dot (.) is not in WordCharacters, so the query is parsed as two terms,
'sunckell' and 'txt'. The metaname prefix applies only to the first term, and
the query parser looks up every other term under swishdefault. To keep both
terms under swishdocpath, you need to force the query parser to treat them as
a phrase:

 swish-e -w 'swishdocpath="myfile.txt"'

Otherwise your query is being expanded to something like:

 swishdocpath=sunckell AND swishdefault=txt

which of course isn't what you want.

Having to quote the phrase like that is awkward, IMO.

You probably want instead something like this in your config:

MetaNameAlias swishdefault swishdocpath

which will index the words of the file name under the swishdefault MetaName,
so a plain search with no metaname prefix will match them.
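
To make the splitting described above concrete, here is a rough Python model
of it. This is only a sketch and not swish-e's actual parser: the function
name expand_query is made up for illustration, it assumes the word characters
are just ASCII letters and digits, and it handles a single metaname=value
clause only.

```python
import re

# Toy model only: this is NOT swish-e's parser. It assumes word characters
# are ASCII letters and digits, and that a term with no explicit metaname
# falls back to swishdefault.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def expand_query(query, default_meta="swishdefault"):
    """Return a list of (metaname, [terms]) clauses, implicitly ANDed."""
    meta, sep, rest = query.partition("=")
    if not sep:
        # No metaname given: the whole query is searched under the default.
        meta, rest = default_meta, query
    rest = rest.strip()
    quoted = len(rest) >= 2 and rest[0] == '"' and rest[-1] == '"'
    # Anything that is not a word character (such as the dot) splits terms.
    terms = [t.lower() for t in WORD_RE.findall(rest)]
    if not terms:
        return []
    if quoted:
        # A quoted phrase keeps all of its terms under the same metaname.
        return [(meta, terms)]
    # Unquoted: only the first term stays with the metaname; the remaining
    # terms fall back to the default metaname.
    return [(meta, terms[:1])] + [(default_meta, [t]) for t in terms[1:]]


print(expand_query('swishdocpath=sunckell.txt'))
print(expand_query('swishdocpath="sunckell.txt"'))
```

The first call models the failing search (one term under swishdocpath, one
under swishdefault); the second models the quoted phrase, where both terms
stay under swishdocpath.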

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Thu Oct 8 22:50:09 2009