On Sun, Jun 06, 2004 at 11:56:53AM -0400, Kaplan, Andrew H. wrote:
> Hi there --
> I'm sorry for sounding stupid, but could you elaborate on making sure
> that "Head" is in the index? Also, aside from the cgi script, what is
> the command syntax I would use to search the index? Thanks.
So, the situation is you index some files and then you search for "head"
and it says "no results" but you are sure it should be found because you
know it's in the file "body_parts.html".
So then you run swish like this:
swish-e -c myconfig -i body_parts.html -T indexed_words | grep head
and you see something like:
Adding:[1:swishdefault(1)] 'head' Pos:24 Stuct:0x9 ( BODY FILE )
which says the word "head" was indexed in file number 1 under metaname
"swishdefault" at word position number 24 and is in the BODY of the
document.
Then you know you can do:
swish-e -w head
swish-e -w 'swishdefault=(head)'
and swish-e will find it.
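One catch with the plain "grep head" above: it also matches words like
"header" or "forehead". Here is a rough sketch of a stricter filter. The
function name indexed_word is made up for illustration, and it assumes
the trace output quotes each word in single quotes as in the sample line
above.

```shell
# Hypothetical helper: reads `-T indexed_words` output on stdin and keeps
# only the lines for one exact word. Assumes each word is printed inside
# single quotes, as in the sample "Adding:" line.
indexed_word() {
    # -F treats the pattern as a fixed string; the quotes are part of it,
    # so 'head' does not match 'header' or 'forehead'.
    grep -F -- "'$1'"
}

# Intended use (needs swish-e and your own config):
#   swish-e -c myconfig -i body_parts.html -T indexed_words | indexed_word head
```

If that prints nothing, the exact word really is not being indexed.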
Now, if you don't see "head" in the output, you then look at why it's not
getting indexed. What I'd likely do is run without grep:
swish-e -c myconfig -i body_parts.html -T indexed_words | less
and then look for words that you know are around "head" in the document
and that might give you an idea what to look for.
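If paging through the whole trace is too much, something like the
following might help narrow it down. It is only a sketch: words_near is a
made-up name, and it assumes every trace line looks like the sample
"Adding:" line above, with the quoted word right before a "Pos:N" field.
Find the position of a neighbouring word first, then list what got
indexed around it.

```shell
# Hypothetical helper: reads `-T indexed_words` output on stdin and prints
# "position word" for every word within a window around a position.
#   $1 = centre position, $2 = window size (positions on either side)
words_near() {
    awk -v c="$1" -v w="$2" '
        /^Adding:/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^Pos:[0-9]+$/) {
                    p = substr($i, 5) + 0      # number after "Pos:"
                    if (p >= c - w && p <= c + w)
                        print p, $(i - 1)      # field before Pos: is the word
                }
            }
        }'
}

# Intended use (needs swish-e and your own config), e.g. if a neighbouring
# word turned up at position 23:
#   swish-e -c myconfig -i body_parts.html -T indexed_words | words_near 23 3
```

Whatever shows up where you expected "head" is usually a good hint.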
Maybe you have a format error in body_parts.html? Raising the parser
warning level in your swish config might generate some warnings about
the structure of your document.
Maybe "head" is in an HTML comment? Then you need to enable indexing of
comments.
Maybe the above all works fine, but when spidering, the file is skipped?
If that's the case then you need to figure out why. spider.pl has
debugging features to tell you why a file is skipped.
The answer is divide et impera.
Received on Sun Jun 6 09:11:53 2004