Autoswish Indexing & Search scripts

From: Peter Lord <plord(at)not-real.chariot.net.au>
Date: Mon Aug 20 2001 - 10:51:34 GMT

Hello all,

I've got autoswish going in a very rudimentary manner, but I have some questions for the group about altering the format of search results and about problems with the indexing and searching controlled by autoswish.pl and Swish-Search.cgi. I would be grateful for your suggestions and comments.

Environment: Linux RedHat 6.1, Swish-e-2.0.5, AutoSwish 1.0 

1) Autoswish seems to have a curious idiosyncrasy: if I submit the complete.shtml indexing form (which comes with the Autoswish 1.0 distribution) and leave the META DATA= field (Field 3.1 of complete.shtml) empty, the site will not index. Autoswish creates the .conf file, and when the "INDEX" button is clicked, autoswish.pl reports that the site has been successfully indexed but in fact fails to create the .swish index file. Adding an entry to the META NAME= field (Field 3.2) is no help unless the META DATA= field also has an entry. It doesn't seem to matter whether the entry in Field 3.1 is an actual META NAME / META DATA entry that I have in my files or just a word of random letters; any old entry there is enough to enable indexing. What's going on here, and how can it be corrected in the scripts so that I only have to put the desired META NAME entries in Field 3.2?

2) How can Swish-Search.cgi (or autoswish.pl?) be modified so that I can search (and/or index?) ONLY for words contained under designated PropertyName(s) (i.e. META NAME="" CONTENT="" tags) of .html files? Currently, when I search for a word that is contained in the CONTENT field of specified META NAME entries, the search fails to find documents that contain the word at that location, unless those documents also contain the word in non-specified META NAME entries or in the body of the file.

I will illustrate this problem with an example. Suppose my .html documents have "META NAME=Administration CONTENT=paperwork" in their headers, and I enter the value "Administration" into the "META DATA=" field (Field 3.1 on the autoswish indexing form complete.shtml) or into the "META NAME=" field (Field 3.2). A subsequent search of the indexed site for the word "paperwork" (using Swish-Search.cgi) ignores files that contain this word within the CONTENT="" fields of the specified META NAME / META DATA tags. It does, however, find .html files that contain the word "paperwork" in the body of the document. More surprisingly, it also finds documents that contain "paperwork" within the CONTENT fields of META DATA or META NAME tags that were not nominated in the indexing form complete.shtml. Can anybody tell me what is going on here, please, and how I can modify the script(s) so that swish indexes only the content of the META fields that are specified?
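To make the desired behaviour concrete, here is a minimal Python sketch. It is not AutoSwish or Swish-e code and says nothing about how those tools work internally; it only shows, under the assumption that "indexing" means collecting words, which words should become searchable when only nominated META NAME tags are indexed. The names `MetaContentCollector` and `meta_words` are hypothetical.

```python
from html.parser import HTMLParser


class MetaContentCollector(HTMLParser):
    """Collect words from the CONTENT of META tags whose NAME is nominated."""

    def __init__(self, nominated_names):
        super().__init__()
        # META names are compared case-insensitively.
        self.nominated = {name.lower() for name in nominated_names}
        self.words = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.nominated:
            self.words.update(attr.get("content", "").lower().split())


def meta_words(html, nominated_names):
    """Return the set of words found only in the nominated META tags."""
    collector = MetaContentCollector(nominated_names)
    collector.feed(html)
    collector.close()
    return collector.words


doc = (
    "<html><head>"
    "<meta name=Administration content=paperwork>"
    '<meta name="Other" content="invoices">'
    "</head><body>paperwork elsewhere</body></html>"
)

# Only the nominated tag contributes; body text and other META tags do not.
print(meta_words(doc, ["Administration"]))
```

With "Administration" nominated, a search for "paperwork" should match this document, while a search for "invoices" or "elsewhere" should not.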

3) Does anybody know how Swish-Search.cgi (distributed with AutoSwish 1.0) can be modified to present the .html document's <TITLE> as the hyperlink, in place of the hyperlinked path and file name that it currently shows, and to omit the Score and Summary?

As a reminder for the group, the standard reporting format of autoswish.pl / Swish-Search.cgi (from Autoswish 1.0) is:

http://localhost.localdomain/demo/ype/my.html

untitled Score: 1000 [ 1701 bytes]

Summary: untitled documents, research, management....
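
The output being asked for in question 3 can be sketched as follows. This is an illustrative Python snippet, not code from Swish-Search.cgi; the function name `format_hit` is hypothetical, and it assumes the script already has the URL and title of each hit available as strings.

```python
from html import escape


def format_hit(url, title):
    """Return one result line: the title as a hyperlink, no score or summary."""
    # Fall back to the URL when a document has no usable title.
    label = title.strip() or url
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(label))


print(format_hit("http://localhost.localdomain/demo/ype/my.html", "untitled"))
# prints: <a href="http://localhost.localdomain/demo/ype/my.html">untitled</a>
```

Escaping the URL and title keeps characters such as & or < in a title from breaking the results page.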


Cheers,

Andrew L.
Received on Mon Aug 20 10:52:11 2001