Re: Help Needed

From: Peter Karman <karman(at)not-real.cray.com>
Date: Tue Mar 23 2004 - 14:46:18 GMT
If I'm reading this right, your Author tagset is inside a comment
(<!-- -->), so swish-e skips it. Check the IndexComments config option.


Second, I think the HTML parser will not recognize the Author tag the
way it appears in your doc. I believe (please, someone, correct me on
this) that the HTML parser recognizes only <meta> tags as valid for
metaname storage. So you could run some kind of filter prior to indexing
that adds the Author as a valid <meta> tag. That should cure both issues,
since the new <meta> tag would not be in a comment.
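
A minimal sketch of such a filter, in Python, might look like the following. It is only an illustration under stated assumptions: the function name add_author_meta is hypothetical, it assumes the author always appears as an <o:Author> element as in the quoted file, and it does not show how the filter would be hooked into swish-e.

```python
import html
import re

# Matches the Office document property seen in the quoted file,
# e.g. <o:Author>bhavin.modi</o:Author>. It does not match <o:LastAuthor>.
AUTHOR_RE = re.compile(r"<o:Author>(.*?)</o:Author>", re.IGNORECASE | re.DOTALL)
# Matches the opening <head> tag, with or without attributes.
HEAD_RE = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def add_author_meta(doc: str) -> str:
    """Return doc with a <meta name="Author"> tag added right after <head>.

    Hypothetical helper: the document is returned unchanged when it has
    no <o:Author> element or no <head> tag.
    """
    author = AUTHOR_RE.search(doc)
    head = HEAD_RE.search(doc)
    if author is None or head is None:
        return doc
    # Normalize any existing entities, then escape for use in an attribute.
    value = html.escape(html.unescape(author.group(1).strip()), quote=True)
    meta = f'\n<meta name="Author" content="{value}">'
    # Insert immediately after <head>, i.e. outside the conditional comment.
    return doc[:head.end()] + meta + doc[head.end():]
```

Calling add_author_meta() on the text of the quoted file should yield a document whose head starts with a <meta name="Author" content="bhavin.modi"> tag, which the HTML parser could then store under the Author metaname, assuming my reading of the parser above is correct.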

pek


Jignesh Jani supposedly wrote on 03/23/2004 07:19 AM:
> Hi,
>         Straight to the point,
> 
>         I am using swish-e to index documents which I have converted to
> HTML. I am using a configuration file which contains the directives
> below:
> 
> IndexOnly .htm .html
> MinWordLimit 3
> MaxWordLimit 50
> IndexComments yes
> IgnoreWords File: 'D:/english.txt'
> BeginCharacters abcdefghijklmnopqrstuvwxyz
> WordCharacters abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
> MetaNames Author
> 
> 
> One of the files contains data like this:
> 
> 
> <head>
> <meta name="Excel Workbook Frameset">
> <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
> <meta name=ProgId content=Excel.Sheet>
> <meta name=Generator content="Microsoft Excel 9">
> <link rel=File-List
> href="./CMS%20Feature&amp;Decision%20Matrix_files/filelist.xml">
> <link rel=Edit-Time-Data
> href="./CMS%20Feature&amp;Decision%20Matrix_files/editdata.mso">
> <link rel=OLE-Object-Data
> href="./CMS%20Feature&amp;Decision%20Matrix_files/oledata.mso">
> <!--[if gte mso 9]><xml>
>  <o:DocumentProperties>
>   <o:Author>bhavin.modi</o:Author>
>   <o:LastAuthor>ILink</o:LastAuthor>
>   <o:Created>2003-02-18T06:49:50Z</o:Created>
>   <o:LastSaved>2004-03-22T05:41:44Z</o:LastSaved>
>   <o:Company>I-link Infosoft</o:Company>
>   <o:Version>9.2720</o:Version>
>  </o:DocumentProperties>
>  <o:OfficeDocumentSettings>
>   <o:DownloadComponents/>
>   <o:LocationOfComponents
> HRef="file:\\Win-dbsrv\Softwares\MS%20Office\Office%202K\msowc.cab"/>
>  </o:OfficeDocumentSettings>
> </xml><![endif]--><![if !supportTabStrip]>
> <link id="shLink"
> href="./CMS%20Feature&amp;Decision%20Matrix_files/sheet001.htm">
> <link id="shLink"
> href="./CMS%20Feature&amp;Decision%20Matrix_files/sheet002.htm">
> 
> <link id="shLink">
> 
> <script language="JavaScript">
> 
> 
> I want the author to be indexed as a separate metaname, and then want to
> perform a search using author=searchword. The problem is that the author
> content is indexed under swishdefault.
> 
> 
> Can you please tell me whether this is possible? If so, what do you think I
> am doing wrong?
> 
> 
> Jignesh Jani
>  (India)
> 
> 

-- 
Peter Karman - Software Publications Programmer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Tue Mar 23 06:46:19 2004