Re: problems with highlighting and output

From: <psydok(at)not-real.sulb.uni-saarland.de>
Date: Fri Aug 26 2005 - 06:59:46 GMT
Hi Bill, hi Peter,

Thanks for your support.

@Peter: I tried all of the highlighting modules; none of them worked.

> > If the terms are found within the first lines and they are displayed, they
> > are not highlighted. I also increased the StoreDescription
> > parameter to 999999999, but it did not fix the problem.
>
>Are most of your files larger than that?

Some of them are larger, some are not.

>You can assume that others are using the same code and would have
>reported this as a problem by now.

That's absolutely true. So here is some code.

First, the configuration for indexing:
### start indexing config
IndexOnly .htm .html .pdf .ps .txt .xml .ppt .pps .doc .rtf .gif .jpg .jpeg .xbm .au .mov .mpg .avi

IndexFile /u/swish-e/index.swish-e
IndexDir /u/test/htdocs/test/
IndexContents HTML .htm .html .pdf .ppt .txt .ps .xml .pps .doc .rtf
NoContents .gif .jpg .jpeg .xbm .au .mov .mpg .avi
FollowSymLinks yes
obeyRobotsNoIndex no
ConvertHTMLEntities YES
MetaNames swishtitle swishdescription swishdocpath DC.Title DC.title.translated Title Titel Author DC.Creator DC.Creator.PersonalName DC.Creator.CorporateName DC.Description description contributor DC.Contributor.CorporateName DC.Contributor.PersonalName DC.Subject Subject Keywords DC.Language publisher dc.publisher dc.publisher.corporatename dc.publisher.personalname
MetaNamesRank 10 body
MetaNamesRank 9 swishdescription
MetaNamesRank 8 Title
MetaNamesRank 7 swishtitle
MetaNamesRank 6 DC.Title
MetaNamesRank 5 author
MetaNamesRank 4 subject

PropertyNames DC.title swishdescription author contributor subject DC.Language publisher
PropertyNameAlias DC.Title title Titel
PropertyNameAlias swishtitle Dc.Title.translated
PropertyNameAlias author DC.Creator DC.Creator.PersonalName DC.Creator.CorporateName
PropertyNameAlias swishdescription body DC.Description description
PropertyNameAlias contributor DC.Contributor.CorporateName DC.Contributor.PersonalName
PropertyNameAlias subject DC.Subject Keywords
PropertyNameAlias publisher dc.publisher dc.publisher.corporatename dc.publisher.personalname

UndefinedMetaTags INDEX
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.-
IgnoreFirstChar .-
IgnoreLastChar  .-
BeginCharacters abcdefghijklmnopqrstuvwxyz0123456789
EndCharacters   abcdefghijklmnopqrstuvwxyz0123456789
IndexReport 3
IgnoreWords File: /home/swish-e/stopwords.txt
IgnoreLimit 90 4
IgnoreNumberChars 0123456789.,;/&%§$
IndexComments no
TranslateCharacters :ascii7:
BumpPositionCounterCharacters |.
FileRules dirname contains incoming original
FileRules filename contains robots. incoming new.txt
FileFilter .pdf /usr/local/share/doc/swish-e/examples/filter-bin/_pdf2html.pl
FileFilter .doc /usr/local/bin/catdoc
FileFilter .rtf /usr/local/bin/catdoc
FileFilter .ppt /usr/local/bin/catppt
FileFilter .pps /usr/local/bin/catppt

StoreDescription HTML <body> 999999999

### end indexing config

And here comes the configuration for the swish.cgi:
### start swish.cgi config
return {
    swish_binary         => '/usr/local/bin/swish-e',
    swish_index          => '/u/swish-e/index.swish-e',
    title                => 'Volltextsuche in den Dokumenten',
    title_property       => 'Dc.Title',
    description_property => 'swishdescription',

    display_props => [qw/
        Author DC.Title swishtitle DC.Language Subject
        swishlastmodified swishdocsize swishdocpath
    /],
    name_labels => {
        swishdefault      => 'Alle Elemente durchsuchen',
        Author            => 'Autor, beteiligte Person/Einrichtung',
        'DC.Title'        => 'Titel',
        swishtitle        => 'Alternativer Titel',
        swishrank         => 'Rank',
        'DC.Language'     => 'Sprache',
        swishlastmodified => 'Datum der letzten Änderung',
        swishdocsize      => 'Größe des Dokuments',
        swishdocpath      => 'URL',
    },
    date_ranges      => 0,
    page_size        => 10,
    sorts            => [qw/swishrank Title swishlastmodified/],
    secondary_sort   => [qw/swishlastmodified desc/],
    template         => {
        package => 'SWISH::TemplateDefault',
    },
    timeout          => 10,
    max_query_length => 400,
    max_chars        => 500,

    highlight => {
        package       => 'SWISH::DefaultHighlight',
        show_words    => 10,
        max_words     => 100,
        occurrences   => 6,
        highlight_on  => '<b>',
        highlight_off => '</b>',
    },
};
### end swish.cgi config
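
To make clear what I expect from the highlight_on / highlight_off settings, here is a rough sketch of the behaviour. This is only an illustration written by hand, not the code of SWISH::DefaultHighlight; the function name highlight_terms is made up, and it ignores show_words, max_words and occurrences.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only (NOT part of swish.cgi):
# wraps each whole-word, case-insensitive match of a query term in the
# highlight_on / highlight_off strings.
sub highlight_terms {
    my ( $text, $terms, $on, $off ) = @_;

    # Build one alternation from the terms, escaping regex metacharacters.
    my $alternation = join '|', map { quotemeta($_) } @$terms;
    return $text unless length $alternation;

    # \b restricts matches to word boundaries, so "test" does not
    # match inside "testing".
    $text =~ s/\b($alternation)\b/$on$1$off/gi;
    return $text;
}

# prints: Volltextsuche in den <b>Dokumenten</b>
print highlight_terms(
    'Volltextsuche in den Dokumenten',
    ['dokumenten'],
    '<b>', '</b>',
), "\n";
```

In my results the search terms do show up in the description text, but without the <b> tags around them.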

Thanks again for your support.

Herb
Received on Thu Aug 25 23:59:50 2005