Re: [swish-e] Who uses Swish-e & a question

From: Peter Karman
Date: Thu, 06 Oct 2011 08:58:32 -0500
François Tissandier wrote on 10/06/2011 06:58 AM:
> 
>> Get them from your web server's log files. Or (better) add some code to
>> your search script to log the search terms separately. We do this to let
>> us add a search cloud pop-up of the top ten most recently-searched terms.
>>
> 
> Mmm good idea, but that's not what I want to do. I want to propose the
> most popular keywords from my content, not from the searches ! And those
> keywords are in the index, so I thought there is maybe a way to extract
> the most popular ones. By "popular" I mean "keywords appearing the most
> often". Sorry if my frenglish is not clear !

You probably want:

http://svn.swish-e.org/libswish3/trunk/perl/countwords.pl

per this email thread:

http://swish-e.org/archive/2005-02/9033.html

Note though, that you should really not equate "popularity" with
"frequency" -- especially if you are not using StopWords -- because e.g.
the frequency of the word 'the' will skew your definition of "popular".

I don't use a StopWords list because my use cases demand precision. If
it were me, I would research the actual frequency of words in my
collection using the countwords.pl script, and then identify a "sweet
spot" range of frequency that ignores what would otherwise be StopWords.
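
A minimal sketch of the "sweet spot" idea, written in Python rather than the
Perl of countwords.pl. It assumes you can get at the raw text of your
documents (or any word/count listing) outside the index; count_words and
sweet_spot are hypothetical helper names for illustration, not part of
Swish-e, and the thresholds are placeholders you would tune after looking
at your own counts.

```python
import re
from collections import Counter


def count_words(texts):
    """Count total occurrences of each lowercased word across all texts."""
    counts = Counter()
    for text in texts:
        # Crude tokenizer (assumption): runs of ASCII letters and digits.
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


def sweet_spot(counts, min_count, max_count, limit=10):
    """Return up to `limit` (word, count) pairs whose count lies in
    [min_count, max_count], most frequent first, ties broken alphabetically.

    Words above max_count behave like StopWords ('the', 'and', ...);
    words below min_count are too rare to be useful suggestions.
    """
    in_range = [(w, n) for w, n in counts.items() if min_count <= n <= max_count]
    in_range.sort(key=lambda pair: (-pair[1], pair[0]))
    return in_range[:limit]


if __name__ == "__main__":
    docs = [
        "the cat and the dog",
        "the dog chased the cat",
        "the bird saw the dog",
    ]
    counts = count_words(docs)
    # 'the' occurs 6 times and falls above the range; single-occurrence
    # words fall below it.
    print(sweet_spot(counts, min_count=2, max_count=4))
```

On a real collection you would look at the sorted counts first, see where
the StopWord-like terms taper off, and pick the range from that.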

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Thu Oct 06 2011 - 13:58:35 GMT