Re: Q - mysql and mailing-lists

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Jun 30 2003 - 04:07:31 GMT
On Thu, Jun 26, 2003 at 06:43:40PM -0700, Tim Freedom wrote:
> Hello all, have a couple of newbie questions (we just migrated from
> htdig and are trying to bring up swish-e),
> 
>  1. We'd like to have two index/searchable options,
> 
>     a. One of the website (most of which is in mysql which we access
>        via php) - not sure how that should be done and there are sections
>        within the mysql that are private as well (account info, passwords,
>        etc which need to be excluded)

You can either spider those pages or write a program to extract the 
data from the MySQL tables.  There's a very basic example in the 
distribution that might give you some ideas: prog-bin/MySQL.pl.
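
As a rough sketch of the second option (this is not the MySQL.pl 
example itself), a program run by swish-e's "prog" input method writes 
each document to stdout as a few headers, a blank line, and then the 
content.  The column names, the PRIVATE_COLUMNS set and the paths used 
here are hypothetical, and the rows are assumed to have been fetched 
from MySQL already with whatever client library you use.

```python
# Hypothetical names of columns that must never reach the index;
# replace these with the private columns in your own schema.
PRIVATE_COLUMNS = {"password", "account_info"}


def format_record(path, row, mtime=None):
    """Format one database row as a document for swish-e's prog input.

    path  -- the name swish-e stores for the document (often a URL)
    row   -- dict mapping column name to value
    mtime -- optional last-modified time in seconds since the epoch
    """
    # Drop private columns before anything is written out.
    public = {k: v for k, v in row.items() if k not in PRIVATE_COLUMNS}

    # One "column: value" line per remaining column, in a stable order.
    body = "\n".join(f"{k}: {v}" for k, v in sorted(public.items())) + "\n"
    data = body.encode("utf-8")

    # Content-Length counts bytes, so measure the encoded body.
    headers = [f"Path-Name: {path}", f"Content-Length: {len(data)}"]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")

    # A blank line separates the headers from the content.
    return "\n".join(headers).encode("utf-8") + b"\n\n" + data
```

A driver script would loop over the rows and write each result to 
sys.stdout.buffer; swish-e would then be run with "-S prog" and the 
script given as the "-i" argument.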

>     b. One of our mailing-lists - these are plain html files and should be
>        easy to deal with :-)

There's also another example in that same "prog-bin" directory called 
"index_hypermail.pl" which just indexes the pages created by hypermail.  


>  2. We'd like to search both together or separately, so I gather that's a
>     matter of coding up a CGI so that the appropriate "search-index" files
>     are included/excluded, right ?

Yes, you can pass multiple index file names to swish and swish will 
search all of them and combine the results.
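
For example, a search front end could build the command line from 
whichever indexes the user selected.  This is only a sketch: the index 
file names in the usage note are hypothetical, and it assumes the 
swish-e binary is on the PATH.  "-w" gives the query and "-f" lists 
one or more index files.

```python
import subprocess


def build_search_command(query, index_files, swish_binary="swish-e"):
    """Return the argument list for searching one or more index files."""
    if not index_files:
        raise ValueError("at least one index file is required")
    # All index file names follow a single -f switch.
    return [swish_binary, "-w", query, "-f", *index_files]


def run_search(query, index_files):
    """Run swish-e and return its raw output lines (not parsed here)."""
    result = subprocess.run(
        build_search_command(query, index_files),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling build_search_command("date range", ["website.index", 
"lists.index"]) searches both indexes; passing only one name limits 
the search to that index.  Passing the arguments as a list (no shell) 
keeps user input from being interpreted by a shell.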

> 
>  3. Is there anyway to get access to the actual source code that swish-e
>     site itself uses to search the mailing-list discussions.  We'd love
>     to have the same options esp. those listed on the following page,
> 
>       http://swish-e.org/Discussion/search/swish.cgi

That's just an older version of the swish.cgi script (and the 
index_hypermail.pl program) provided in the swish-e distribution.  So, 
yes, it's available.

>     Note: we love the idea of offering a date range as well and are not
>           quite sure what meta-tags are needed to pull that off.

Take a look at the index_hypermail.pl script -- it just pulls out a date 
from the file.  You could use that method or use the last modified date 
of the file itself.
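
A minimal sketch of both approaches, which is not what 
index_hypermail.pl actually does: it assumes each archived page 
contains a line of the form "Received on Mon Jun 30 04:07:34 2003", 
treats that time as UTC, and falls back to the file's modification 
time when no such line is found.  The pattern would need adjusting 
for mhonarc output.

```python
import calendar
import os
import re
import time

# Assumed footer format: "Received on Mon Jun 30 04:07:34 2003"
RECEIVED_RE = re.compile(
    r"Received on (\w{3} \w{3} +\d{1,2} \d\d:\d\d:\d\d \d{4})"
)


def extract_date(text):
    """Return the message date as seconds since the epoch, or None."""
    match = RECEIVED_RE.search(text)
    if match is None:
        return None
    try:
        parsed = time.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        # Looked like a date but did not parse (e.g. unknown month name).
        return None
    # Interpret the timestamp as UTC (an assumption, see above).
    return calendar.timegm(parsed)


def document_date(path):
    """Date taken from the page if present, else the file's mtime."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        found = extract_date(handle.read())
    return found if found is not None else int(os.path.getmtime(path))
```

The resulting number can be emitted as the document's Last-Mtime 
header or stored in a property that the search script uses to limit 
results to a date range.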

>     We run mailman/mhonarc for our mailing-lists and archiving (if that
>     matters).

Yes, in that you will have to look at what mhonarc generates and how to 
parse out the data you need.  Perhaps if you get something working you 
could contribute it back for inclusion in the swish-e distribution.

> 
> BTW: this is on debian linux (sarge - testing release), running swish-e-2.2.3

Swish-e now includes a debian directory for building a .deb from Swish-e
source if you need newer features or bug fixes that are not available
from debian.org's package repository.  That will keep you in the Debian
package system.  Or you can always build from source and install in
/usr/local.


-- 
Bill Moseley
moseley@hank.org
Received on Mon Jun 30 04:07:34 2003