Re: Q: Swish-E foreign language character support

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Feb 05 2001 - 23:36:28 GMT
At 02:34 PM 02/05/01 -0800, Kati Gäbler wrote:
>Thanks for the advice on the last security hole in that script. Fortunately I
>couldn't even make the -d option, or the modified script, work using the
>cut-n-paste method! I think I will advise my hosting provider to offer a
>safer script.

Can you try:

   ./swish-e -d :: -w foo



><meta name="department" content="first_floor">

You could index all and then search like this:

  swish-e -w $query department=(first_floor)

>Or it could just be a keyword or phrase contained somewhere in the files. 
>Would it be possible for Swish to index ONLY those FILES where the spider 
>finds the keyword? or the other way, those where keyword does NOT exist.

Sure, the swishspider program is not very complicated.  It would be easy to
search for some keyword and reject docs that way.
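
A rough sketch of that kind of keyword check, in plain Perl.  The helper
name wanted_doc and its 'include'/'exclude' modes are made up for
illustration and are not part of swishspider; where exactly the check would
hook into the spider is left open.  It assumes the fetched document is
already in a string.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper (not part of swishspider): decide whether a fetched
# document should be indexed, based on a case-insensitive substring match.
#   mode 'include' (default): index only docs that contain the keyword
#   mode 'exclude':           index only docs that do NOT contain it
sub wanted_doc {
    my ( $content, $keyword, $mode ) = @_;
    $mode ||= 'include';

    my $found = index( lc $content, lc $keyword ) >= 0;

    return 0 + !$found if $mode eq 'exclude';
    return $found ? 1 : 0;
}

# Example: keep only pages that carry the department meta tag value.
my $page = '<meta name="department" content="first_floor">';
print wanted_doc( $page, 'first_floor' ) ? "index it\n" : "skip it\n";
```

A plain substring test is used here rather than a regular expression so the
keyword never needs escaping; a word-boundary match would be stricter.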

Is there any way you can use the file system method for indexing?  Or is your
web site dynamically generated in a way that means you can't use the file
system?


>Another thing that might be useful would be if the spider could recognize and
>ignore any frameset files, or reverse, only to index framesets, as the
>administrator likes it.

Oh, frames.  Search the SWISH-E list archive for discussion of frames.  I
stay away from frames.

>Another thing that might be useful would be to index or not index certain 
>files containing specific characters in the filename only, (not just the 
>suffix), anything after the last "/" of the URL. It could for example be any 
>upper cases as defined [ABCDEFG...] etc., or whatever else specified by the 
>administrator, in a Swish config option.

Does http://sunsite.berkeley.edu:4444/SWISH-CONFIG.html#item_FileRules help?
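
If the config rules don't cover a case, the same test is easy to express in
Perl.  This is only a sketch: url_filename and skip_by_filename are
hypothetical names, not Swish-E functions, and the uppercase pattern is just
the example from the question.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: return whatever follows the last "/" of a URL's
# path, ignoring any query string or fragment.
sub url_filename {
    my ($url) = @_;
    $url =~ s/[?#].*\z//s;                         # drop ?query and #fragment
    $url =~ s{\A[a-z][a-z0-9+.\-]*://[^/]*}{}i;    # drop scheme and host
    my ($name) = $url =~ m{([^/]*)\z};             # text after the last "/"
    return $name;
}

# Hypothetical helper: true (1) if the filename part matches the pattern.
sub skip_by_filename {
    my ( $url, $pattern ) = @_;
    return url_filename($url) =~ $pattern ? 1 : 0;
}

# Example: skip files whose name contains any uppercase letter.
for my $url ( 'http://example.com/Docs/readme.html',
              'http://example.com/docs/README.html' ) {
    print skip_by_filename( $url, qr/[A-Z]/ ) ? "skip  " : "index ", $url, "\n";
}
```

Only the filename is tested, so an uppercase letter in a directory name does
not trigger the rule.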

>Lastly, I'm still searching for simple front-end.
.

Someday I'll get something together.  I kind of believe, though, that
anyone using CGI scripts in Perl or C or whatever should be somewhat
skilled -- it's asking for trouble otherwise, as you can open up security
holes and leave your server open to attack.  Just my opinion.

>If there were simpler
>and more portable front-end examples available to choose from, only needing
>perl 5, not requiring installation of various non-standard modules or other
>libraries that don't exist in the regular hosting situation, my guess is
>that the number of Swish installations would be a hundred times
>higher!

One comment: You *do* want to install non-standard modules whenever you
can.  Why not use someone else's experience to your benefit?  Especially if
you don't feel like an expert yourself.

Ok, here's my ten minutes of input for a general script.  I'll leave the
page forward and page backward code as an exercise to someone else.  This
compiles but I didn't test it.

The SWISH (and SWISH::Fork) modules are at http://hank.org/modules/.
HTML::Template is on CPAN, and CGI is standard.

** UNTESTED ** 
(I didn't look at any documentation, so there may be errors)

#!/usr/local/bin/perl -w
use strict;

use SWISH;
use CGI;
use HTML::Template;

my $swish_binary = '/usr/local/bin/swish-e';
my $swish_index  = 'index.swish-e';

    my $q = CGI->new;

    show_template( 'front.page', {} )
        unless $q->param('query');

    my $start = $q->param('start') || 0;

    my @results;

    my $sh = SWISH->connect('Fork',
       prog     => $swish_binary,
       indexes  => [$swish_index],
       startnum => $start,
       maxhits  => 20,
       timeout  => 10,
       results  => sub {
            push @results, {
                FILE   => $_[1]->swishdocpath,
                TITLE  => $_[1]->swishtitle,
            } },
    );

    show_template( 'error.page', { MESSAGE => $SWISH::errstr } )
        unless $sh;
            

    # scalar(): param() returns a list in list context
    my $hits = $sh->query( scalar $q->param('query') );

    show_template( 'error.page', { MESSAGE => $sh->errstr } )
        unless $hits;


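    show_template( 'results.page', {
        RESULTS => \@results,
        # scalar(): in list context param() could return zero or several
        # values and throw off the key/value pairs in this hash
        QUERY   => scalar $q->param('query'),
        HITS    => $hits,
        NEXT    => $start + 20,
        PREV    => $start - 20 < 0 ? 0 : $start - 20,
    } );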
        
        
    

sub show_template {
    my ( $file, $params ) = @_;
    
    my $template = HTML::Template->new(
        filename => $file,
    );

    $template->param( $params );

    print $q->header('text/html'),
          $template->output;
    exit;
}
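
For the page forward and page backward part left as an exercise above, here
is one possible sketch of the arithmetic.  The name page_links and the
HAS_PREV / HAS_NEXT keys are hypothetical.  It assumes 'start' is a
zero-based offset, as in the script, and that the hit count is the total
number of matches; neither was checked against the SWISH module docs.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: work out the offsets for the previous and next
# result pages, plus flags a template could use to hide dead links.
sub page_links {
    my ( $start, $page_size, $hits ) = @_;

    my $prev = $start - $page_size;
    $prev = 0 if $prev < 0;              # never page back past the start
    my $next = $start + $page_size;

    return {
        HAS_PREV => $start > 0    ? 1 : 0,
        PREV     => $prev,
        HAS_NEXT => $next < $hits ? 1 : 0,   # assumes $hits is the total
        NEXT     => $next,
    };
}

# Example: third page of 45 hits, 20 per page.
my $links = page_links( 40, 20, 45 );
print "prev=$links->{PREV} has_next=$links->{HAS_NEXT}\n";
```

The returned hash could be merged into the parameters handed to
show_template, with the template choosing whether to show each link based
on the two flags.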





Bill Moseley
mailto:moseley@hank.org
Received on Mon Feb 5 23:40:04 2001