Re: RFC - swish interface for Perl

From: Bas Meijer <bas(at)not-real.antraciet.nl>
Date: Wed Sep 13 2000 - 08:31:12 GMT
Hi Bill,


Looking through your synopsis, a thought arose: would it be possible 
to think of swish as a database? I mean, could it be accessible 
through a DBI interface?
DBI is a database-independent package that provides a consistent set 
of routines regardless of what database product is in use. DBI 
separates the actual database drivers (DBDs) from the programmer's 
API, which is similar to what you have in mind.

We could then write code like below. Maybe it's a far-fetched idea, 
maybe it could be done...


cheers,


Bas Meijer


sub dbiswishsearch{

	use DBI;

	### Attributes to pass to DBI->connect(); errors are checked manually
	my %attr = (
		PrintError => 0,
		RaiseError => 0
	);

	### The DBI:swish database handle to indexfile myindex
	my $dbh = DBI->connect( 'DBI:swish:myindex', $dbuser, $dbpassword, \%attr )
		or return "Can't connect to the database: $DBI::errstr";

	# start a block for the swish work
		{
		### Prepare the search query for execution, yielding a statement handle
		my $sth = $dbh->prepare("-w $query -m $results $tflag $search_tags")
			or return "Can't prepare Swish query: $DBI::errstr";

		### Execute the statement in the database
		$sth->execute() or return "Can't execute Swish query: $DBI::errstr";
		### Fetch all the data into a Perl data structure
		my $array_ref = $sth->fetchall_arrayref()
			or return "Can't fetch data: $DBI::errstr";

		### Traverse the data structure and dump data to $output
		### For each row in the returned array reference.....
		foreach my $row (@$array_ref) {
			### Split the row up and concatenate each field to $output
			my ($title, $filesize, $rank, $file) = @$row;
			$output .= qq|<B>$rank</B><TR><TD><A HREF="$file">$title</A>| .
			           qq|$filesize bytes </TD></TR>\n|;
		}
	} # end the block

	### Gracefully disconnect from swish
	$dbh->disconnect or return "Error disconnecting: $DBI::errstr";

	### void return if everything is OK
	return undef;
}
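
The separation of drivers from the programmer's API described above can be sketched without DBI itself. This is only an illustration under stated assumptions: `MiniDBI` and `MiniDriver::swish` are hypothetical names invented here, the search results are canned, and no `DBD::swish` driver is assumed to exist.

```perl
use strict;
use warnings;

# Hypothetical driver package: stands in for a real DBD-style backend.
package MiniDriver::swish;

sub connect {
    my ($class, $index) = @_;
    return bless { index => $index }, $class;
}

# Pretend search: returns canned rows of [title, filesize, rank, file].
sub search {
    my ($self, $query) = @_;
    return [
        [ "Hit for '$query' in $self->{index}", 1024, 1000, '/docs/a.html' ],
    ];
}

# Front end: the only package that application code talks to.
package MiniDBI;

sub connect {
    my ($class, $dsn) = @_;

    # A DSN looks like 'DBI:driver:rest'; split it into its three parts.
    my ($scheme, $driver, $rest) = split /:/, $dsn, 3;
    return unless defined($rest) && lc($scheme) eq 'dbi';
    return unless $driver =~ /\A\w+\z/;

    # Dispatch on the driver name; unknown drivers yield no handle.
    my $driver_class = "MiniDriver::$driver";
    return unless $driver_class->can('connect');
    return $driver_class->connect($rest);
}

package main;

my $dbh  = MiniDBI->connect('DBI:swish:myindex') or die "no such driver\n";
my $rows = $dbh->search('perl');
printf "%s (%d bytes, rank %d)\n", @{$_}[0, 1, 2] for @$rows;
```

The calling code only ever names the driver inside the DSN string, so swapping the backend would mean changing that one string.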


At 17:51 -0700 12-09-2000, Bill Moseley wrote:
>Hi,
>
>I'm trying to set the interface for a Perl module to access swish and would
>appreciate any comments or ideas.
>
>The motivation for this is Jose's C library.  What I'd like to do is
>develop a perl module that uses the same calls to access swish in the
>standard fork/exec method, or by changing one call, access swish via the C
>library.
>
>Maybe if someone smart writes a nice threaded swish-e server this same
>interface could be used.
>
>I haven't had a lot of time to think about it, yet.  I just invented a few
>methods that might be useful.  But now is a good time to gather suggestions
>and make changes.  Maybe this just adds an unnecessary layer of abstraction
>and confusion...
>
>
>Here's the synopsis.  More can be found at
>http://www.hank.org/modules/SWISH.html
>
>NAME
>     SWISH - Perl interface to the SWISH-E search engine.
>
>SYNOPSIS
>         use SWISH;
>         $sh = SWISH->connect('Fork',
>             prog     => '/usr/local/bin/swish-e',
>             indexes  => \@indexes,
>             results  => \&results,      # callback
>             headers  => \&headers,
>             maxhits  => 200,
>             timeout  => 20,
>         );
>
>         $sh = SWISH->connect('Library', %parameters );
>
>         $sh = SWISH->connect('Server',
>             port     => $port_number,
>             host     => $host_name,
>             %parameters,
>         );
>
>         $hits = $sh->query(
>             query       => $query_string,
>             results     => \&results,
>             headers     => \&headers,
>             properties  => 'title subject',
>             sort        => 'subject',
>             startnum    => 100,
>             maxhits     => 1000,
>         );
>
>         $error_msg = $sh->error unless $hits;
>
>         # might want to use in your headers() callback
>         $sh->abort_query;
>
>         @raw_results = $sh->raw_query( \%query_settings );
>
>         $r = $sh->index( '/path/to/config' );
>         $r = $sh->index( \%indexing_settings );
>
>         # If all config settings were stored in the index header
>         $r = $sh->reindex;
>
>         %headers = $sh->headers;
>         $stemming = $sh->headers( 'stemming applied' );
>         $last_indexed = $sh->headers( 'Indexed on' );
>
>         # returns words as swish sees them for indexing
>         $search_words = $sh->swish_words( \$doc );
>
>         $stemmed = $sh->stem_word( $word );
>
>         $sh->disconnect;
>         # or an alias:
>         $sh->close;
>
>DESCRIPTION
>     This module provides a standard interface to the SWISH-E search engine.
>     With this interface, your program can use SWISH-E in the standard
>     forking/exec method, or with the SWISH-E C library routines, and, if
>     ever developed, the SWISH-E server with only a small change.
>
>     The idea is that you can change the way your program accesses SWISH-E
>     without having to change your code.
>
>
>
>Bill Moseley
>mailto:moseley@hank.org
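
The idea in the quoted DESCRIPTION, changing the access method by changing one call, can be sketched roughly as follows. This is not Bill's module: `MySearch` and its two backends are hypothetical stand-ins that return canned hits, and only the shape of `connect` and the `results` callback is borrowed from the synopsis.

```perl
use strict;
use warnings;

# Hypothetical backends; each provides run_query($query) returning a list of hits.
package MySearch::Fork;
sub new { my ($class, %p) = @_; return bless { %p }, $class }
sub run_query {
    my ($self, $q) = @_;
    return ( "fork hit 1 for $q", "fork hit 2 for $q" );
}

package MySearch::Library;
sub new { my ($class, %p) = @_; return bless { %p }, $class }
sub run_query {
    my ($self, $q) = @_;
    return ( "library hit 1 for $q" );
}

package MySearch;

# connect() picks the backend by name, so callers change one argument only.
sub connect {
    my ($class, $method, %params) = @_;
    my $backend_class = "MySearch::$method";
    return unless $method =~ /\A\w+\z/ && $backend_class->can('run_query');
    return bless { backend => $backend_class->new(%params), %params }, $class;
}

# query() hands each hit to the results callback and returns the hit count.
sub query {
    my ($self, %args) = @_;
    my $callback = $args{results} || $self->{results} || sub { };
    my $hits = 0;
    for my $hit ( $self->{backend}->run_query( $args{query} ) ) {
        $callback->( $self, $hit );
        $hits++;
    }
    return $hits;
}

package main;

my @seen;
for my $method (qw(Fork Library)) {
    my $sh = MySearch->connect( $method, results => sub { push @seen, $_[1] } )
        or die "unknown access method $method\n";
    my $hits = $sh->query( query => 'perl' );
    print "$method returned $hits hit(s)\n";
}
```

Everything after `connect` is identical for both access methods, which is the property the proposed interface is after.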
Received on Wed Sep 13 08:31:23 2000