Installing Swish-e, Apache and mod_perl for Windows 2000/XP

From: Peter Morling <pmorling(at)not-real.nat.sdu.dk>
Date: Wed May 12 2004 - 10:26:51 GMT
Hi,

I want to share my notes and hope they will help other users of this great
search engine. The result of this setup can be seen at
http://search.statmaster.sdu.dk. Thanks for your help (that must be
Bill :-) )!

Best,
Peter

-------------------------------------------------------------------------

INSTALLING SWISH-E, APACHE AND MOD_PERL FOR WINDOWS 2000


1. PERL

- Must be installed before Swish-e, since the Swish-e installer installs
some Perl modules according to the current Perl version.

- Install into 'C:\Perl'


2. SWISH-E

- Install into 'C:\SWISH-E'

- To create the DB, use the web-spider. It is run from the command line.
The text file 'swish.cfg' defines how the DB is created.

- Create the dir 'C:\SWISH-E\web_index' and save the following text file
there as 'swish.cfg'


############################################################################
#
# run this cfg with: "swish-e -S http -c swish.cfg"
# to see which metanames your index uses: swish-e -f index.swish-e -T INDEX_METANAMES

IndexFile myindex.tmp
IndexName myindex.tmp
IndexDir http://www.statmaster.sdu.dk
IndexOnly .html .htm
IndexReport 1
Delay 1
IndexContents HTML* .html .htm
StoreDescription HTML* <body> 10000

# create metaname for the docpath to search using 'select_by_meta'

MetaNames course
ExtractPath course regex !^.*/courses/([^/]+)/.*$!$1!
ExtractPathDefault course other

# create a metaname for headings to search in

MetaNames headings
MetaNameAlias headings h1 h2
MetaNames title description swishdocpath
PropertyNames title headings
MetaNamesRank 10 title
MetaNamesRank 8 headings
MetaNamesRank -5 wrongwords
############################################################################
#
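
To make the 'course' metaname more concrete, here is a small Python sketch
of what the ExtractPath rule above is meant to do. It assumes the regex is
applied to the whole URL and that the ExtractPathDefault value is used when
the regex does not match. The function name and the sample URLs are made up
for illustration; this is not Swish-e code.

```python
import re

# Same pattern as the ExtractPath line in swish.cfg.
COURSE_PATTERN = re.compile(r"^.*/courses/([^/]+)/.*$")
DEFAULT_COURSE = "other"  # mirrors "ExtractPathDefault course other"


def extract_course(url):
    """Return the path segment after /courses/, or the default if absent."""
    match = COURSE_PATTERN.match(url)
    return match.group(1) if match else DEFAULT_COURSE


if __name__ == "__main__":
    # Hypothetical URLs, only to show the two cases.
    for url in (
        "http://www.statmaster.sdu.dk/courses/st101/index.html",
        "http://www.statmaster.sdu.dk/about.html",
    ):
        print(url, "->", extract_course(url))
```

Note that the pattern requires something after the course dir, so a URL
ending in '/courses/st101' with no trailing slash would fall back to the
default.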


3. WINDOWS 2000 SCHEDULER SERVICE

- To update the DB daily, use the Windows 2000 Scheduler service. The
graphical version of the Scheduler is recommended; it is found at
Start->Settings->Control Panel->Scheduled Tasks

- Save the following text file in the 'C:\SWISH-E\web_index' dir as
'updatedb.cmd', then schedule it!


############################################################################
#
REM the cp and mv commands require Cygwin to be installed
REM run the swish-e spider using the config file
swish-e -S http -c swish.cfg

REM copy current db to old
cp -f myindex myindex.old
cp -f myindex.prop myindex.old.prop

REM move newly created to current
mv -f myindex.tmp myindex
mv -f myindex.tmp.prop myindex.prop
############################################################################
#
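
If you do not want to install Cygwin just for cp and mv, the same steps
could be done with a small Python script instead. This is only a sketch
under that assumption, not what I run myself: the function names are made
up, and the file names are the ones used in updatedb.cmd above.

```python
import os
import shutil
import subprocess


def build_index(index_dir):
    """Run the spider as in updatedb.cmd, from inside index_dir."""
    subprocess.run(
        ["swish-e", "-S", "http", "-c", "swish.cfg"],
        cwd=index_dir,
        check=True,  # stop here if swish-e reports a failure
    )


def rotate_index(index_dir, name="myindex"):
    """Keep a copy of the current index as .old, then promote the .tmp one."""
    for suffix in ("", ".prop"):
        current = os.path.join(index_dir, name + suffix)
        backup = os.path.join(index_dir, name + ".old" + suffix)
        fresh = os.path.join(index_dir, name + ".tmp" + suffix)
        if os.path.exists(current):
            shutil.copy2(current, backup)  # cp -f myindex myindex.old
        os.replace(fresh, current)         # mv -f myindex.tmp myindex
```

Calling build_index(r'C:\SWISH-E\web_index') followed by
rotate_index(r'C:\SWISH-E\web_index') would then replace the body of
updatedb.cmd. Because of check=True, the old index is left alone when the
spider run fails.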


4. TEMPLATE-TOOLKIT

- It is recommended to use the Template-Toolkit to define the HTML output
created by 'swish.cgi'. Obtain it from http://www.template-toolkit.org/ and
use the Perl package manager (ppm) to install it.

- Edit the file 'search.tt' located in
'C:\SWISH-E\share\doc\swish-e\example' to define your HTML output. The
SWISH-E config file must point to where it is located!

- Here is my sample 'search.tt'; the result can be seen at
http://search.statmaster.sdu.dk


############################################################################
#
[% WRAPPER page %]
    [% PROCESS swish_header %]
    [% title = PROCESS title %]
    [% IF ! search.results %]
        [% PROCESS search_form %]
        [% PROCESS show_message %]
        [% PROCESS swish_footer %]
    [% ELSE %]
        [% PROCESS search_form %]
        [% PROCESS nav_bar %]
        [% PROCESS nav_bar_pages %]
        [% PROCESS results_list %]
        [% PROCESS nav_bar_pages %]
        [% PROCESS swish_footer %]
    [% END %]
[% END %]


[%# This is just an example -- you would want your own "page" to wrap around "swish" %]
[% BLOCK page %]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>
   [% title %]
</title>

<style>
BODY { background: white fixed no-repeat left top;
  font-family: Helvetica, Arial, sans-serif;
  margin: 0;
  color: black;
  padding: 0px;
  margin-left: 0px;
  margin-right: 0px;
  margin-top: 0px;
  margin-bottom: 0px;
  }
A:link {
  font-family: Helvetica, Arial, sans-serif;
  font-size: 100%;
  text-decoration: none;
  background: transparent;
  color: #00319c;
  }
A:visited {
  font-family: Helvetica, Arial, sans-serif;
  font-size: 100%;
  text-decoration: none;
  background: transparent;
  color: #cc3399;
  }
A:link:active {
  font-family: Helvetica, Arial, sans-serif;
  font-size: 100%;
  text-decoration: none;
  color: #00319c;
  }
A:link:hover {
  font-family: Helvetica, Arial, sans-serif;
  text-decoration: underline;
  color: #00319c;
  }
.smalllink {
  color: #00319c;
  text-decoration: underline;
  font-family: Helvetica, Arial, sans-serif;
  font-size: 10pt;
  }
.smalltext {
  font-family: Helvetica, Arial, sans-serif;
  font-size: 10pt;
  }
.tinylink {
  color: #00319c;
  text-decoration: underline;
  font-family: Helvetica, Arial, sans-serif;
  font-size: 10px;
  }
.tinytext {
  font-family: Helvetica, Arial, sans-serif;
  font-size: 10px;
  }
body,td,a,p,.h{font-family:Helvetica, Arial, sans-serif;}
.h{font-size: 20px;}
.q{color:#0000cc;}
</style>
</head>

<body bgcolor=white alink=red link=blue vlink=purple onload="if (document.forms[0]) {document.forms[0].elements[0].focus();}">
   [% content %]
</body>
</html>
[% END %]


[% BLOCK title %]
    [% IF ! search.results %]
        [% IF ! search.query_simple %]
            Search the Master of Applied Statistics Web Pages
        [% ELSE %]
            Search: [% search.query_simple | html %]
        [% END %]
    [% ELSE %]
        Search: [% search.query_simple | html %]
    [% END %]
[% END %]


[% BLOCK swish_header %]
<!-- Start of topmenu -->
<table width="100%" height="30" cellspacing=0 cellpadding=0 bgcolor=#003399>
<tr>
<td >
&nbsp;
</td>
</tr>
</table>
<!-- End of topmenu -->
<br>
<center>
<table border="0" cellpadding="0" cellspacing="0">
<tr>
<td>
<h1>Search the Master of Applied Statistics Web Pages</h1>
</td>
</tr>
</table>
</center>
<br>
[% END %]


[% BLOCK swish_footer %]
</td>
<td width="8">&nbsp;&nbsp;&nbsp;</td>
</tr>
</table>
<!-- End of search page -->
<BR><BR>
<!-- Start of bottom line -->
<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0">
<TR>
<td width="8">&nbsp;&nbsp;&nbsp;</td>
<TD colspan="2">
<HR size="1">
<a href="http://statmaster.sdu.dk"><span class="smalllink">HOME</span></a><span class="smalltext"> | </span><a href="javascript:self.history.go(-1)"><span class="smalllink">Back</span></a>
<P>
<span class="tinytext">Last modified May 6, 2004, <a href=""><u>Webmaster</u></a></span>
</TD>
<td width="8">&nbsp;&nbsp;&nbsp;</td>
</TR>
</TABLE>
<!-- End of bottom line -->
[% END %]


[% BLOCK show_message %]
    [% IF search.errstr %]
        Your search - <b>[% search.query_simple | html %]</b> - did not match any documents.<br>
        No pages were found containing <b>&quot;[% search.query_simple | html %]&quot;</b>.
    [% END %]
[% END %]


[% BLOCK search_form %]
<center>
<table cellspacing="1" border="0" cellpadding="0">

<tr>
<td valign=top colspan=4>
    [% CGI.start_form( '-action' => CGI.script_name, '-method' => 'GET' ) %]
        [% CGI.textfield( {
            name    => 'query',
            size    => 40,
            maxlength => 200,
            } ) %]

        [% CGI.submit('submit',' Search ') %]
</td>
</tr>

<tr>
<td valign=top>
Limit search to:
</td>
<td valign=top colspan=3 >
[% search.get_meta_name_limits %]
</td>
</tr>

<tr>
<td valign=top>
Sort by:
</td>
<td colspan=4 valign=top>
[% search.get_sort_select_list %]
</td>
</tr>

<tr>
<td valign=top>
Within:
</td>
<td colspan=4 valign=top>
[% search.get_limit_select %]
[% CGI.end_form.join('') %]
</td>
</tr>

<tr>
<td colspan=4 valign=top>
<br>
<center>
<a href="/docs/help"><font size="-1"><u>Search Help</u></font></a><font
size="-1">&nbsp;-&nbsp;</font><a
href="http://search.statmaster.sdu.dk"><font size="-1"><u>Search
Home</u></font></a><font size="-1">&nbsp;-&nbsp;</font><a
href="/docs/about"><font size="-1"><u>About</u></font></a>
</center>
</td>
</tr>

</table>
</center>
<!-- Start of search page -->
<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0">
<TR>
<td width="8">&nbsp;&nbsp;&nbsp;</td>
<TD colspan="2">
<BR><BR>

[% END %]


[% BLOCK nav_bar %]
    [% search.stopwords_removed %]
    <table cellpadding=0 cellspacing=0 border=0 width="100%">
        <tr>
            <td height=20 >
                <b>[% search.navigation('from') %] to [% search.navigation('to') %] of [% search.navigation('hits') %] matches on search for &quot;[% search.query_simple | html %]&quot;</b>
            </td>
            <td align=right >
                <font size="-1" color="#ffffff" face="Geneva, Arial,
Helvetica, San-Serif">
  </font>
            </td>
        </tr>



    </table>
[% END %]


[% BLOCK nav_bar_pages %]

    [% IF search.navigation('hits') > 15 %]
     <b>Result pages:&nbsp;&nbsp;</b>
    [% END %]

    [% IF search.navigation('prev_count') %]
     <font size="-1" face="Arial, Helvetica, San-Serif">
        <a style="text-decoration:none" href="[% search.query_href
%]&amp;start=[% search.navigation('prev') %]"><u>[<b>&lt;&lt;
Prev</b>]</u></a>&nbsp;
        </font>
    [% END %]


    [% FOR page = search.navigation('page_array') %]
        [% IF page.cur_page %]
            <font size="-1" face="Arial, Helvetica, San-Serif">
            <b>[% page.page_number %]</b>&nbsp;
            </font>
        [% ELSE %]
            <font size="-1" face="Arial, Helvetica, San-Serif">
            <a style="text-decoration:none" href="[% search.query_href %]&amp;start=[% page.page_start %]"><u>[% page.page_number %]</u></a>&nbsp;
            </font>
        [% END %]
    [% END %]

    [% IF search.navigation('next_count') %]
     <font size="-1" face="Arial, Helvetica, San-Serif">
        <a style="text-decoration:none" href="[% search.query_href
%]&amp;start=[% search.navigation('next') %]"><u>[<b>Next
&gt;&gt;</b>]</u></a>
        </font>
    [% END %]

[% END %]


[% BLOCK results_list %]
    [% FOREACH item = search.results %]

        <dl>
            <dt>
             <font face="Arial, San-Serif">
                <a href="[% item.swishdocpath_href %]"><u>
                [% ( item.swishtitle || item.swishdocpath )  %]
                </u></a>
                </font>
            </dt>
            <dd>
                <font size="-1" face="Arial, San-Serif">
                [% item.swishdescription %]
                </font>
                <br>
                <font size="-1" face="Arial, San-Serif" color=#008000>
                <i>
                    [% item.swishdocpath %]
                    - [% item.swishdocsize div 1000 %]k
                </i>
                </font>
            </dd>
        </dl>
    [% END %]
[% END %]
############################################################################
#


5. SWISH-E CONFIG FILE '.searchcgi.conf'

- In 'swish.cgi', set the default config file to load. Using the full path
is important!

- This is the line to edit: my $DEFAULT_CONFIG_FILE = 'C:\FULL-PATH\.searchcgi.conf';

############################################################################
#
use lib 'C:\SWISH-E\lib\swish-e\perl';

return {
   title           => 'Search',
   swish_index     => 'C:\SWISH-E\web_index\myindex',

   template => {
       package         => 'SWISH::TemplateToolkit',
       file            => 'search.tt',
       options         => {
           INCLUDE_PATH    => 'C:\SWISH-E\share\doc\swish-e\example',
       },
   },
};
############################################################################
#





6. SWISH.CGI

- the following is edited in the user-config-section of 'swish.cgi'

- The HTTP interface to the DB is run through 'swish.cgi'

- remember full path to your DEFAULT_CONFIG_FILE

- comment out # use CGI ();

- the full path to your DB must be set here as well:
swish_index     => 'C:\SWISH-E\web_index\myindex',
for some reason this is not loaded from the config file.

- edit
sorts           => [qw/swishrank swishlastmodified swishtitle headings/],

if you want another sort order. Note that sort keys must be defined as
properties (PropertyNames) when creating the DB using swish.cfg and the
web-spider.

- edit
metanames       => [qw/ swishdefault title headings /],
if you want another 'limit search to' order. Note that these must be defined
as metanames (MetaNames) when creating the DB using swish.cfg and the
web-spider.

- edit: name_labels => {

            headings  => 'Headings',
            title  => 'Title',

these will be shown as the labels in the 'limit search to'


- edit: select_by_meta  => {

the value:   metaname    => 'course',

must match the metaname defined in 'swish.cfg' when creating the DB, e.g.

---
MetaNames course
ExtractPath course regex !^.*/courses/([^/]+)/.*$!$1!
ExtractPathDefault course other
---

- select_by_meta lets you limit the search to parts of the URL path that
you have spidered.

     values      => [qw/PATH1 PATH2 ... PATHN/],

- The values here must be values that the regular expression of 'course'
can extract from the path.


- remember to set 'use_library => 1' to use the SWISH::API


- locate the perl-function 'sub handler {' (the mod_perl entry) and change
the following:

            #return Apache::Constants::OK();
            return Apache::OK;




7. EDIT TEMPLATETOOLKIT.PM

- If you are using select_by_meta, you should edit the following in
TemplateToolkit.pm, which is located with your swish perl modules, e.g. in
'C:\SWISH-E\lib\swish-e\perl\SWISH'

- The problem is that when using the 'popup_menu' in select_by_meta you'll
need a default value that means: search the whole DB.

- Edit the following function:


############################################################################
#
sub get_limit_select {
    my ( $results ) = @_;
    my $q = $results->CGI;

    my $limit = $results->config('select_by_meta');
    return '' unless ref $limit eq 'HASH';

    my $method = $limit->{method} || 'checkbox_group';

    # new: add an empty value labelled 'All' to search the whole DB
    my $labels = $limit->{labels} || {};
    $labels->{''} = 'All';

    my @options = (
        -name   => 'sbm',
        -values => [ '', @{ $limit->{values} || [] } ],  # new
        -labels => $labels,                              # new
    #    -name   => 'sbm',
    #    -values => $limit->{values},
    #    -labels => $limit->{labels} || {},
    );

    push @options, ( -columns=> $limit->{columns} ) if $limit->{columns};

    return join "\n",
        #'<br>',
        #( $limit->{description} || 'Select: '),
        $q->$method( @options );
}
############################################################################
#
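
Stripped of the CGI details, the edit above only does one thing: it puts an
empty value labelled 'All' in front of the configured values. Here is that
idea as a Python sketch. The function name is made up, and it is my
assumption that an empty 'sbm' value is treated as "no restriction" by
swish.cgi.

```python
def add_all_option(values, labels=None):
    """Prepend an empty value labelled 'All' to the menu options."""
    merged_labels = dict(labels or {})  # copy, so the caller's dict is untouched
    merged_labels[""] = "All"
    return [""] + list(values), merged_labels


if __name__ == "__main__":
    # PATH1 and PATH2 are the placeholder values from section 6.
    print(add_all_option(["PATH1", "PATH2"]))
```

Unlike the Perl version, the sketch copies the labels instead of changing
the configured hash in place.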



8. APACHE

- Obtain the win32 webserver from http://httpd.apache.org/

- Install into 'C:\Apache2'



9. MOD_PERL

- mod_perl is a module for Apache that lets you load Perl and Perl scripts
persistently

- Obtain it from http://perl.apache.org/ and follow the instructions for
win32 using the Perl package manager 'ppm'.


10. EDIT HTTPD.CONF

- locate the text file httpd.conf in '\Apache2\conf' and add the following:


############################################################################
#
LoadFile "C:/Perl/bin/perl58.dll"
LoadModule perl_module modules/mod_perl.so
PerlRequire "C:/Apache2/conf/extra.pl"

<Perl>
    use lib "C:/Apache2/cgi-bin";
    use lib "C:/SWISH-E/lib/swish-e/perl";
    require "C:/FULL-PATH/swish.cgi";
</Perl>

Alias /cgi-bin/ "C:/Apache2/cgi-bin/"
<Location /cgi-bin/>
    PerlSetVar Swish_Conf_File "C:/FULL-PATH/.searchcgi.conf"
    allow from all
    SetHandler perl-script
    PerlHandler SwishSearch
</Location>
############################################################################
#

- This makes Apache load Perl and swish.cgi persistently, running SwishSearch
from the location '/cgi-bin/'.

- Changes in httpd.conf, swish.cgi, extra.pl, .searchcgi.conf etc. require
a restart of Apache.

- Check \Apache2\logs\error.log for errors at every restart.

- The text file 'extra.pl' contains the following:


############################################################################
#
use Apache2 ();
use ModPerl::Util ();
use Apache::RequestRec ();
use Apache::RequestIO ();
use Apache::RequestUtil ();
use Apache::Server ();
use Apache::ServerUtil ();
use Apache::Connection ();
use Apache::Log ();
use Apache::Const -compile => ':common';
use APR::Const -compile => ':common';
use APR::Table ();
use Apache::compat ();
use ModPerl::Registry ();
use CGI ();
1;

############################################################################
#
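
Checking error.log after every restart, as suggested above, can be scripted.
The following Python sketch simply picks out the lines marked with the
[error] level. It assumes the Apache 2.0 log format, where the level is
written in square brackets; the function names are made up, and the default
path is the install dir from section 8.

```python
def error_lines(lines):
    """Return the lines that carry the [error] level marker."""
    return [line.rstrip("\n") for line in lines if "[error]" in line]


def read_errors(path=r"C:\Apache2\logs\error.log"):
    """Read the Apache error log and return only its [error] lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return error_lines(handle)
```

Running read_errors() right after a restart and printing the result should
show problems from loading swish.cgi or extra.pl without having to read the
whole log.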
Received on Wed May 12 03:26:52 2004