I'm getting there!

From: Rich Thomas <thomasr(at)not-real.buffalo.edu>
Date: Fri Jan 25 2002 - 20:39:37 GMT

OK!!!  I was able to index my /apache/htdocs files and see them with the
CGI script!!  Yippee.
OK, so it's Friday afternoon and I'm supposed to demo Swish on Monday, but I
can still take joy in small triumphs.


Only... now I can't get spidering to work again. Grrrrr.

I copied the swishspider executable to the directory where I keep all my
swish stuff.  It claims to be working, but it never does any indexing.

Here are my directory listing, swish.conf, and command line:
/usr/local/SWISH

drwxr-xr-x   3 root     other        512 Jan 25 15:30 .
drwxr-xr-x  19 root     other        512 Jan 25 15:22 ..
-rw-r--r--   1 root     other          0 Jan 25 15:28 index.swish-e.prop.temp
-rw-r--r--   1 root     other     393216 Jan 25 15:28 index.swish-e.temp
drwxr-xr-x   2 root     other        512 Jan 25 15:22 modules
-rwxr-xr-x   1 root     other    1885300 Jan 25 15:22 swish-e
-rwxr-xr-x   1 root     other      59055 Jan 25 15:22 swish.cgi
-rw-r--r--   1 root     other        144 Jan 25 15:22 swish.conf
-rwxr-xr-x   1 root     other       2054 Jan 25 15:27 swishspider

swish.conf:

IndexDir http://ublin.lib.buffalo.edu/webcat/bibcat/A/A/E/9/001.html
IndexOnly .html
StoreDescription HTML2 <body> 100000
DefaultContents HTML2

Command line:
# ./swish-e -S http -c swish.conf

When I run this I get:

Indexing Data Source: "HTTP-Crawler"
Indexing "http://ublin.lib.buffalo.edu/webcat/bibcat/A/A/E/9/001.html"

But then it just sits there, using NO CPU time, and the index files are:

-rw-r--r--   1 root     other          0 Jan 25 15:32 index.swish-e.prop.temp
-rw-r--r--   1 root     other     393216 Jan 25 15:32 index.swish-e.temp

They never get bigger, and their timestamps never change.  I've deleted the
index files and rerun the command, but I keep getting the same result.
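
For anyone who wants to confirm the "never changes" part without re-running
ls by hand, here is a minimal sketch that polls the two temp files for size
and timestamp changes.  It assumes Python 3 is available and is run from the
/usr/local/SWISH directory; the function names (snapshot, changed_files) and
the 10-second interval are made up for illustration and are not part of
Swish-e.

```python
import os
import time


def snapshot(paths):
    """Return {path: (size, mtime)} for each path, or None if it is missing."""
    result = {}
    for path in paths:
        try:
            st = os.stat(path)
            result[path] = (st.st_size, st.st_mtime)
        except FileNotFoundError:
            result[path] = None
    return result


def changed_files(before, after):
    """List the paths whose (size, mtime) differs between two snapshots."""
    return [p for p in after if before.get(p) != after[p]]


if __name__ == "__main__":
    # File names taken from the directory listing in this message.
    files = ["index.swish-e.temp", "index.swish-e.prop.temp"]
    previous = snapshot(files)
    for _ in range(6):  # watch for about a minute (arbitrary choice)
        time.sleep(10)
        current = snapshot(files)
        print(changed_files(previous, current) or "no change")
        previous = current
```

If every round prints "no change" while swish-e is still running, the indexer
really is idle rather than just writing slowly.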

Onward and downward!!!!

Rich
Received on Fri Jan 25 20:40:05 2002