RE: SWISH-E Book

From: Deane Barker <dbarker(at)not-real.siouxfallscommercial.com>
Date: Sun Dec 07 2003 - 01:50:59 GMT
> Why isn't there a Swish-e O'Reilly book?

There's a new book publishing outfit called Packt Publishing that's
putting out books on a lot of open-source products (Zope, Plone, etc.).
They're a bunch of ex-Wrox writers.  The model is on-demand publishing,
I gather.

I talked to them for a while about writing a Movable Type book.  In the
end, I didn't think I had the time, but from going back and forth with
them, it strikes me that Swish-E would be a perfect topic for them to
handle.

Deane


-----Original Message-----
From: swish-e@sunsite.berkeley.edu [mailto:swish-e@sunsite.berkeley.edu]
On Behalf Of Dave Stevens
Sent: Saturday, December 06, 2003 7:25 PM
To: Multiple recipients of list
Subject: [SWISH-E] converting .temp indices to usable indices


The spider is doing pretty well, nearly a million pages crawled in the
last couple of weeks.  One issue I just came on is with a dynamic site
that hosts several trade publications using a common app to provide
content from each of the pubs.  The URL
mag1.com/bg.asp?manufacturer=15?mag=20 is the same as
mag2.com/bg.asp?manufacturer=15?mag=20.  The app only uses the arguments
from the URL, not the domain name.  For future crawls I'm pretty sure I
can filter for only what I want and crawl this site on its own. (I want
mag=7)  It appears I can do that with a callback.
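
A minimal sketch of what such a callback might look like in a spider.pl
config file, assuming the test_url hook (called with a URI object before
each request; a false return skips the URL) is the one meant here.  The
base_url and email values are placeholders, and letting pages with no
query string through is an assumption about how the wanted pages are
reached:

```perl
# SwishSpiderConfig.pl (sketch): limit the shared app to one publication.
# Assumption: the wanted pages are those whose query string carries mag=7.

@servers = (
    {
        base_url => 'http://mag1.com/',     # placeholder start page
        email    => 'you@example.com',      # placeholder contact address

        # Return true to let spider.pl fetch the URL, false to skip it.
        test_url => sub {
            my ( $uri, $server ) = @_;

            my $query = $uri->query;
            return 1 unless defined $query; # assumption: static pages pass

            # Match mag=7 as a whole parameter.  The URLs above chain
            # parameters with '?', so '?', '&' and ';' all count as
            # separators here.
            return $query =~ /(?:^|[?&;])mag=7(?:$|[?&;])/ ? 1 : 0;
        },
    },
);

1;
```

With this in place, mag1.com/bg.asp?manufacturer=15?mag=20 would be
skipped and the same path with mag=7 would be fetched.  It does not
address the same content being reachable under a second domain name;
that would still need to be kept out of the server list.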

The issue here is that this crawl is about four days old and has about a
dozen other sites in the index.  The prop.temp file and the .temp index
file are being written.  If I kill this crawl by terminating spider.pl,
is there any way to convert those .temp files left by the terminated crawl
to usable indices?  This one has so much junk in it that it's probably not
usable for this, but I'd like to get a look at what was spidered from
the other 12 sites.  I've looked in the archive and manual and couldn't
find anything.

Why isn't there a Swish-e O'Reilly book? ;-) The docs and list are really
good but a larger reference with more production examples would be a
great help.

TIA

Dave
http://charlotte.roaddog.com/
Received on Sun Dec 7 01:51:01 2003