Re: [swish-e] Indexing page and chapter of a book (single document)

From: David Brown <dave(at)>
Date: Tue Jan 12 2010 - 15:12:11 GMT
Peter observed: "disk space is less a concern than it was even 5 years ago"

Unless, of course, you happen to still be hosting your client's site with
the same virtual server provider you used 15 years ago, the one that figured
out it could offer unlimited bandwidth if it gave you very limited disk
space :-)

But for something relatively unchanging, the cached file approach makes good
sense; you could probably even build the index on a development machine and
copy the files to deploy. 

Dave Brown

-----Original Message-----
[] On Behalf Of Peter Karman
Sent: Sunday, January 10, 2010 12:14 AM
To: Swish-e Users Discussion List
Subject: Re: [swish-e] Indexing page and chapter of a book (single document)

David Brown wrote on 1/9/10 7:45 AM:
> Quick thought for you: if you can add the XML to the files, you should 
> be able to write a program that uses those tags to present each chapter 
> (or even page) individually to swish-e while indexing (programmatically, 
> not via the file system) and still refer to the location of the 
> composite file, possibly even using anchor tags (mybook.htm#chap3 for 
> example).

Dave has the right idea.

swish-e can only report properties (page, chapter, etc) per document. Each
file represents exactly one document.

So your best bet is to break each file into virtual "pages" and store the
page and chapter values in meta tags. You could do that with the -S prog input
and a script that parses each file and generates the appropriate xml output.
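
A rough sketch of such a script, in Python, might look like the following.
Several things here are assumptions rather than anything from the actual
book files: the <chapter n="..."> and <page n="..."> markup, the #pageNN
anchors, and the <doc>/<body> wrapper tags are all hypothetical, and the
page text is assumed to already be valid XML content. The header block
(Path-Name, Content-Length, Document-Type, then a blank line) follows the
-S prog format as I understand it from the swish-e docs, so check it
against your version.

```python
#!/usr/bin/env python3
import re
import sys

# Hypothetical markup: <chapter n="3"><page n="41">text</page>...</chapter>
CHAPTER_RE = re.compile(r'<chapter n="(\d+)">(.*?)</chapter>', re.S)
PAGE_RE = re.compile(r'<page n="(\d+)">(.*?)</page>', re.S)


def virtual_pages(book_path, text):
    """Yield (path_name, xml) for every page found in the composite file."""
    for chap_no, chap_body in CHAPTER_RE.findall(text):
        for page_no, page_text in PAGE_RE.findall(chap_body):
            # Point back at the composite file; the anchor name is made up.
            path_name = f"{book_path}#page{page_no}"
            xml = (
                "<doc>"
                f"<chapter>{chap_no}</chapter>"
                f"<page>{page_no}</page>"
                f"<body>{page_text.strip()}</body>"
                "</doc>\n"
            )
            yield path_name, xml


def prog_record(path_name, xml):
    """Return one document as bytes: headers, a blank line, then the content."""
    data = xml.encode("utf-8")
    header = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(data)}\n"  # length in bytes, not characters
        "Document-Type: XML*\n"
        "\n"
    )
    return header.encode("utf-8") + data


if __name__ == "__main__":
    # Each command-line argument is one composite book file.
    for book_path in sys.argv[1:]:
        with open(book_path, encoding="utf-8") as fh:
            text = fh.read()
        for path_name, xml in virtual_pages(book_path, text):
            sys.stdout.buffer.write(prog_record(path_name, xml))
```

You would then point swish-e at the script with something like
"swish-e -S prog -i ./split_book.py -c swish.conf" (split_book.py is just a
placeholder name), with chapter and page listed under MetaNames and
PropertyNames in the config so they can be searched and reported.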

IME, it's easiest to actually write the xml output to cached files and then
index with the -S fs method (i.e., create actual files rather than virtual
ones). Caching them as real files makes it easier to debug and to reindex if
necessary. Terabyte drives being as cheap as they are these days, disk space
is less a concern than it was even 5 years ago.
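
The cached-file variant could be sketched like this, again in Python. The
file naming scheme, the <doc>/<body> wrapper tags, and the shape of the
input (a list of chapter, page, text tuples produced by whatever parses the
book) are assumptions for illustration only.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def write_cache(pages, cache_dir):
    """Write one small XML file per (chapter, page, text) tuple.

    Returns the list of cache file paths, in input order.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    written = []
    for chapter, page, text in pages:
        # Zero-padded names keep the directory listing in reading order.
        target = cache / f"chap{chapter:03d}-page{page:04d}.xml"
        xml = (
            "<doc>"
            f"<chapter>{chapter}</chapter>"
            f"<page>{page}</page>"
            f"<body>{escape(text)}</body>"  # plain text, so escape &, <, >
            "</doc>\n"
        )
        # Leave unchanged files alone so their mtimes stay stable.
        if not target.exists() or target.read_text(encoding="utf-8") != xml:
            target.write_text(xml, encoding="utf-8")
        written.append(target)
    return written
```

Indexing is then an ordinary filesystem run over the cache directory,
something like "swish-e -S fs -i cache/ -c swish.conf", and any file that
looks wrong in search results can be opened and inspected directly.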

Peter Karman  .  .  peter(at)
Users mailing list

Received on Tue Jan 12 10:12:24 2010