Greg Keith wrote on 4/2/09 3:52 PM:
> I'm returning to a thread I started (which Peter kindly replied to) a
> few weeks back - I was wondering why some of my HTML documents were
> not having their titles found by swish-e.
>
> As suggested, I created a test, and noticed that, sure enough, the
> titles of the HTML documents I was indexing were mostly not being
> found - that is because while many of the documents are straight HTML,
> many use SSI variables for the title, so they look like this:
>
> <!--#set var="title" value="Intranet: Directories" -->
>
> Is there any way to get swish-e to use "Intranet Directories" as the
> document title found, extracted from this SSI variable?
There's no way to tell the swish-e command to treat a comment as anything
else, AFAIK.
Jordan's suggestion (to use spider.pl with -S prog) is the cleanest.
Otherwise, you'd need to write your own filesystem filter to run with -S prog.
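
For illustration, a filesystem filter along those lines might look roughly like
the sketch below. It is not a tested swish-e recipe: it assumes the SSI title
always has the exact form quoted above, and the header names (Path-Name,
Content-Length, Last-Mtime, Document-Type) are the -S prog headers as I recall
them from the swish-e documentation, so check them against your version. The
function names and the file-extension list are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a swish-e "-S prog" filter that turns an SSI title into <title>."""
import html
import os
import re
import sys

# Matches <!--#set var="title" value="..." --> (the SSI form quoted above).
SSI_TITLE = re.compile(
    r'<!--#set\s+var="title"\s+value="([^"]*)"\s*-->', re.IGNORECASE
)
HAS_TITLE = re.compile(r"<title\b", re.IGNORECASE)
HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def inject_title(doc: str) -> str:
    """Return doc with a <title> built from the SSI variable, when needed."""
    match = SSI_TITLE.search(doc)
    if match is None or HAS_TITLE.search(doc):
        return doc  # no SSI title, or the page already has a <title> element
    text = html.escape(html.unescape(match.group(1)), quote=False)
    title = "<title>" + text + "</title>"
    head = HEAD_OPEN.search(doc)
    if head:
        return doc[: head.end()] + title + doc[head.end():]
    return title + doc  # no <head> element: prepend the title instead


def format_record(path: str, body: bytes, mtime: int) -> bytes:
    """Build one document record: headers, a blank line, then the content."""
    headers = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(body)}\n"  # length in bytes, not characters
        f"Last-Mtime: {mtime}\n"
        "Document-Type: HTML*\n"  # assumption: ask swish-e for its HTML parser
        "\n"
    )
    return headers.encode("utf-8") + body


def main(root: str) -> None:
    """Walk root and write one record per HTML file to stdout."""
    out = sys.stdout.buffer
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith((".html", ".htm", ".shtml")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                doc = inject_title(fh.read())
            body = doc.encode("utf-8")
            out.write(format_record(path, body, int(os.path.getmtime(path))))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: ssi_title.py DIRECTORY", file=sys.stderr)
```

If I remember the setup correctly, you would point swish-e at the script with
-S prog and -i, and pass the document directory through SwishProgParameters in
the config file. Pages that already have a real <title> are passed through
unchanged.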
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Thu Apr 2 21:29:37 2009