Re: Indexing PDFs on Windows - Revisited....

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Sep 23 2004 - 19:16:46 GMT

On Thu, Sep 23, 2004 at 10:41:29AM -0700, Anthony Baratta wrote:
> I'm confused about how to tweak the default setup to use the keep_alive 
> versus only the delay.

The "default" setup does use the keep_alive feature.  keep_alive, as
I'm sure you know, allows multiple requests over the same TCP
connection to the server.  Using keep_alive therefore saves the
per-request connection overhead, and it saves the time that each
server process waits after closing the connection.
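
To see the connection reuse for yourself, here is a minimal sketch in
Python (not part of swish-e or its spider; the Handler class and the
fetch_all function are names made up for this illustration).  It
starts a throwaway local HTTP/1.1 server, sends several GET requests
through one client connection, and records the client's local port
after each response.  If the port never changes, every request went
over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    """Hypothetical test handler; HTTP/1.1 turns on persistent connections."""
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # A Content-Length is required so the connection can stay open.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(paths):
    """Fetch every path over one connection; return (path, status, local_port)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body before reusing the connection
            results.append((path, resp.status, conn.sock.getsockname()[1]))
    finally:
        conn.close()          # lets the server's handler loop finish
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for row in fetch_all(["/a", "/b", "/c"]):
        print(row)
```

Without keep_alive each request would need its own connect and close,
so the local port would normally differ from one request to the next.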

> OK - I see that now. Appears that the PDFs are getting descriptions but 
> my html/asp pages are not. I think this might be because my body tags 
> have attributes?
> 
> e.g.
>      <body leftmargin="0" topmargin="0" rightmargin="0"
>       marginwidth="0" marginheight="0">

No.  But it's something you can test:

    moseley@bumby:~$ cat c
    DefaultContents HTML*
    StoreDescription HTML* <body> 50

    moseley@bumby:~$ cat 1.html
    <html>
    <head>
    <title>New 17" flat screen!</title>
    </head>
    <body leftmargin="0" topmargin="0" rightmargin="0" marginwidth="0" marginheight="0">

    Body Content

    </body>
    </html>

    moseley@bumby:~$ swish-e -v0 -c c -i 1.html -T properties 
              swishdocpath: 6 (  6) S: "1.html"
                swishtitle: 7 ( 20) S: "New 17" flat screen!"
              swishdocsize: 8 (  4) N: "175"
         swishlastmodified: 9 (  4) D: "2004-09-23 12:14:03 PDT"
          swishdescription:10 ( 12) S: "Body Content"
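
If swish-e is not at hand, a rough stand-in for the same check can be
sketched in Python.  This is only an illustration of the idea behind
"StoreDescription HTML* <body> 50" (take up to 50 characters of text
found inside the body element); it is not how swish-e's own parser is
implemented, and BodyText and describe are hypothetical names.  The
point is that a parser matches the tag name, so attributes on <body>
should make no difference.

```python
from html.parser import HTMLParser


class BodyText(HTMLParser):
    """Collect text that appears between <body ...> and </body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Only the tag name is checked; attrs are ignored on purpose.
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)


def describe(html, limit=50):
    """Return up to `limit` characters of whitespace-normalized body text."""
    parser = BodyText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return text[:limit]


if __name__ == "__main__":
    page = (
        '<html><head><title>New 17" flat screen!</title></head>'
        '<body leftmargin="0" topmargin="0" rightmargin="0" '
        'marginwidth="0" marginheight="0">\n\nBody Content\n\n</body></html>'
    )
    print(describe(page))
```

Feeding it the same page with a bare <body> tag gives the same text,
which matches what the swish-e run above shows.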

-- 
Bill Moseley
moseley@hank.org

Received on Thu Sep 23 12:16:57 2004