This needs checking (more a job for Bill and Jose, because
I'm rather tied up with my job right now...)
But I will make a quick change for DATE values (swishlastmodified),
so there is no confusion about whether -x "<swishlastmodified fmt=/%d/>"
returns the strftime "day of month" or the printf "seconds since epoch".
I will change the latter to fmt=/%ld/.
Exactly this property format string will return "seconds since epoch"
for DATE type properties.
e.g.
-x "<swishlastmodified fmt=/%d/>" will return day of month (e.g. "06")
-x "<swishlastmodified fmt=/%ld/>" will return seconds since epoch
"%ld" will be a fixed string!
That means fmt="secs: %ld" will _not_ return seconds since epoch.
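
To make the rule concrete, here is a minimal sketch in C of the dispatch
described above. It is not the actual swish-e code: the helper name
format_date_prop is made up for illustration, and the use of gmtime() is an
assumption (the real code may well use localtime()).

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (illustration only, not the swish-e source).
 * If fmt is exactly "%ld", write the date as seconds since epoch.
 * Any other fmt is handed to strftime() unchanged.
 * Returns the number of characters written, or 0 on failure.
 */
size_t format_date_prop(char *out, size_t outlen, const char *fmt, time_t t)
{
    struct tm *tm;

    if (strcmp(fmt, "%ld") == 0) {
        /* Fixed string match: print the raw epoch value. */
        int n = snprintf(out, outlen, "%ld", (long) t);
        return (n < 0 || (size_t) n >= outlen) ? 0 : (size_t) n;
    }

    /* Assumption: UTC is used here to keep the example deterministic. */
    tm = gmtime(&t);
    if (tm == NULL)
        return 0;
    return strftime(out, outlen, fmt, tm);
}
```

Because the comparison is a plain strcmp(), a format such as "secs: %ld" does
not match and goes to strftime(), which is why it will not give seconds since
epoch.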
cu - rainer
> -----Original Message-----
> From: David Wood [mailto:dwood@inter.nl.net]
> Sent: Friday, April 06, 2001 6:59 PM
> To: Multiple recipients of list
> Subject: [SWISH-E] Re: Sorting by swishlastmodified...
>
>
> Hi Bill and Rainer,
>
> Bill, I did see your note about the new "prog" stuff and I'm certainly
> interested, but that new spider is more complex than the previous one,
> and we have some somewhat weird customisations to the previous one, and
> I just haven't had the chance to play around with the new one enough yet.
>
> On the other hand, would the patch below fix the 'old' spider? The idea
> is that if you get HTTP code 200 back in swishspider then you write the
> Last-Modified date into the .response file as well, and write it in
> seconds since epoch format to save the C code having to muck around with
> date formats, localisation, etc.
>
> cheers,
>
> David
>
>
> ---
>
>
> swishspider:
> 4a5
> > use HTTP::Date;
> 32a34
> > print RESP str2time($response->header("last-modified")) . "\n";
>
>
> httpserver.c:
> 72a73
> > static time_t lastmodified=0;
> 149c150
> < if (get(sw,contenttype, &server->lastretrieval, buffer) == 200) {
> ---
> > if (get(sw,contenttype, &lastmodified, &server->lastretrieval, buffer) == 200) {
>
>
> http.c:
> 205c205
> < int get(SWISH *sw, char *contenttype_or_redirect, time_t *plastretrieval, char *url)
> ---
> > int get(SWISH *sw, char *contenttype_or_redirect, time_t *lastmodified, time_t *plastretrieval, char *url)
> 257a258,263
> > if (code == 200) {
> > /* read last-modified
> > **/
> > fgets(buffer, lenbuffer, fp);
> > *lastmodified = atol(buffer);
> > }
> 372a379
> > static time_t lastmodified=0;
> 406c413
> < if ((code = get(sw, contenttype, &server->lastretrieval, item->url)) == 200) {
> ---
> > if ((code = get(sw, contenttype, &lastmodified, &server->lastretrieval, item->url)) == 200) {
> 443c450
> < fprop->mtime = 0; /* $$ see above */
> ---
> > fprop->mtime = lastmodified;
>
>
> http.h:
> 13c13
> < int get (SWISH *sw, char *contenttype_or_redirect, time_t *plastretrieval, char *url);
> ---
> > int get (SWISH *sw, char *contenttype_or_redirect, time_t *lastmodified, time_t *plastretrieval, char *url);
>
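
A note on the reading side of the patch above: fgets() followed by atol()
works when the line is well formed, but the return value of fgets() is not
checked, and HTTP::Date's str2time returns undef when the header is missing or
unparsable, so the line may be empty. Below is a minimal sketch of a more
defensive parse. The helper names parse_epoch_line and read_last_modified are
hypothetical, and treating "no usable date" as 0 is an assumption that matches
the old default of mtime = 0.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper (illustration only): parse one line that should hold
 * seconds since epoch. Returns 0 if the line is empty, malformed, negative
 * or out of range.
 */
time_t parse_epoch_line(const char *line)
{
    char *end;
    long secs;

    if (line == NULL)
        return 0;

    errno = 0;
    secs = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || secs < 0)
        return 0;               /* no digits, overflow, or negative value */

    /* Accept only trailing whitespace after the number. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return 0;

    return (time_t) secs;
}

/* Read the next line from fp and parse it; returns 0 at EOF or on error. */
time_t read_last_modified(FILE *fp)
{
    char buffer[64];

    if (fgets(buffer, sizeof buffer, fp) == NULL)
        return 0;
    return parse_epoch_line(buffer);
}
```

In http.c this would amount to something like
*lastmodified = read_last_modified(fp); inside the code == 200 branch, again
only as a sketch.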
>
>
> At 11:35 06-04-01, you wrote:
>
>
> >On Thu, 5 Apr 2001, David Wood wrote:
> >
> > > Hi folks,
> > >
> > > Using '... -s swishlastmodified desc' _almost_ works perfectly. The only
> > > problem I've uncovered is that, if you've created your index via spidering,
> > > there's no swishlastmodified stored, I guess because the files aren't local
> > > and stat'able, and so all dates for spidered content come back as 31 Dec.
> > > 1969! But don't nicely behaved web servers pass a Last Modified HTTP
> > > header to clients? If so, might we be able to use that to set
> > > swishlastmodified when creating a spider-generated index?
> >
> >You might have missed my last post. If you use the "prog" method with the
> >provided spider.pl you will get the last modified date, plus it will
> >probably spider faster.
> >
>
>
>
Received on Fri Apr 6 17:46:23 2001