At 09.07.2002 08:57 -0700, you wrote:
>At 02:20 PM 07/09/02 +0200, Guido Adam wrote:
> >And the metatags are not read, if you leave out IndexContents.
>Both metanames and propertynames work for me without IndexContents. You
>have to use a parser that knows how to extract the metanames. The
>default parser is HTML if you do not specify a parser, and that will parse
><meta> tags only (not fake html <tag> meta tags). If you have a header
>Document-Type: TXT then it won't parse the metanames.
>[hum, I think the default parser should be HTML2 if available]
> >My database records contain html pages.
> >Looks like the "Document-Type:" field is not read correctly by the indexer,
> >if you use the "-S prog" switch. The indexer should use that field and not
> >the filetype it extracts from the URL.
>Check again. If you have in the -S prog program's output:
> Path-Name: foo.html
> Document-Type: HTML2
>and in your swish config you say:
> IndexContents TXT .html
>it will still use the header specified in the prog's headers (HTML2), not
>the TXT parser.
Here we are: I changed my Document-Types from HTML to HTML2 and all is as
expected. I can leave out IndexContents, and meta tags and properties are as
they should be. The HTML parser seems to have problems here.
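
For anyone following along, here is a rough sketch of how a -S prog program
could emit the headers discussed above for one database record. It is only an
illustration, not my actual program: the function name format_record and the
sample page are made up, and I am assuming the usual prog layout of header
lines, a blank line, then the document, with Content-Length counted in bytes.

```python
import sys


def format_record(path_name: str, html: str, doc_type: str = "HTML2") -> bytes:
    """Build one document record for a -S prog style stream.

    Hypothetical helper: emits Path-Name, Content-Length and Document-Type
    headers, a blank line, then the document body.
    """
    body = html.encode("utf-8")
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(body)}\n"  # byte length of the body, not characters
        f"Document-Type: {doc_type}\n"    # parser requested for this document
        "\n"                              # blank line separates headers from body
    )
    return headers.encode("utf-8") + body


if __name__ == "__main__":
    # Made-up sample page with one <meta> tag for the parser to pick up.
    page = (
        '<html><head><meta name="author" content="someone"></head>'
        "<body>Hello</body></html>"
    )
    sys.stdout.buffer.write(format_record("foo.html", page))
```

Passing doc_type="HTML" or doc_type="TXT" instead makes it easy to compare how
the different parsers treat the same record.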
Received on Tue Jul 9 18:10:47 2002