Re: Problem with description + suggestion for new functionality

From: Bill Moseley <moseley(at)>
Date: Fri Dec 16 2005 - 12:57:35 GMT
On Fri, Dec 16, 2005 at 02:49:46AM -0800, François Tissandier wrote:
> Reading "UndefinedMetaTags index The contents of the meta tag are
> indexed, but placed in the main index unless there's an enclosing
> metatag already in force. This is the default",
> I thought I just needed to make sure that "description" was NOT in
> the list of metanames in my config file. This way the content of the
> meta would be indexed in the main index. That's what I did, but
> still, searching for "tech serv" doesn't return the page...

Depends on what your config is and what your document looks like.
Post a very small example.

By the way, there's no "main index" -- everything is indexed under a
metatag.  When a metatag is not specified in the config, swish uses
the "swishdefault" metatag instead.
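
To make that rule concrete, here is a toy model in Python. It is not
swish-e code; the function and variable names are invented for
illustration, and it assumes UndefinedMetaTags is left at "index".

```python
# Toy model of where the text of a <meta> tag ends up. Hypothetical names;
# this only mirrors the rule described above, not swish-e internals.
DEFAULT_METANAME = "swishdefault"


def metaname_for(tag, configured, enclosing=None):
    """Return the metaname that the content of `tag` is indexed under.

    configured: set of names listed under MetaNames in the config
    enclosing:  a metaname already in force around this tag, if any
    """
    if tag in configured:
        return tag
    # Undefined tag: fall back to the enclosing metaname, else the default.
    return enclosing or DEFAULT_METANAME


if __name__ == "__main__":
    print(metaname_for("description", {"keywords"}))
    print(metaname_for("description", {"description"}))
```

So with "description" left out of MetaNames, its words should be
searchable with a plain query, since a plain query searches
"swishdefault".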

> Also I have a proposal for new functionality; maybe it's already
> possible or someone proposed it, but here it is:
> -I index on products, but not on their PDF technical datasheet. I
> would like to be able to index both the webpage for the product AND
> the datasheet, but to merge the result into only one answer.

Create a SWISH::Filter that "filters" text/html.  When it processes
the web page, have it look for the associated PDF file, extract the
PDF's text, and add that text to the content of the HTML document.
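
A real SWISH::Filter is a Perl module, so the following is only a
sketch of the same merge idea in Python, written as a program that
could feed swish-e through the "-S prog" input method. It assumes the
pdftotext command is installed; the function names and the "datasheet"
class are made up for illustration.

```python
import html
import subprocess


def extract_pdf_text(pdf_path):
    """Run pdftotext (assumed to be on PATH) and return the PDF's text."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"], capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", "replace")


def merge_pdf_text(page_html, pdf_text):
    """Insert the escaped PDF text just before </body> of the product page."""
    block = '<div class="datasheet">' + html.escape(pdf_text) + "</div>"
    idx = page_html.lower().rfind("</body>")
    if idx == -1:
        # No closing body tag found: append at the end instead.
        return page_html + block
    return page_html[:idx] + block + page_html[idx:]


def prog_record(path_name, content):
    """Format one document for "-S prog": headers, blank line, content."""
    body = content.encode("utf-8")
    header = "Path-Name: {}\nContent-Length: {}\n\n".format(path_name, len(body))
    return header.encode("utf-8") + body
```

Only the product page URL goes into Path-Name, so the datasheet's
words lead back to the product page and the PDF never appears as a
separate result.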

> I don't want to have the PDF url as an answer, a PDF is a dead-end
> for a search, you can't click on anything.

You mean the PDFs are not available online?  If they are available
online, then a search returning the PDF might be just what someone is
looking for.  But I see your point if you want the search to point
only to a product page.

> Instead, I would like to have the url of the product file. I know I
> can use a Replace rule in the config file to show the webpage url
> instead of the pdf url, but if I search on a word being in BOTH the
> webpage and the PDF, it will return me the url of the webpage TWICE,
> won't it?

Yes, it would.
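
If you went the replace route anyway, the duplicates would have to be
collapsed after the search, in whatever script displays the results.
A rough sketch in Python, assuming the results arrive as (rank, url)
pairs where a higher rank is better; the function name is made up:

```python
def collapse_by_url(results):
    """Keep only the best-ranked hit for each URL, best rank first."""
    best = {}
    for rank, url in results:
        if url not in best or rank > best[url]:
            best[url] = rank
    pairs = [(rank, url) for url, rank in best.items()]
    return sorted(pairs, key=lambda pair: -pair[0])


if __name__ == "__main__":
    hits = [(1000, "/products/widget.html"),
            (640, "/products/widget.html"),
            (800, "/products/gadget.html")]
    for rank, url in collapse_by_url(hits):
        print(rank, url)
```

Merging the PDF text into the page at index time avoids this step
altogether.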

Bill Moseley

