At 09:28 AM 01/29/02 -0800, Gordon Jessop wrote:
>Thanks to Bill for the upgrade hint! Got swish-e 2.1-dev-25 to index (built
That may or may not work in your favor. libxml2 will sometimes fix bad
HTML, but the default HTML parser may be able to parse things that are
not really HTML.
>However, I am now saddled with this problem: I have to index a series of
>jscript source files and the MetaNames function within swish-e does not seem
>to be catching.
Swish knows how to parse HTML, XML, and text. So I'm not sure what you
> EnableAltSearchSyntax yes
> SwishSearchOperators AND OR NOT
> SwishSearchDefaultRule AND
Those are not implemented, AFAIK.
> FollowSymLinks no
> FileRules filename contains "\.jsp$"
I think that should be the same as IndexOnly .jsp
>I have to index a series of jscript source files. Each file would contain
>// <title>Guns and Butter</title>
>globalPackage.description = '<meta_description>Some indexable words like
>supply and demand, guns and butter.</meta_description>';
>globalPackage.author = '<meta_author>Gordon Jessop</meta_author>';
>globalPackage.foo = '1';
>globalPackage.bar = 'checked';
>globalPackage.blah = '123456';
I think you can only use <meta_description> with libxml2. If I remember
correctly, the default HTML parser treats anything of the form <foo> as an
HTML tag. Libxml2 knows which tags are real HTML, so when libxml2 passes me
a tag (this is in parser.c), I can tell whether it's a real HTML tag. If it
isn't, I pretend it's a metaname. That's how that hack works, and that's
probably why your metanames are not working.
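To illustrate the idea only (this is not the code in parser.c, and it does not use libxml2), here is a rough Python sketch of the same heuristic: tags found in a known-HTML list are ignored, and the text inside any other tag is collected under that tag's name. The tag list, the class name MetaNameSketch and the function collect_metanames are all made up for this example.

```python
from html.parser import HTMLParser

# Assumption: a tiny illustrative subset of HTML tag names, not libxml2's table.
KNOWN_HTML_TAGS = {"html", "head", "title", "body", "p", "a", "b", "i",
                   "meta", "div", "span"}


class MetaNameSketch(HTMLParser):
    """Collect text inside tags that are not known HTML tags."""

    def __init__(self):
        super().__init__()
        self.open_metas = []   # unknown tags currently open
        self.meta_text = {}    # tag name -> list of text chunks

    def handle_starttag(self, tag, attrs):
        # An unknown tag is treated as the start of a metaname.
        if tag not in KNOWN_HTML_TAGS:
            self.open_metas.append(tag)
            self.meta_text.setdefault(tag, [])

    def handle_endtag(self, tag):
        if tag in self.open_metas:
            self.open_metas.remove(tag)

    def handle_data(self, data):
        # Text belongs to every pseudo-metaname that is currently open.
        for tag in self.open_metas:
            self.meta_text[tag].append(data)


def collect_metanames(text):
    """Return {metaname: whitespace-normalised text} for unknown tags."""
    parser = MetaNameSketch()
    parser.feed(text)
    parser.close()
    return {name: " ".join("".join(parts).split())
            for name, parts in parser.meta_text.items()}
```

Feeding it a line such as globalPackage.author = '<meta_author>Gordon Jessop</meta_author>'; yields the text under the name meta_author, while a real tag such as <b> is left alone.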
Back to your source file. What exactly are you expecting to search for?
Once you know that you can adjust your content as necessary to make that
This is what I'd do: I would take DirTree.pl and use that to grab your
files. Then parse each file with regular expressions, extracting what you
need. Then format the result as XML or HTML and send it off to swish.
That way you have full control over what is indexed, and under what
metanames and properties. Does that make sense?
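A minimal sketch of that pipeline, written in Python rather than adapted from DirTree.pl, might look like the following. It assumes the markers appear exactly as in your sample (<title>...</title> and <meta_NAME>...</meta_NAME>), and it assumes the Path-Name / Content-Length / Document-Type header layout of the prog input method as I recall it from the docs, so check that against your version. The function names are hypothetical.

```python
import html
import os
import re

# Assumption: markers look exactly like the ones in the sample source file.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)
META_RE = re.compile(r"<meta_(\w+)>(.*?)</meta_\1>", re.DOTALL)


def extract_fields(source):
    """Pull the title and the meta_* values out of one source file."""
    match = TITLE_RE.search(source)
    title = " ".join(match.group(1).split()) if match else ""
    metas = {name: " ".join(value.split())
             for name, value in META_RE.findall(source)}
    return title, metas


def to_html(title, metas):
    """Format the extracted fields as a small HTML document with META tags."""
    meta_tags = "\n".join(
        '<meta name="{}" content="{}">'.format(html.escape(name),
                                               html.escape(value))
        for name, value in sorted(metas.items())
    )
    return ("<html><head>\n<title>{}</title>\n{}\n"
            "</head><body></body></html>\n").format(html.escape(title),
                                                    meta_tags)


def format_record(path, document):
    """Prefix one document with headers (assumed prog-method layout)."""
    body = document.encode("utf-8")
    header = ("Path-Name: {}\nContent-Length: {}\n"
              "Document-Type: HTML\n\n").format(path, len(body))
    return header.encode("utf-8") + body


def index_tree(root, out, suffix=".jsp"):
    """Walk root and write one record per matching file to a binary stream."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    title, metas = extract_fields(handle.read())
                out.write(format_record(path, to_html(title, metas)))
```

To feed swish from it, the script would end with a call such as index_tree(sys.argv[1], sys.stdout.buffer) (after importing sys), and the config would list the extracted names (here description and author) under MetaNames.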
>Note: Due to imposed constraints, I am unable to use the proper <META
>Name="name" CONTENT="content"> syntax and have settled for the option
>described in the 2.2 docs (i.e. <meta_description>...</meta_description>)
Again, I think that's only for libxml2. Can you remind me where that is in
the 2.2 docs?
Received on Tue Jan 29 17:57:00 2002