RE: ignoring words inside form elements

From: Bill Moseley <moseley(at)>
Date: Thu Apr 12 2001 - 14:05:10 GMT
At 01:10 AM 04/12/01 -0700, wrote:
>Define these words as stop words.

I think Myke wanted to index those words, but only when they are not inside a <select> element.

As for speed: in my tests only four of about 600 documents had <select>
tags, so I was parsing a lot of HTML for no good reason.

Adding this right after the content-type check:

    return 1 unless $$content_ref =~ /<select/i;

made the speed about equal to running without that <select> filtering.
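
For context, here is a rough sketch of where that line could sit in a filter routine. This is not the actual filter from my tests: the sub name filter_select and its argument list are made up for illustration, and the regex substitution is only a stand-in for the real HTML parsing step. The point is that the cheap pattern match returns early, so the expensive work only runs on documents that contain a <select> tag.

```perl
use strict;
use warnings;

# Hypothetical filter routine (name and arguments are assumptions).
# It always returns 1, meaning "keep this document".
sub filter_select {
    my ($content_type, $content_ref) = @_;

    # Content-type check: leave non-HTML documents untouched.
    return 1 unless $content_type =~ m{^text/html}i;

    # Cheap pre-check: skip the expensive step when no <select> tag is present.
    return 1 unless $$content_ref =~ /<select/i;

    # Stand-in for the real HTML parsing: drop each <select>...</select> block.
    $$content_ref =~ s{<select\b.*?</select\s*>}{}gis;

    return 1;
}

my $plain = '<p>red green blue</p>';
my $form  = '<p>red</p><select><option>green</option></select><p>blue</p>';

filter_select('text/html', \$plain);
filter_select('text/html', \$form);

print "$plain\n";   # unchanged, the early return fired
print "$form\n";    # the <select> block has been removed
```

A regex is not a robust way to strip elements from real-world HTML, so treat the substitution as a placeholder for whatever parser the filter already uses.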

Bill Moseley
Received on Thu Apr 12 14:06:07 2001