There are a few improvements I would like to see:
* Search by date.
One way to do it: When indexing, store the last-mod date of the file
being indexed, then, when searching I can supply two arguments: "before"
and "after" and only get the hits between these dates. I have hacked
this in myself, using the "filesize" field for the timestamp, and just
dropping hits not within the specified interval. (Who needs "filesize"?)
The best way (for me) would be to be able to pass a date as an argument
to the indexer. (Of course this is only slightly more work than the way
mentioned above, but my time ran out...)
Actually, I thought this would be important for many sites so I couldn't
really believe this feature was missing when I looked for it.
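
To make the date idea concrete, here is a minimal sketch in Java (the
language we call swish-e from) of the filtering step: each hit carries the
last-mod time stored at indexing, and hits outside the "after"/"before"
interval are dropped. The class, field and method names are made up for
illustration and are not part of swish-e; timestamps are assumed to be
seconds since the epoch, and both bounds are treated as inclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of swish-e: filters hits by last-mod time.
final class DateFilter {

    /** One search hit. lastModified is assumed to be seconds since the epoch. */
    static final class Hit {
        final String path;
        final long lastModified;

        Hit(String path, long lastModified) {
            this.path = path;
            this.lastModified = lastModified;
        }
    }

    /** Keeps only hits with after <= lastModified <= before, in original order. */
    static List<Hit> between(List<Hit> hits, long after, long before) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            if (hit.lastModified >= after && hit.lastModified <= before) {
                kept.add(hit);
            }
        }
        return kept;
    }
}
```

Filtering after the search, as here, matches the hack described above; an
indexer that took the date as an argument would only change where the
timestamp comes from, not this step.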
* Run swish-e (search) as a server.
Yes! We call swish-e using exec() from a Java application. This has the
following drawbacks:
- Every search results in a fork() which means that there will be a copy
of the forking process in memory. If that process is very large, we have
- It takes time. Which, of course, is critical when searching.
If we had a server that the Java program could call, it would still fork
but have a much smaller memory footprint. (Or a version of swish-e
completely written in Java, but I realize that it would be to ask for
I know people who have tried to make a Java "searchserver" by trying to
wrap swish-e in a Java class by using Java Native Interface (JNI). But
since swish-e (probably) was designed to perform just one search and
then exit, it leaks memory. This means that the whole Java VM will leak
and we can't have that.
About a year ago I tried to do the same thing with ffwsearch, and it was
_no_fun_ trying to find and fix all the memory leaks.
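
The small search server suggested above could look roughly like the sketch
below: a separate, small process that accepts one query line per
connection, starts the search command as a child, and copies its output
back to the caller. This is only an illustration under assumptions: the
class name SearchRelay and the one-line-per-connection protocol are
invented here, and the command to run (the swish-e binary plus whatever
options select the index and the query) is passed in rather than spelled
out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical relay process, not part of swish-e.
final class SearchRelay {
    private final List<String> baseCommand;

    /** baseCommand is the program and fixed arguments; the query is appended last. */
    SearchRelay(List<String> baseCommand) {
        this.baseCommand = new ArrayList<>(baseCommand);
    }

    /** Handles one connection: read a query line, run the command, send back its output. */
    void serveOne(ServerSocket server) throws IOException, InterruptedException {
        try (Socket client = server.accept()) {
            BufferedReader in = new BufferedReader(
                new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
            String query = in.readLine();
            if (query == null) {
                return; // caller disconnected without sending a query
            }

            // The query is passed as one argument; no shell is involved.
            List<String> command = new ArrayList<>(baseCommand);
            command.add(query);
            Process child = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
            child.getOutputStream().close(); // the child gets no stdin

            try (InputStream results = child.getInputStream()) {
                OutputStream out = client.getOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = results.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.flush();
            }
            child.waitFor();
        }
    }
}
```

The large application would then open a socket to this relay instead of
calling exec() itself, so the process that forks is the small one. A real
server would loop over serveOne() and handle timeouts and errors, which are
left out here.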
I realize that the way we use swish-e might not be very common, and thus
the improvements I mentioned above might not seem useful enough to many
other users. Still, I think there ought to be lots of people who want to
call a search engine from their own applications without too much fuss.
/Stefan Bergstrand - Polopoly <http://www.polopoly.com>
Received on Sun Jun 11 06:11:08 2000