Re: remove entries from database

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Jun 30 2003 - 03:43:25 GMT
On Fri, Jun 27, 2003 at 06:43:24AM -0700, Aaron Bazar wrote:
> Hi!
> 
> Is there a way to remove entries from the swish-e database? It does not have
> to be an automatic way.. can the file be edited?

No, not currently.  And I doubt it could easily be edited.  Reindexing 
is usually best.


> I have many URL's in my
> database, from many different domains... I would like to remove ALL the
> entries from the database that are from one of these domains.
> 
> One idea, is to re-spider the URL in question, but give it a bogus IP
> address such that when the URL is spidered, no information is found. I could
> then merge this small file with my huge database file... would this delete
> all the entries?

If you wanted to "remove" a file, say "test.html" from the index you 
might be able to create a new "test.html" that contained a dummy word 
(I think swish-e won't index empty files) and then merge that index with 
your initial index.  That file will still be in the index, and it will 
get returned on "not" searches.

I have not looked at the merge code in quite a while.  But I suppose you 
could pass in a list of file names and have swish-e write out a new 
index with those files skipped.  merge.c is the place to look.
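
Until something like that exists, reindexing without the unwanted domain is the practical route. A rough sketch of the skip-list idea in Python, which does not touch the swish-e index or merge.c at all: it assumes you have the URLs you spider available as a plain list, and it just drops those whose host belongs to the domain you want gone. The function name drop_domain and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def drop_domain(urls, domain):
    """Return the URLs whose host is neither `domain` nor a subdomain of it."""
    domain = domain.lower().lstrip(".")
    kept = []
    for url in urls:
        # hostname is None for strings that are not absolute URLs
        host = (urlsplit(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            continue  # skip everything under the unwanted domain
        kept.append(url)
    return kept


if __name__ == "__main__":
    # Hypothetical input list; in practice this would be your own URL list.
    urls = [
        "http://www.example.com/index.html",
        "http://example.org/docs/test.html",
    ]
    for url in drop_domain(urls, "example.com"):
        print(url)
```

The remaining URLs can then be fed back to whatever does your spidering to build a fresh index.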


-- 
Bill Moseley
moseley@hank.org
Received on Mon Jun 30 03:43:36 2003