I've been using swish-e for probably three years. Worked real well, 'til
now. I've got an archive of 65,000-plus pages (360 MB, all HTML/text)
whose index comes to nearly 56 MB. It's running on a 400 MHz K6-2, 512 MB
RAM, Red Hat Linux 6.0 (a Cobalt RaQ3 server appliance). The index takes about 18
hours to generate.
Up until I indexed the site a few days ago, it was great. We average about
1 million requests a week, and it's not uncommon for 8 or 10 people to be
accessing the archive at one time, of the 100 or so that may be on the site
at any given time. Friday, though, uptime started reporting a load average
of 7.00 or so. We are normally around 1.0. When I got back from lunch,
uptime reported a load average of 91.00-plus. It was so bad we had to hard
power down the box, as it wouldn't respond to a command-line shutdown.
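
As a stopgap, one idea is to have the search front end refuse new searches once the box is already overloaded, so requests can't pile up the way they did Friday. This is only a rough sketch in Python, not anything swish-cgi.pl does today; the function name `too_busy` and the threshold of 4.0 are made up for illustration, and it assumes a Unix system where the load average is available.

```python
import os

def too_busy(max_load=4.0):
    """Return True if the 1-minute load average is above max_load.

    max_load is an assumed threshold; tune it for the machine.
    """
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages,
    # the same three numbers that uptime prints.
    one_min, _five_min, _fifteen_min = os.getloadavg()
    return one_min > max_load
```

A wrapper would call `too_busy()` before starting a search and return a "try again later" page instead of launching another process.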
Prior to the hard power down, I ran ps aux, and it returned about 100
"stalled" swish-e instances, each one consuming about 1% or so of the CPU,
and hardly any memory. After reboot, I took the search URL offline and
started testing. A single search against that index immediately
takes CPU utilization to 98%, at which point it hangs and returns nothing.
I'm using swish-cgi.pl for the Web interface. I end up killing the swish-e
process. I've deleted the initial index, and I'm regenerating the index.
Should be done tomorrow.
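
Rather than killing hung searches by hand, the caller could put a time limit on each search process so a stuck one gets killed automatically. A minimal sketch in Python, assuming the search is launched as a child process; `run_with_timeout` and the 10-second limit are hypothetical names and values, not part of swish-e or swish-cgi.pl.

```python
import subprocess

def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd (a list of arguments) and return (finished, stdout).

    If the command runs longer than timeout_s seconds it is killed
    and (False, "") is returned.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        return True, done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child and waited for it
        # before raising, so no stalled process is left behind.
        return False, ""
```

It would be called with whatever command line the search normally uses, for example `run_with_timeout(["swish-e", "-f", "archive.index", "-w", "keyword"])`, where the index file name and search word are placeholders. This doesn't fix whatever is making the search hang, but it would keep 100 stalled instances from accumulating.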
Have I hit the wall with swish-e? (I hope not.) Has anyone used swish-e for a
site/archive this large?
Received on Sun Nov 19 06:28:35 2000