Memory Problems while indexing

From: Klingensmith, Rick <klingensmith(at)>
Date: Thu Sep 11 2003 - 15:33:37 GMT
I experienced this problem when installing Swish-e and resolved it by using
the -e option and pointing the temp directory at a drive that has lots of
space (almost 100GB). Now it appears to be an issue again. I've also tried
indexing from a command window using my admin account and receive the same
error. We use the following command to run the index:

D:\ProgramFiles\Swish-E\swish-e -S http -e -c D:\ProgramFiles\Swish-E\conf\siteindex.config

The config file looks like this:


# ----- SiteIndex.config - Spider using "http" method -------

#  Please see the swish-e documentation for
#  information on configuration directives.
#  Documentation is included with the swish-e
#  distribution, and also can be found on-line
#  at

#  This example demonstrates how to use
#  the "http" method of spidering.

#  Indexing (spidering) is started with the following
#  command issued from the "d:\Program Files\Swish-e" directory:

#     swish-e -S http -c Siteindex.config

#  Note: You should have the current Bundle::LWP bundle
#  of perl modules installed.  This was tested with:

#     libwww-perl-5.53

#  ** Do not spider a web server without permission **


# Include our site-wide configuration settings:

IncludeConfigFile D:/ProgramFiles/Swish-E/conf/Settings.config

# Specify the URL (or URLs) to index:


# If a server goes by more than one name you can use this directive:

# EquivalentServer


# This defines how many links the spider should
# follow before stopping.  A value of 0 configures the spider to
# traverse all links. The default is 5
# The idea is to limit spidering, but seems of questionable use
# since depth may not be related to anything useful.

MaxDepth 10

# The number of seconds to wait between issuing
# requests to a server.  The default is 60 seconds.

Delay 1

# Skip pages with Meta tag "noindex"

obeyRobotsNoIndex yes

# (default /var/tmp)  The location of a writeable temp directory
# on your system.  The HTTP access method tells the Perl helper to place
# its files there.  The default is defined in src/config.h and depends on
# the current OS.

TmpDir D:/Inetpub/Indexes/Temp

# The "http" method uses a perl helper program to fetch each document
# from the web called "swishspider" and is included in the src directory of
# the swish-e distribution.

SpiderDirectory D:/ProgramFiles/Swish-E

# Put the index files in the Inetpub/Indexes directory

IndexFile D:/Inetpub/Indexes/SiteIndex.New.index

# end of SiteIndex Config file


I am receiving the following warning in my log files from the indexing job:

Warning: Configuration setting for TmpDir 'D:/Inetpub/Indexes/Temp' will be
overridden by environment setting 'C:\DOCUME~1\rek\LOCALS~1\Temp' which does
not exist.

When I look in the specified temp directory I find Swish-e work files, so
I'm not sure whether this is a problem or not.
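
To see which environment setting might be taking over, the temp-related variables for the account that runs the job can be listed along with whether each directory exists. The following is a minimal Python sketch and not part of Swish-e. The variable names TMPDIR, TMP and TEMP are an assumption, since the warning only says "environment setting" without naming the variable, and temp_env_report is a hypothetical helper name.

```python
import os

# Assumption: these are the variables the indexer might consult. The warning
# text only says "environment setting" and does not name the variable.
TEMP_VARS = ("TMPDIR", "TMP", "TEMP")


def temp_env_report(env=None):
    """Return (name, value, directory_exists) for each temp variable that is set."""
    env = os.environ if env is None else env
    report = []
    for name in TEMP_VARS:
        value = env.get(name)
        if value:
            # os.path.isdir is False for a missing path or a non-directory.
            report.append((name, value, os.path.isdir(value)))
    return report


if __name__ == "__main__":
    for name, value, exists in temp_env_report():
        status = "exists" if exists else "MISSING"
        print(f"{name}={value} ({status})")
```

It is probably most informative to run this under the same account and in the same way as the indexing job (for example as the scheduled task), because per-user temp paths can differ from those of an interactive session.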


The summary of the last good index run, on 9/8, looks like this:

1468 files indexed.  39839610 total bytes.  810188 total words.
Elapsed time: 00:32:05 CPU time: 00:32:05
Indexing done!


We are using the latest Windows version of Swish-e on a Windows 2000 server.


The archives and FAQ point to the -e option to fix memory issues. What have
I missed?




Richard Klingensmith
MSU Human Resources Information Systems
1407 S. Harrison Road Ste. 40
East Lansing, MI 48823
(517) 432-4636 ext. 155


Received on Thu Sep 11 15:33:46 2003