using the library: thread/process safety and memory issues

From: Jerry Asher <jerry(at)not-real.hollyjerry.org>
Date: Sat Jun 02 2001 - 17:53:53 GMT
Thanks folks, and I apologize for not checking the archives before asking
about 2.1.  (In my defense, this may be a good thing to add to the regular
website...)

I do have a thread safety question that I couldn't find in the archives 
(perhaps you can point me to a discussion.)

I want to create an AOLserver interface to swish-e.  Simplistically,
AOLserver is a GPL'd, C-based, multithreaded, Tcl-extension web server.  (I
say that because it uses Tcl 8.4 and is written in the same style as Tcl
itself.)  In AOLserver, each web request is handled by a different
thread.  Each thread gets its own Tcl interpreter, and there is a way for
the threads to share a global Tcl data structure (basically a hash table).

Ideally, I would like each thread to simply call a function that performs 
the SwishOpen, SwishSearch, SwishNext, SwishClose loop and returns a list 
of hits.

But it sounds as though:
a)  I might run into thread safety issues.
b)  I might run into memory issues.

Can someone clarify this?  What issues are likely to arise?  And just to
make sure: after a SwishClose, is the memory for that search released?

>http://sunsite.berkeley.edu:4444/INSTALL.html#Installing_the_SWISH_E_C_Library
>
>The library will allow you to embed swish into another application.  This
>allows you to avoid the forking of an external program (swish) and can be
>used to keep an index loaded between search requests, which should be
>faster.  The trade off is memory, as, for example, if you created an Apache
>module you would end up with a copy of swish in every Apache child process,
>including the memory used by the index.
>
>My limited experience when testing an embedded swish on my linux machine
>with Apache was that I didn't see much shared memory, and that linux was so
>good at caching the swish-e binary that I didn't see that much improvement
>in speed between using the library and the binary when tested with Apache
>Benchmark.

Can you clarify this?  Which binary are you referring to?  What was your
architecture?  I am not that familiar with Apache.  Am I right to think the
conventional swish solution is a CGI/Perl-based, process-forking solution?

>There's been a lot of work to make swish thread safe, with the goal of
>building a swish server, which would be a lot nicer on memory usage for
>something like Apache.

Now, regarding thread safety, there are two alternative approaches I can take:

I might:

1.  devote one thread within AOLserver just to running SWISH and have the
     other threads delegate all their searching to that one thread, multiplexing
     and coordinating by way of queues and mutexes

2.  build an external-process swish-e server (harder, but perhaps better for the
     swish-e community)
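
For concreteness, approach (1) could look roughly like the sketch below. It is only an illustration under several assumptions: it uses plain POSIX threads as a stand-in for AOLserver's own thread primitives, and run_search() is a hypothetical placeholder for the SwishOpen, SwishSearch, SwishNext, SwishClose loop (the real swish-e signatures are not shown here, so the stub just fabricates one hit). The names request, search_queue, search_submit and search_shutdown are likewise made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_HITS 8

/* One search request; owned by the thread that submits it. */
typedef struct request {
    const char *query;           /* in:  search words */
    char hits[MAX_HITS][64];     /* out: result list */
    int nhits;                   /* out: number of results */
    int done;                    /* set by the search thread */
    struct request *next;
} request;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;     /* signalled when a request is queued */
    pthread_cond_t finished;     /* broadcast when a request completes */
    request *head, *tail;
    int shutdown;
} search_queue;

/* HYPOTHETICAL stand-in for the SwishOpen/SwishSearch/SwishNext/
 * SwishClose loop; it only fabricates a single hit. */
static void run_search(request *r) {
    if (strlen(r->query) == 0) { r->nhits = 0; return; }
    snprintf(r->hits[0], sizeof r->hits[0], "hit for '%s'", r->query);
    r->nhits = 1;
}

void search_queue_init(search_queue *q) {
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    pthread_cond_init(&q->finished, NULL);
    q->head = q->tail = NULL;
    q->shutdown = 0;
}

/* The single thread that is allowed to touch the search library. */
void *search_thread(void *arg) {
    search_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->head == NULL && !q->shutdown)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->head == NULL) break;          /* shut down and drained */
        request *r = q->head;                /* dequeue oldest request */
        q->head = r->next;
        if (q->head == NULL) q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        run_search(r);                       /* runs without the lock held */
        pthread_mutex_lock(&q->lock);
        r->done = 1;
        pthread_cond_broadcast(&q->finished);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Called from any request thread; blocks until the hits are ready. */
void search_submit(search_queue *q, request *r) {
    assert(q != NULL && r != NULL);
    r->done = 0; r->nhits = 0; r->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = r; else q->head = r;
    q->tail = r;
    pthread_cond_signal(&q->nonempty);
    while (!r->done)
        pthread_cond_wait(&q->finished, &q->lock);
    pthread_mutex_unlock(&q->lock);
}

/* Ask the search thread to exit once the queue is empty, then join it. */
void search_shutdown(search_queue *q, pthread_t tid) {
    pthread_mutex_lock(&q->lock);
    q->shutdown = 1;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(tid, NULL);
}
```

A request thread would fill in a request with its query, call search_submit(), and read hits[0..nhits-1] when it returns.  Because only search_thread() ever calls run_search(), the library never sees concurrent callers, at the cost of running all searches one at a time.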

Implementing (1) is pretty trivial in AOLserver, though I suspect
performance will be worse than with the ideal solution above, since every
search is serialized through a single thread.  Implementing (2) is probably
more of a long-run solution, but maybe not.  An oddly phrased question: how
stable is the swish-e library?  Am I likely to experience memory corruption
in the swish-e data structures (or elsewhere in my process) by using the 2.1
library?  If so, I would prefer to create the external-process swish-e
server today!

So, regarding a swish-e server, does anyone have a suggested
architecture?  And one final question: years ago, I heard of some Unix
utility that you could wrap around a library to turn it into a simple
daemon.  Does anyone have a clue what it might have been?

Thanks,

Jerry

=====================================================
Jerry Asher                       jerry@hollyjerry.org
1678 Shattuck Avenue Suite 161    Tel: (510) 549-2980
Berkeley, CA 94709                Fax: (877) 311-8688
Received on Sat Jun 2 18:07:17 2001