I'm trying to use Swish-e for a project in the Library at San Francisco
State and am beginning to wonder if it is the right tool. Sorry for the
long post, but I think some background will be helpful.
We get a list of journals from a vendor, in either HTML or XML, covering the
titles for which we have online access to full-text articles. We're talking
about 20,000 to 40,000 journals. The list is exported as a separate document
for each letter of
the alphabet, where A--.html has all the journals that start with the
letter "A". I've indexed both the HTML and XML versions of the files and
can search them using the Swish.cgi program.
The problem is how the found set is returned. Searching for American
Chiropractor, for instance, tells me that the journal is found in A--.html.
But I can't get Swish-e to return any of the more useful data elements:
Journal title, ISSN, Coverage, Source, which are all present in the indexed
files. This seems like a situation where the structured nature of XML
should be useful, so I've focused on working with XML Docs. One problem may
be that the format the vendor uses is xml-marc, which seems to give Swish-e
some trouble. Here's a snippet of what the data looks like:
<datafield tag="022" ind1="" ind2="">
<datafield tag="245" ind1="" ind2="4">
  <subfield code="a">The American chiropractor</subfield>
<datafield tag="210" ind1="" ind2="">
  <subfield code="a">AMERICAN CHIROPRACTOR</subfield>
<datafield tag="090" ind1="" ind2="">
<datafield tag="866" ind1="" ind2="">
  <subfield code="x">Alt-HealthWatch:Full Text</subfield>
  <subfield code="a"> Availability: from 1998</subfield>
I've experimented with XMLClassAttributes and UndefinedXMLAttributes,
without much luck.
What I'd like to see is a search result like this:
AMERICAN CHIROPRACTOR (0194-6536)
Availability: from 1998
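To make the goal concrete, here is a minimal Python sketch of pulling those
fields out of one journal record and formatting them as above. It is not a
Swish-e solution, only an illustration of the mapping I'm after. Several
things in it are assumptions rather than facts about the vendor data: that
each journal sits inside its own <record> element, that the ISSN is in
subfield "a" of the 022 field, and that the elements carry no XML namespace.
The helper names (subfield, format_record, format_all) are made up for this
example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The <collection>/<record> wrappers and the 022
# subfield value are assumptions; the other fields follow the snippet above.
SAMPLE = """<collection>
  <record>
    <datafield tag="022" ind1="" ind2="">
      <subfield code="a">0194-6536</subfield>
    </datafield>
    <datafield tag="245" ind1="" ind2="4">
      <subfield code="a">The American chiropractor</subfield>
    </datafield>
    <datafield tag="210" ind1="" ind2="">
      <subfield code="a">AMERICAN CHIROPRACTOR</subfield>
    </datafield>
    <datafield tag="866" ind1="" ind2="">
      <subfield code="x">Alt-HealthWatch:Full Text</subfield>
      <subfield code="a"> Availability: from 1998</subfield>
    </datafield>
  </record>
</collection>"""


def subfield(record, tag, code):
    """Return the stripped text of the first matching subfield, or ''."""
    node = record.find(f"./datafield[@tag='{tag}']/subfield[@code='{code}']")
    if node is None:
        return ""
    return (node.text or "").strip()


def format_record(record):
    """Build the two-line display entry for one journal record."""
    # Prefer the abbreviated title (210), fall back to the full title (245).
    title = subfield(record, "210", "a") or subfield(record, "245", "a")
    issn = subfield(record, "022", "a")
    coverage = subfield(record, "866", "a")
    heading = f"{title} ({issn})" if issn else title
    return f"{heading}\n{coverage}"


def format_all(xml_text):
    """Return one display entry per <record> found in the XML text."""
    root = ET.fromstring(xml_text)
    return [format_record(record) for record in root.iter("record")]


if __name__ == "__main__":
    for entry in format_all(SAMPLE):
        print(entry)
```

If the per-letter files really are organized one record per journal, the same
loop could instead write each record out as its own small document, so that a
search hit corresponds to a single journal rather than to all of A--.html.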
Am I barking up the wrong tree trying to get this to work with Swish-e?
Digital Systems Design and Development Coordinator
J. Paul Leonard Library, San Francisco State University
415-338-2285 | email@example.com
Received on Thu Feb 12 14:59:18 2004