OK, some information:
I'd rather not give online access right now, because it runs on a
workstation and I don't want to put too much load on it.
The wrapper is part of our own content management engine. I am currently
trying to extract just the Swish wrapper from the servlets package and
put it in a .jsp, which will be easier to distribute and modify.
It needs a servlet runner (I tried JRun 3.1 on NT and Tomcat 4 on
Linux successfully) and the following libraries:
Jakarta ORO - Java RegExp
Xerces - XML Parser
Xalan - XSLT Processor
JTidy - for HTML encode/decode
Here is what the import section looks like:
<%@page import="java.io.*"%>
<%@page import="java.util.*"%>
<%@page import="org.apache.oro.text.perl.*"%>
<%@page import="org.apache.oro.text.regex.*"%>
<%@page import="org.w3c.tidy.EntityTable"%>
<%@page import="javax.xml.transform.Source"%>
<%@page import="javax.xml.transform.Result"%>
<%@page import="javax.xml.transform.stream.StreamSource"%>
<%@page import="javax.xml.transform.stream.StreamResult"%>
<%@page import="javax.xml.transform.Transformer"%>
<%@page import="javax.xml.transform.TransformerException"%>
<%@page import="javax.xml.transform.TransformerFactory"%>
<%@page import="javax.xml.transform.TransformerConfigurationException"%>
<%@page import="javax.xml.transform.Templates"%>
<%@page import="org.xml.sax.SAXException"%>
There are two utility classes (made in Capella :)):
StringUtils, with the following methods: nullToValue, HTMLDecode, split
and replace (regexp based)
XMLUtils, with the following method: applyXSL
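The XMLUtils source is not included in this post. As a rough idea only, an applyXSL helper built on the javax.xml.transform classes imported above could look something like the sketch below. The class name XslSketch and the (String xml, String xsl) signature are assumptions for illustration, not the actual Capella code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for XMLUtils; the real class is not shown in this post.
final class XslSketch {

    /** Applies the stylesheet text to the XML text and returns the result as a string. */
    static String applyXSL(String xml, String xsl) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet from its text form.
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        // Run the transformation and capture the output in memory.
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }
}
```

If the same stylesheet is applied on every request, compiling it once into a javax.xml.transform.Templates object and reusing that would avoid re-parsing it each time.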
The main class reads a config file and puts default values in a
hashtable (swish exec path, catalog, params...)
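To make the idea concrete, here is a minimal sketch of loading such a config with defaults. It assumes a java.util.Properties style file. The key names (swish.path, swish.catalog, swish.maxResults) and the default values are made up for the example and are not the real ones.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Hashtable;
import java.util.Properties;

// Hypothetical config loader; key names and defaults are illustrative only.
final class SwishConfigSketch {

    /** Returns a table of settings: defaults first, then values from the config file. */
    static Hashtable<String, String> load(Reader source) throws IOException {
        Hashtable<String, String> config = new Hashtable<>();
        // Assumed defaults, used when the file does not set a key.
        config.put("swish.path", "swish-e");
        config.put("swish.catalog", "index.swish-e");
        config.put("swish.maxResults", "10");

        Properties file = new Properties();
        file.load(source);
        // Values found in the file override the defaults.
        for (String key : file.stringPropertyNames()) {
            config.put(key, file.getProperty(key));
        }
        return config;
    }
}
```

Note that backslashes are escape characters in a properties file, so a Windows path would need doubled backslashes or forward slashes.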
I use the following method to get the results from Swish (Windows here;
Linux is slightly different):
// params uses the -x option syntax of Swish, like this (in the config
// file):
String params =
"\"<swishrank>\\t<swishdocpath>\\t<swishtitle>\\t<swishdescription>\\t<description>\\t<swishdocsize>\\t<swishlastmodified>\\n\"";
//that way, the list of fields is dynamic and the XML too!
String[] command = new String[3];
command[0] = "cmd.exe";
command[1] = "/c";
command[2] = swishPath + " -m " + maxResults + " -b " + recOffset + " -f "
    + catalog + " -w \"" + words + "\" -x " + params;
Process process = Runtime.getRuntime().exec(command);
//Then I use a buffered reader like this:
BufferedReader input = new BufferedReader(new
InputStreamReader(process.getInputStream()));
//iterating over the results
String line;
while ((line = input.readLine()) != null) {
//parsing, building XML string (use of StringUtils)
}
//applying XSL transformation (use of XMLUtils)
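The parsing step inside the loop is not shown in this post. A minimal sketch of how one tab-separated output line could be turned into a <result> element follows. The class and method names are hypothetical, the element names are passed in by the caller, and the escaping is done with plain string replacement rather than StringUtils. It also assumes that Swish prints header lines starting with "#", error lines starting with "err:" and a final "." line, which have to be skipped.

```java
import java.util.List;

// Hypothetical helper; not the actual wrapper code.
final class ResultLineSketch {

    /** True for lines that carry a result, false for headers, errors and the end marker. */
    static boolean isResultLine(String line) {
        return !line.isEmpty() && !line.startsWith("#")
                && !line.startsWith("err:") && !line.equals(".");
    }

    /** Escapes the characters that would break XML element content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds one <result> element from a tab-separated line and the matching field names. */
    static String toResultXml(String line, List<String> fieldNames) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        String[] values = line.split("\t", -1);
        StringBuilder xml = new StringBuilder("<result>");
        for (int i = 0; i < fieldNames.size() && i < values.length; i++) {
            String name = fieldNames.get(i);
            xml.append('<').append(name).append('>')
               .append(escapeXml(values[i]))
               .append("</").append(name).append('>');
        }
        return xml.append("</result>").toString();
    }
}
```

Because the field names come from the same config entry as the -x format string, adding a field there changes both the Swish output and the generated XML.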
What's really good with this is that you have a low-level object that
produces an XML string and you can modify the XSL to adjust your
layout. The XML has the following structure:
<search>
<page_offset>0</page_offset>
<nresults>12</nresults>
<!--will become a .jsp-->
<self_servlet>org.capella.ed.servlets.Search</self_servlet>
<words>capella + other</words>
<results>
<result>
<rank>1000</rank>
<url>www.capella.org</url>
<title>capella</title>
<size>40000 octets</size>
<!--more fields (dynamic)-->
</result>
<!--more results-->
</results>
<pos>
<from>1</from>
<to>5</to>
<on>12</on>
</pos>
<nav>
<!--prev and next page offset-->
<prev>0</prev>
<next>1</next>
<!--a random number to force a reload : search.jsp?z=0.000333-->
<z>0.000333</z>
</nav>
</search>
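The <pos> and <nav> values in the sample are consistent with simple page arithmetic on a page size of 5, although the page size itself is not stated anywhere above. Under that assumption, a sketch of the computation could be the following (PagingSketch and its methods are hypothetical names):

```java
// Hypothetical paging arithmetic; the page size of 5 is inferred from the sample XML.
final class PagingSketch {

    /** 1-based number of the first result on the page (<from>). */
    static int from(int page, int pageSize) {
        return page * pageSize + 1;
    }

    /** 1-based number of the last result on the page (<to>), capped at the total. */
    static int to(int page, int pageSize, int total) {
        return Math.min((page + 1) * pageSize, total);
    }

    /** Previous page offset (<prev>), never below the first page. */
    static int prev(int page) {
        return Math.max(page - 1, 0);
    }

    /** Next page offset (<next>), never beyond the last page. */
    static int next(int page, int pageSize, int total) {
        int lastPage = total == 0 ? 0 : (total - 1) / pageSize;
        return Math.min(page + 1, lastPage);
    }
}
```

With page 0, page size 5 and 12 results, this gives from=1, to=5, prev=0 and next=1, which matches the sample.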
So, stay tuned: I'll try to package it as soon as possible, probably
next week or so, and I'll post the .jsp then.
--
--------------------------------
Jean-Michel David
President and Technical Director
Capella Technologies
jmdavid@capella.org
514-849-1494 ext. 105
866-849-9873
Received on Wed Mar 6 18:28:39 2002