Re: The swish-d cluster system is ready for beta

From: Dave Seff <dseff(at)not-real.advisen.com>
Date: Wed Mar 16 2005 - 20:17:24 GMT
On Wed, 2005-03-16 at 09:07 -0800, Bill Moseley wrote:
> On Wed, Mar 16, 2005 at 08:43:56AM -0800, Dave Seff wrote:
> > The cluster manager connects to each swishd node and sends the search
> > query (in XML) and receives results from each node. For now the process
> > is linear as I am still working out the threading issues. It then
> > collates the results and sorts them by rank and returns them to the
> > client, also in XML format.
> 
> Does each node have a separate index?  If so, how does ranking work?
> 

Each node can have one or many indices. In my test system I had the
following:

4 machines: 3 running swishd and 1 running cluster_mgr.

Box 1 (swishd) had 1 index (newsedge.idx)
Box 2 (swishd) had 5 indices (newsedge1.idx, newsedge3.idx . . . .)
Box 3 (swishd) had 12 indices (dowjones[1 -12].idx)

The cluster_mgr sends the query based on a collection tag <Collection>.
I used the following query:

<?xml version="1.0" encoding="UTF-8"?>
<Document>
 <Collection>NEWSEDGE</Collection>
 <Collection>DOWJONES</Collection>
 <Query>Firefox</Query>
</Document>
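
For illustration, here is a minimal sketch of how a client might build
the query document above. It assumes Python and its standard
xml.etree.ElementTree module; the function name build_query is
hypothetical and is not part of swish-d. It only produces the
<Document> element, without the XML declaration line.

```python
import xml.etree.ElementTree as ET


def build_query(collections, query):
    """Build a <Document> query with one <Collection> per name.

    Hypothetical helper for illustration; not part of swish-d.
    """
    doc = ET.Element("Document")
    for name in collections:
        ET.SubElement(doc, "Collection").text = name
    ET.SubElement(doc, "Query").text = query
    # encoding="unicode" returns a str and omits the XML declaration.
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(build_query(["NEWSEDGE", "DOWJONES"], "Firefox"))
```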

Each swishd ran the search and returned its results to cluster_mgr. The
cluster_mgr then sorted the results by rank and sent them back to the
original client, looking something like the following. Notice that the
results are in reverse order by rank (highest rank first):

<Result><Path>/data/documents/NEWSEDGE/2004-05-06/28106149.200405061178.17_ef7b0000003fbcda.xml</Path><Rank>1000</Rank><Size>92003</Size><Title>(null)</Title><Index>/app/newsedge.idx</Index><Modified>2004-05-06 00:42:01 EDT</Modified><Record>1</Record><File>59971</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-10-21/31460711.200410211178.9_b11c000000626826.xml</Path><Rank>1000</Rank><Size>194465</Size><Title>(null)</Title><Index>/index/Index/newsedge_10_2004.idx</Index><Modified>2004-10-21 05:13:53 EDT</Modified><Record>1</Record><File>491465</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-09-09/30438923.20040908475.5_09590edab491c2e4.xml</Path><Rank>1000</Rank><Size>86907</Size><Title>(null)</Title><Index>/index/Index/newsedge_09_2004.idx</Index><Modified>2004-09-09 21:25:15 EDT</Modified><Record>1</Record><File>187946</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-08-10/29804513.20040801700.3_a900000000018b78.xml</Path><Rank>1000</Rank><Size>81573</Size><Title>(null)</Title><Index>/index/Index/newsedge_08_2004.idx</Index><Modified>2004-08-10 12:49:17 EDT</Modified><Record>1</Record><File>166248</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-09-22/30711226.20040922665.1_b53f0ad8e9934f5a.xml</Path><Rank>987</Rank><Size>69831</Size><Title>(null)</Title><Index>/index/Index/newsedge_09_2004.idx</Index><Modified>2004-09-22 09:56:42 EDT</Modified><Record>2</Record><File>438835</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-08-19/29995135.20040818700.25_9800000000016d52.xml</Path><Rank>883</Rank><Size>38341</Size><Title>(null)</Title><Index>/index/Index/newsedge_08_2004.idx</Index><Modified>2004-08-19 10:50:16 EDT</Modified><Record>2</Record><File>338936</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-09-16/30599975.20040916475.5_26f304bada84f229.xml</Path><Rank>867</Rank><Size>51296</Size><Title>(null)</Title><Index>/index/Index/newsedge_09_2004.idx</Index><Modified>2004-09-16 21:14:40 EDT</Modified><Record>4</Record><File>336312</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-09-29/30891692.20040928475.5_02770587ab25d297.xml</Path><Rank>807</Rank><Size>54607</Size><Title>(null)</Title><Index>/index/Index/newsedge_09_2004.idx</Index><Modified>2004-09-29 20:27:26 EDT</Modified><Record>6</Record><File>605972</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-05-08/28164061.200405071480.55_9e8f000e380d5a36.xml</Path><Rank>723</Rank><Size>9619</Size><Title>(null)</Title><Index>/app/newsedge.idx</Index><Modified>2004-05-08 20:40:46 EDT</Modified><Record>2</Record><File>110972</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-05-07/28155740.20040506700.49_da2a002b0521b5cd.xml</Path><Rank>723</Rank><Size>11487</Size><Title>(null)</Title><Index>/app/newsedge.idx</Index><Modified>2004-05-07 11:26:08 EDT</Modified><Record>3</Record><File>104208</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-05-06/28108027.200405061178.25_1a0d0000003c8feb.xml</Path><Rank>723</Rank><Size>48468</Size><Title>(null)</Title><Index>/app/newsedge.idx</Index><Modified>2004-05-06 02:19:52 EDT</Modified><Record>4</Record><File>61759</File></Result>
<Result><Path>/data/documents/NEWSEDGE/2004-08-25/30117427.20040825475.5_128f046688013264.xml</Path><Rank>723</Rank><Size>47333</Size><Title>(null)</Title><Index>/index/Index/newsedge_08_2004.idx</Index><Modified>2004-08-25 20:47:15 EDT</Modified><Record>4</Record><File>451078</File></Result>
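
The collate-and-sort step described above can be sketched roughly as
follows. This is not the actual cluster_mgr code, which is not shown in
this message; it is a Python illustration that assumes each node returns
a run of <Result> elements like the ones above. The names parse_results
and merge_by_rank are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_results(xml_fragment):
    """Parse a run of <Result> elements into a list of dicts.

    The fragment has no single root element, so it is wrapped in a
    temporary <Results> root before parsing.
    """
    root = ET.fromstring("<Results>" + xml_fragment + "</Results>")
    return [
        {field.tag: field.text for field in result}
        for result in root.findall("Result")
    ]


def merge_by_rank(per_node_results):
    """Combine the hit lists from all nodes, highest rank first.

    Python's sort is stable, so hits with equal rank keep the order
    in which the nodes were queried.
    """
    merged = [hit for hits in per_node_results for hit in hits]
    merged.sort(key=lambda hit: int(hit["Rank"]), reverse=True)
    return merged
```

Since the nodes are currently queried one after another, a single sort
over the combined list is enough for a sketch like this.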


-- 
--
"It's too bad carbon dioxide isn't flammable, otherwise we would all be
fire-breathing people." --Dave Seff
Received on Wed Mar 16 12:17:28 2005