Net Virtual Mailing Lists scribbled on 7/7/05 4:46 AM:
>
> . Yet I can do things like "category=bc" and get a result....
>
>
> I originally tried doing:
>
> <listing>
> <id>278232</id>
> <category>a</category>
> <category>a.b</category>
> <category>a.b.b</category>
> <category>a.b.b.g</category>
> <category>a.d</category>
> <category>a.d.c</category>
> <category>a.d.c.bc</category>
> </listing>
>
> . but this didn't seem any better.... I feel as though I am missing
> something very basic here, might you know what it is?....
>
You need to add a period to the valid WordCharacters. See the *Characters
config params.
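Something along these lines in your config file is what I have in mind. This is an untested sketch: the directive names are from the SWISH-CONFIG docs as I remember them, and the character lists are only my assumption about what you want treated as a word, so check both against the docs for your version.

```
# Treat "." as part of a word so "a.b.b.g" is indexed as a single term.
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.

# Words still have to begin and end with a letter or digit.
BeginCharacters abcdefghijklmnopqrstuvwxyz0123456789
EndCharacters abcdefghijklmnopqrstuvwxyz0123456789

# Strip stray periods at either end of a word, so a sentence-ending
# "UNIX." is still indexed as "unix".
IgnoreFirstChar .
IgnoreLastChar .
```

You will need to reindex after changing any of these.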
> What I would really like is a way to say something like "swish-e -w UNIX"
> and have it return to me something like this:
>
> a 15
> a.b 15
> a.b.b 5
> a.b.b.g 2
> a.b.b.h 3
> a.b 10
> a.b.g 10
> a.b.g.b 10
>
> .. where the number to the right is the total count of matching records
> for each category.
>
> Is what I am after here possible with Swish-E? I know that I can feed
> the output of it into a script to generate this summary, but this is slow
> work... I know nothing about how Swish-E is architected at this point, but
> it almost seems like Swish-E would need to have everything it needs to
> internally generate this summary very quickly.
Swish-e is just a text indexer. It can keep track of text, and the context
(MetaNames) in which the text is found, and can even store the text itself (as a
Property). But it doesn't have any features for summarizing results like you're
describing.
However, I can imagine some ways to still get what you want. If you know all the
categories you are interested in, you can use the API to run a series of
searches against an open index (or indexes), and it should still go pretty fast.
Example (in Perl) (UNTESTED!):

    use strict;
    use warnings;
    use SWISH::API;

    # open the index once and reuse the handle for every search
    my $swish = SWISH::API->new( 'index.swish-e' );
    $swish->AbortLastError if $swish->Error;

    my $q          = 'UNIX';
    my @categories = qw( a a.b a.b.b a.b.b.g a.b.b.h a.b.g );
    my %count;

    # one search per category, limited to the category MetaName
    for my $c (@categories)
    {
        my $results = $swish->Query( "$q and category=$c" );
        $count{$c} = $results->Hits || 0;
    }

    # do something with the count
    for my $c (@categories)
    {
        print "$c $count{$c}\n";
    }
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Thu Jul 7 05:11:26 2005