Re: maximum size of files

From: <jmruiz(at)not-real.boe.es>
Date: Wed Jan 07 2004 - 18:13:54 GMT
Hi,

Here is the test...

The Config file (dict.config):
---- cut here ----
IndexDir /your_path/dict.pl
IndexFile dict.index
DefaultContents XML2
Metanames doc
PropertyNames doc
---- cut here ----

Script dict.pl...
---- cut here ----
#!/usr/bin/perl

# Basic data

# Dictionary file with words, one word per line.
$dict='/usr/share/dict/words';

$min_words_per_file=500;
$max_words_per_file=1000;
$max_files=2000000;

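# Load words (stop early if the dictionary cannot be read)
open DICT,"<$dict" or die "Cannot open $dict: $!";
chomp(@words = <DICT>);   # strip the trailing newline from every word
$num_words = @words;      # number of words loaded
close DICT;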

srand;

for($i = 0; $i < $max_files; $i++)
{
    $this_file_words = int( rand( $max_words_per_file - $min_words_per_file + 1 ) )
        + $min_words_per_file;
    $doc = '';
    for($j = 0; $j < $this_file_words; $j++)
    {
        # rand($num_words) yields indexes 0 .. $num_words-1, so every word can be picked
        $doc .= $words[int( rand( $num_words ))].' ';
    }
    $doc = <<EOF
<?xml version="1.0" encoding="ISO-8859-1"?>
<doc>
$doc
</doc>
EOF
;
    $size = length $doc;
    $mtime = time;
    print <<EOF
Path-Name: $i
Content-Length: $size
Last-Mtime: $mtime
Document-Type: XML2

EOF
;
    print $doc;

}
---- cut here ----
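
For anyone adapting the script, here is a minimal sketch of the record layout it prints for each document: the four header lines, a blank line, then the body. The helper name make_record() is made up for illustration and is not part of swish-e. The sketch assumes the body contains only single-byte characters, so that length() matches the byte count given in Content-Length.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one record in the header-plus-body layout that dict.pl prints.
# make_record() is a hypothetical helper name, not a swish-e function.
sub make_record {
    my ($path, $body, $mtime) = @_;
    # length() counts bytes as long as $body holds no wide characters
    my $size = length $body;
    return "Path-Name: $path\n"
         . "Content-Length: $size\n"
         . "Last-Mtime: $mtime\n"
         . "Document-Type: XML2\n"
         . "\n"
         . $body;
}

# A tiny document in the same shape as the ones the script generates
my $body = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n<doc>\nalpha beta gamma \n</doc>\n};
print make_record(0, $body, time);
```

Keeping the header in one function makes it harder for Content-Length to drift from the body that is actually printed, which is the usual way this kind of input gets broken.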

To index...
swish-e -S prog -c dict.config -e -v 3
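
Before starting a full run it may help to estimate how much text the script will feed to the indexer. The sketch below only does the arithmetic implied by the settings at the top of dict.pl. The function names are made up for illustration, and the average word length of 8 is a guess, not a measured value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each file gets a uniformly random word count in [min, max],
# so the mean per file is (min + max) / 2.
sub expected_total_words {
    my ($min, $max, $files) = @_;
    return $files * ($min + $max) / 2;
}

# Rough size of the document text in bytes. $avg_word_len is an assumed
# figure; the extra 1 is the space the script appends after every word.
sub expected_total_bytes {
    my ($min, $max, $files, $avg_word_len) = @_;
    return expected_total_words($min, $max, $files) * ($avg_word_len + 1);
}

# Values taken from dict.pl; the 8 is only a guess at the average word length
my $words = expected_total_words(500, 1000, 2_000_000);
my $bytes = expected_total_bytes(500, 1000, 2_000_000, 8);
printf "about %.0f words, roughly %.1f GB of document text\n", $words, $bytes / 2**30;
```

Lowering $max_files in dict.pl is the simplest way to try a smaller run first.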

cu
Jose

On 6 Jan 2004 at 22:07, J Robinson wrote:

> --- David L Norris <dave@webaugur.com> wrote:
> > ...
> > I have attempted to create large index files on
> > Win32 using Jose's test
> > script and achieved up to a 2 GB index with 3.5 GB
> > prop files. ...
> > 
> 
> Is this test script available somewhere? I looked in
> CVS and the latest dev snapshots but didn't see it. 
> Thanks
> jrobinson
> 
Received on Wed Jan 7 18:14:04 2004