This is a change in how swish-e selects files for indexing.
Currently swish-e marks a file as "already indexed" even when that file
is skipped via FileRules. The problem is that if the same file is also
reachable under a different name (as a symbolic link), it will not be
indexed under that name either, since the file is already flagged as
"already indexed."
The new code will still only index a file once, but you can use
FileRules to set which file name will be selected for indexing.
The issue is basically this: say you have several symlinks pointing to
something real, all in the directory $HOME/foo:
    link1 -> ../real
    link2 -> ../real
    link3 -> ../real
With the existing code you basically cannot decide which link* name is
used to index the real stuff (it will be link1).
With the new code you can do this:
    FileRules filename is link1
    FileRules filename is link2
and force it to index the contents of ../real under the file name link3.
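To make the difference concrete, here is a rough Python sketch of the selection logic. It is only a model, not the actual swish-e C code: the function name select_names, the (name, file_id) pairs, and the mark_before_rules switch are all made up for illustration, and FileRules is reduced to a plain set of names to skip.

```python
def select_names(entries, skip, mark_before_rules):
    """Return the names a file ends up indexed under.

    entries: (name, file_id) pairs in directory order; file_id stands in
             for whatever identifies the real file (hypothetical).
    skip:    names excluded by the rules (a stand-in for FileRules).
    mark_before_rules: True models the old behaviour, False the new one.
    """
    seen = set()
    indexed = []
    for name, file_id in entries:
        if file_id in seen:
            continue                  # real file already handled
        if mark_before_rules:
            seen.add(file_id)         # old: flagged even if skipped below
        if name in skip:
            continue                  # excluded by the rules
        seen.add(file_id)             # new: flagged only when indexed
        indexed.append(name)
    return indexed


entries = [("link1", "real"), ("link2", "real"), ("link3", "real")]
skip = {"link1", "link2"}

print(select_names(entries, set(), True))   # ['link1']
print(select_names(entries, skip, True))    # []
print(select_names(entries, skip, False))   # ['link3']
```

In this model the only change is where the "seen" flag gets set: before the rules are checked (old) or after a name has actually been accepted (new).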
Same thing with directories. Say you have this structure:
    April
        april_doc  (report for April)
        next_month -> ../May  (link to the next month)
    May
        may_doc
        next_month -> ../June
    June
        june_doc
The old swish-e way would index these docs:
    April/april_doc
    April/next_month/may_doc
    April/next_month/next_month/june_doc
The May and June directories are then not processed under their own
names because they have already been visited by following the
"next_month" symlinks.
If you tried to exclude "next_month" with
    FileRules dirname contains next_month
the old code would skip the "May" documents entirely: "April/next_month"
was skipped by the FileRules setting, but "May" was still flagged as
already indexed because swish-e had visited the symlink
April/next_month, which points to May.
So, with the updated code and the above FileRules setting to
skip "next_month", it will index:
    April/april_doc
    May/may_doc
    June/june_doc
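The directory case can be modelled the same way. Again this is only a Python sketch under stated assumptions, not how swish-e is implemented: TREE, ROOTS and walk are invented names, the tree is kept in memory rather than read from disk, and "dirname contains" is simplified to a substring test on the last path component.

```python
# Each directory lists (entry, target); target is None for a document,
# or the name of the real directory a symlink points to.
TREE = {
    "April": [("april_doc", None), ("next_month", "May")],
    "May": [("may_doc", None), ("next_month", "June")],
    "June": [("june_doc", None)],
}
ROOTS = ["April", "May", "June"]   # order in which directories are visited


def walk(tree, roots, skip_substring, mark_before_rules):
    """Return indexed document paths; mark_before_rules=True is the old way."""
    seen = set()
    indexed = []

    def visit(path, real_dir):
        if real_dir in seen:
            return                      # real directory already handled
        if mark_before_rules:
            seen.add(real_dir)          # old: flagged even if skipped below
        last = path.rsplit("/", 1)[-1]
        if skip_substring and skip_substring in last:
            return                      # excluded by the rule
        seen.add(real_dir)              # new: flagged only when processed
        for entry, target in tree[real_dir]:
            child = path + "/" + entry
            if target is None:
                indexed.append(child)
            else:
                visit(child, target)    # follow the symlink

    for root in roots:
        visit(root, root)
    return indexed


print(walk(TREE, ROOTS, None, True))
# ['April/april_doc', 'April/next_month/may_doc',
#  'April/next_month/next_month/june_doc']
print(walk(TREE, ROOTS, "next_month", True))
# ['April/april_doc', 'June/june_doc']
print(walk(TREE, ROOTS, "next_month", False))
# ['April/april_doc', 'May/may_doc', 'June/june_doc']
```

The middle call shows the old problem: May is flagged when the skipped symlink is visited, so its documents never get indexed under any name.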
Now, there's currently no way to force swish-e to index the same file
twice from two different symlinks. I don't really see that as a
problem.
--
Bill Moseley
moseley@hank.org
Received on Wed May 26 13:21:47 2004