Re: [swish-e] swish.cgi: Escaping special characters in $link_href

From: Bill Moseley <moseley(at)>
Date: Wed May 13 2009 - 20:19:20 GMT
On Wed, May 13, 2009 at 08:45:53PM +0200, wrote:
> Hi,
> I'm stumbling across a known deficiency in the swish.cgi script, where
> special characters in link_href are not correctly escaped. The code reads:
> # Replace spaces ***argh this is the wrong place to do this! ***
> # This doesn't really work -- file names could still have chars that need
> # to be escaped.
> $link_href =~ s/\s/%20/g;
> The comment in the code makes it clear that the author is aware of the
> problem, but apparently he hasn't found a solution yet.

Was aware, perhaps.

Frankly, I have no idea why there's anything associated with escaping
in swish.cgi.  That's needed for HTML output, not for internal data.
Maybe it's because swish.cgi was a quick hack that hasn't gone away.
Isn't search.cgi the newer, cleaner approach?

Typically, what you would do is escape in the Template Toolkit
template that displays the path:

    <a href="[% item.swishdocpath | html %]">

That does the escaping at output time, where it belongs.
item.swishdocpath would probably have to be encoded into utf8 first
(i.e. bytes, not Perl characters), although it's probably already
bytes and not characters (e.g. it was never decoded when Perl read
the data).
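
As a rough sketch of the encode-then-escape step described above, the
following percent-encodes a path for use in an href.  The helper name
escape_path_bytes and the sample path are made up for illustration and
are not part of swish.cgi.  It assumes the value is a path rather than
a full URL, and it uses only core Perl.  In a template, Template
Toolkit's uri filter is the usual tool for this kind of escaping, with
the html filter covering only the HTML-special characters.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not part of swish.cgi: percent-encode a path
# that is already a byte string so it is safe inside an href.
sub escape_path_bytes {
    my ($bytes) = @_;

    # Leave RFC 3986 unreserved characters and "/" alone; encode the rest.
    $bytes =~ s{([^A-Za-z0-9\-._~/])}{sprintf('%%%02X', ord($1))}ge;

    return $bytes;
}

# A path held as Perl characters must be encoded to UTF-8 bytes first.
# If the data was never decoded on input, skip the encode() call.
my $chars = "/docs/caf\x{e9} menu & more.html";
my $href  = escape_path_bytes(encode('UTF-8', $chars));

print "$href\n";    # prints /docs/caf%C3%A9%20menu%20%26%20more.html
```

After percent-encoding there are no &, <, > or " characters left in
the value, so it can go into the attribute as is.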

The web server that you are linking to has to assume that the links
may be in utf8, too.

Bill Moseley.
Sent from my iMutt
Received on Wed May 13 16:19:17 2009