following xml relative links with spider.pl

From: Brian Ling <brian_ling_gandj(at)not-real.yahoo.com>
Date: Tue Jan 16 2007 - 14:11:00 GMT
Hi all,

I've just started using swish-e, so apologies if this is a
newbie question.

I want to index a Subversion repository via its
web/Apache front end, but I can't get spider.pl to
follow the links in the default Subversion output.

I'm calling the spider directly with:

    /usr/local/lib/swish-e/spider.pl ./spider.conf

It finds and outputs the main Subversion page (output at
the end of this mail) but doesn't follow any of the links.
Everything appeared to install OK. I'm on OS X 10.4.8.
What am I missing?

spider.conf:
    @servers = (
        {
                email       => 'test@test.co.uk',
                base_url    => 'http://localhost/svn/',
                same_hosts  => [ '127.0.0.1' ],
                use_default_config  => 1,
                link_tags   => [qw/ a frame dir /],
        },
    );
    1;

output from spider.pl:

/usr/local/lib/swish-e/spider.pl: Reading parameters from './spider.conf'
Path-Name: http://localhost/svn/
Content-Length: 1232
Document-Type: xml*

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/xslt/svnindex.xsl"?>
<!DOCTYPE svn [
  <!ELEMENT svn   (index)>
  <!ATTLIST svn   version CDATA #REQUIRED
                  href    CDATA #REQUIRED>
  <!ELEMENT index (updir?, (file | dir)*)>
  <!ATTLIST index name    CDATA #IMPLIED
                  path    CDATA #IMPLIED
                  rev     CDATA #IMPLIED>
  <!ELEMENT updir EMPTY>
  <!ELEMENT file  EMPTY>
  <!ATTLIST file  name    CDATA #REQUIRED
                  href    CDATA #REQUIRED>
  <!ELEMENT dir   EMPTY>
  <!ATTLIST dir   name    CDATA #REQUIRED
                  href    CDATA #REQUIRED>
]>
<svn version="1.3.0 (r17949)"
     href="http://subversion.tigris.org/">
  <index rev="170" path="/">
    <dir name="SubversionNotes" href="SubversionNotes/" />
    <dir name="altirsCustomInventory" href="altirsCustomInventory/" />
    <dir name="appsMan" href="appsMan/" />
    <dir name="artwork" href="artwork/" />
    <dir name="bootDVD-CD" href="bootDVD-CD/" />
    <dir name="docs" href="docs/" />
    <dir name="dtupdates" href="dtupdates/" />
    <dir name="localMachine" href="localMachine/" />
    <dir name="netlogon" href="netlogon/" />
    <dir name="tools" href="tools/" />
  </index>
</svn>

Summary for: http://localhost/svn/
Connection: Close:     1  (1.0/sec)
      Total Bytes: 1,232  (1232.0/sec)
       Total Docs:     1  (1.0/sec)
      Unique URLs:     1  (1.0/sec)

Thanks for any pointers,

Brian 


 
Received on Tue Jan 16 06:11:17 2007