Re: Swish-e Spider and Following Image-based Links

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Jan 10 2007 - 15:41:15 GMT
On Wed, Jan 10, 2007 at 07:22:16AM -0800, James wrote:
> > Show your work.
> 
> It's not necessarily "my work" or my site.

Look, you claim that the spider isn't following a link.  Support your
claim with an example that shows the problem.  You are just wasting my
time now.


> It's not the image links.  It's the image links without alt tags.

Again, on what basis are you making that up?

> <a href="http://mydomain.org/home/" target="_self"><img
> src="images/entersite.gif" alt="" width="129" height="15" border="0"
> /></a></td>
> 
> That's the best I can do.  Swish-e won't follow that.

Right.  You can't come up with a working example, like the docs I
pointed you to suggest?  Don't you have a web server?  Oh, you don't
even need one:

moseley@bumby:~$ cat james.html 
<a href="holy_shit.html" target="_self">
<img src="images/entersite.gif" alt="" width="129" height="15" border="0"> />
</a>

moseley@bumby:~$ cat holy_shit.html 
It works!


moseley@bumby:~$ b/lib/swish-e/spider.pl default file:///home/moseley/james.html
b/lib/swish-e/spider.pl: Reading parameters from 'default'
Path-Name: file:///home/moseley/james.html
Content-Length: 129
Last-Mtime: 1168443376
Document-Type: html*

<a href="holy_shit.html" target="_self">
<img src="images/entersite.gif" alt="" width="129" height="15" border="0"> />
</a>



Path-Name: file:///home/moseley/holy_shit.html
Content-Length: 10
Last-Mtime: 1168443388
Document-Type: html*

It works!

Summary for: file:///home/moseley/james.html
     Connection: Close:   1  (1.0/sec)
Connection: Keep-Alive:   1  (1.0/sec)
           Total Bytes: 139  (139.0/sec)
            Total Docs:   2  (2.0/sec)
           Unique URLs:   2  (2.0/sec)
             text/html:   2  (2.0/sec)
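
The test case above can be recreated in a scratch directory with a few
shell commands.  This is only a minimal sketch: the directory comes from
mktemp, the <img> tag is written the way James's snippet has it, and the
spider.pl path is install-specific, so that last step is left as a comment
to adapt.

```shell
#!/bin/sh
# Recreate the two-file test case in a scratch directory.
dir=$(mktemp -d)

# Page with an image-only link whose <img> has an empty alt attribute.
cat > "$dir/james.html" <<'EOF'
<a href="holy_shit.html" target="_self">
<img src="images/entersite.gif" alt="" width="129" height="15" border="0" />
</a>
EOF

# Link target.
echo 'It works!' > "$dir/holy_shit.html"

# Sanity check: list the href values present in the page.
links=$(grep -o 'href="[^"]*"' "$dir/james.html")
echo "$links"

# Then point the spider at the file, as in the transcript above.
# The path below is a placeholder for wherever spider.pl is installed:
# /path/to/spider.pl default "file://$dir/james.html"
```

If the spider output then lists both files, as it does in the transcript,
the empty alt attribute is not what stops the link from being followed.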

-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Wed Jan 10 07:41:17 2007