new version of swishspider

From: Ron Samuel Klatchko <rsk(at)not-real.brightmail.com>
Date: Fri Feb 18 2000 - 23:30:49 GMT
Okay, I have a new version of swishspider that defaults the content-type
to text/html when the header is not set.  I've included a context diff in
this message and will send the entire new swishspider in a separate
message as an attachment.  Note that this is my working version of
swishspider, so it also contains half of the fix for content-types that
carry charset information (the other half of that fix is in http.c and
can be found on the patches page).

*** swishspider Fri Feb 18 15:20:06 2000
--- swishspider.new     Fri Feb 18 15:19:44 2000
***************
*** 22,27 ****
--- 22,32 ----
  my $response = $ua->simple_request( $request );
  
  #
+ # Get the content-type and default to html if it isn't set
+ #
+ my $content_type = $response->header( "content-type" ) || "text/html";
+ 
+ #
  # Write out important meta-data.  This includes the HTTP code. Depending on the
  # code, we write out other data.  Redirects have the location printed, everything
  # else gets the content-type.
***************
*** 29,35 ****
  open( RESP, ">$localpath.response" ) || die( "Could not open response file $localpath.response" );
  print RESP $response->code() . "\n";
  if( $response->code() == RC_OK ) {
!     print RESP $response->header( "content-type" ) . "\n";
  } elsif( $response->is_redirect() ) {
      print RESP $response->header( "location" ) . "\n";
  }
--- 34,40 ----
  open( RESP, ">$localpath.response" ) || die( "Could not open response file $localpath.response" );
  print RESP $response->code() . "\n";
  if( $response->code() == RC_OK ) {
!     print RESP $content_type . "\n";
  } elsif( $response->is_redirect() ) {
      print RESP $response->header( "location" ) . "\n";
  }
***************
*** 47,53 ****
      print CONTENTS $contents;
      close( CONTENTS );
  
!     if( $response->header("content-type") eq "text/html" ) {
        open( LINKS, ">$localpath.links" ) || die( "Could not open links file $localpath.links\n" );
        $p = HTML::LinkExtor->new( \&linkcb, $url );
        $p->parse( $contents );
--- 52,58 ----
      print CONTENTS $contents;
      close( CONTENTS );
  
!     if( substr($content_type, 0, length("text/html")) eq "text/html" ) {
          open( LINKS, ">$localpath.links" ) || die( "Could not open links file $localpath.links\n" );
          $p = HTML::LinkExtor->new( \&linkcb, $url );
          $p->parse( $contents );
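
For anyone who wants to try the two checks from the patch in isolation,
here is a minimal standalone sketch.  The helper names
effective_content_type and is_html are made up for illustration and do
not appear in swishspider; the logic is meant to mirror the "||" default
and the substr() prefix test above, working on a plain header string
rather than an LWP response object.

```perl
use strict;
use warnings;

# Hypothetical helper (not in swishspider): return the header value, or
# "text/html" when the header is missing or empty, as the patch does
# with "||".
sub effective_content_type {
    my ($header) = @_;
    return $header || "text/html";
}

# Hypothetical helper (not in swishspider): true when the value starts
# with "text/html", so trailing parameters such as "; charset=..." do
# not defeat the comparison.  This is the same prefix test as the patch.
sub is_html {
    my ($content_type) = @_;
    return substr( $content_type, 0, length("text/html") ) eq "text/html" ? 1 : 0;
}
```

Like the patch, the prefix test is case-sensitive and accepts any value
that merely begins with "text/html", so a stricter comparison may be
worth considering if servers send unusual headers.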

moo
------------------------------------------------------------
           Ron Samuel Klatchko - Software Jester
            Brightmail Inc - rsk@brightmail.com
Received on Fri Feb 18 18:34:44 2000