Re: swishspider > jswish > No preview available

From: Bryan Heidorn <heidorn(at)not-real.alexia.lis.uiuc.edu>
Date: Wed Apr 05 2000 - 20:52:36 GMT
At 06:28 AM 4/5/00 -0700, Benedict wrote:
>Question:
>
>using swishspider + jswish = No preview available. ?
>
>using filesystem + jswish = O.K.
>
>Any suggestion
>
>Thank You
>
>Benedict LU
> 

A student in one of my classes wrote this patch for that problem.
He said it was OK to post it.

> jkim39@alexia.lis.uiuc.edu wrote: 
Hi. 

I got a question from a classmate about how to make the preview function work
with URL-type links. 

Here is my solution. I modified the "head" subroutine in the cgi script. The
solution is basically to make the "head" subroutine use two different open
methods for the two types of links (regular file path and HTTP URL). 

The new head subroutine uses the "open" function for regular file paths (this
part is already in the script) and the "get" function for HTTP URLs. 
The "get" function is part of the LWP::Simple package. 

Here is my code: 

use CGI qw(:standard); 

# JS begin 
# this package is needed to create a preview for a HTML file 
use LWP::Simple; 
# JS end 

# JS begin 
# Modified subroutine 
# If the link starts with "http", this routine assumes the link is a URL
# and opens it using the HTTP get method instead of the regular file open.
sub head 
{ 
    my($houtput); 
    my($link, $content); 

    $houtput = ""; 
    $link = $_[0]; 

    if ($link =~ /\Ahttp/i) 
    { 
        # Fetch the page over HTTP; get returns undef on failure.
        if (!defined ($content = get $link)) 
        { 
            $houtput = "Can't open $_[0].\n Please notify $swishiadmin.\n"; 
        } 
        else 
        { 
            $houtput = substr ($content, 0, 1000); 
        } 
    } 
    else 
    { 
        if (!open(input, "< $_[0]")) 
        { 
            # You can handle this how you like - this is the case when a 
            # file is not readable because of the world read permission, 
            # yet it has been included in the index. Either take it out 
            # of the index or get your permissions straightened out! 

            # I prefer that people just tell me about it. 

            $houtput = "Can't open $_[0].\n Please notify $swishiadmin.\n"; 
        } 
        else 
        { 
            # read input, $houtput, 2000; 
            read input, $houtput, 1000; 
            # Close the handle only when the open succeeded.
            close (input); 
        } 
    } 
    return ($houtput); 
} 
# JS end 

Jisung. 
---------------------
heidorn (P Heidorn) wrote: 

There is a problem with using the preview function in search.cgi for http
indexes. Search.cgi tries to replace the string "REPLACE PREVIEW" with the
first line of the file listed in the "REPLACE LINK" variable. This works fine
for local file system indexes, but search.cgi does not have the smarts to read
files beginning with "http:" from the web rather than the native disk. 

There are several options to "fix" it. 

The easiest, but not prettiest, is to remove the replacement variable, REPLACE
PREVIEW. 

A little better is to modify search.cgi very slightly so that it does not print
the error message when a file cannot be opened. 
In my file it is line 515 of search.cgi. It is 
$houtput = "Can't open $_[0].\n Please notify $swishiadmin.\n" 
Just comment it out. 
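
A minimal sketch of what that second option amounts to, assuming a 1000-byte preview and local files only. The name head_quiet is hypothetical and is not the subroutine in search.cgi; it simply returns an empty preview instead of the error text when the file cannot be opened:

```perl
use strict;
use warnings;

# Hypothetical variant of "head": returns up to the first 1000 bytes of a
# local file, or an empty string (no error text) if it cannot be opened.
sub head_quiet {
    my ($path) = @_;
    my $preview = "";
    open(my $fh, '<', $path) or return "";
    read($fh, $preview, 1000);
    close($fh);
    return $preview;
}
```

With this behavior, http links still get no preview, but the search results no longer show the "Can't open" message for them.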

An even better option would be to make search.cgi web-able and then check the
$link variable with the name to see if it is an http link. If so, use an http
read. There are perl libraries installed on bibiana to do this. Examples are in
the swish source directory spider file, but I don't think I'll get to do that
modification this week. 
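
The check on $link described here could be sketched roughly as follows. The helper name is_http_link is hypothetical and does not exist in search.cgi. It is a little stricter than testing for a bare leading "http", since it requires the "://" as well, which is an assumption about what the index links look like:

```perl
use strict;
use warnings;

# Hypothetical helper: true only for links that begin with an explicit
# http:// or https:// scheme, so a local path such as "httpd/docs/a.html"
# is still treated as a file.
sub is_http_link {
    my ($link) = @_;
    return $link =~ m{\Ahttps?://}i ? 1 : 0;
}
```

A caller would use the http read when is_http_link($link) is true and the ordinary file open otherwise.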




--
--------------------------------------------------------------------
  P. Bryan Heidorn    Graduate School of Library and Information Science
  pheidorn@uiuc.edu   University of Illinois at Urbana-Champaign
  (V)217/ 244-7792    501 East Daniel St., Champaign, IL  61820-6212
  (F)217/ 244-3302    http://alexia.lis.uiuc.edu/~heidorn 

Information Retrieval System Design Principle # 1
"If you don't know something, ask someone who does. 
If that does not work, as a last resort, use an 
information retrieval system"  
Received on Wed Apr 5 16:55:04 2000