Re: [swish-e] Swish-e not indexing doc or PDF files

From: Liam Buchanan <Liam.Buchanan(at)not-real.dtrdi.qld.gov.au>
Date: Wed Feb 27 2008 - 06:17:31 GMT
Hi,
I just tried indexing a PDF linked from a URL (a .cfm page) on the local
machine and got these errors:

Accept-Ranges: bytes
ETag: "0315492fcfec21:58ec"
Server: Microsoft-IIS/5.0
Content-Length: 2660012
Content-Type: application/pdf
Last-Modified: Thu, 10 Apr 2003 01:00:26 GMT
Client-Date: Wed, 27 Feb 2008 06:14:24 GMT
Client-Peer: 127.0.0.1:5865
Client-Response-Num: 1
X-Powered-By: ASP.NET

^^^^^^^^^^^^^^^ END HEADERS ^^^^^^^^^^^^^^^^^^^^^^^^^^

>> +Fetched 1 Cnt: 2 GET http://172.16.100.241/dsdweb/v3/guis/templates/content/Errmsg.pdf  200 OK application/pdf 2660012 parent:http://172.16.100.241/dsdweb/v3/guis/templates/content/testpage.cfm depth:1
http://172.16.100.241/dsdweb/v3/guis/templates/content/testpage.cfm - Using HTML2 parser -  (54 words)
http://172.16.100.241/dsdweb/v3/guis/templates/content/Errmsg.pdf - Using HTML2 parser - Error (0): PDF file is damaged - attempting to reconstruct xref table...
Error: Top-level pages object is wrong type (null)
Error: Couldn't read page catalog
 (no words indexed)

-------------

Can anyone suggest what the issue is here?

Thanks !!!
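
The "PDF file is damaged" and "Couldn't read page catalog" lines in the log above suggest that whatever bytes reached the PDF converter were not a well-formed PDF. One way to narrow that down is to save the file from the same URL and check it by hand. The following is only a minimal sketch in Python, not part of Swish-e or spider.pl; the function name `pdf_sanity_problems` is hypothetical, and the checks (a `%PDF-` header near the start, an `%%EOF` marker near the end, a length matching the `Content-Length` header) are rough heuristics rather than a full validation.

```python
from typing import List, Optional


def pdf_sanity_problems(data: bytes, expected_length: Optional[int] = None) -> List[str]:
    """Return a list of reasons the bytes do not look like a complete PDF.

    Hypothetical helper for illustration; an empty list means no problem
    was spotted by these heuristics, not that the file is valid.
    """
    problems = []

    # A PDF begins with a "%PDF-" header; look for it near the start.
    if b"%PDF-" not in data[:1024]:
        problems.append("missing %PDF- header")

    # A complete PDF carries an "%%EOF" marker near the end of the file.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF trailer")

    # Optionally compare against the Content-Length the server reported.
    if expected_length is not None and len(data) != expected_length:
        problems.append(
            f"length {len(data)} != Content-Length {expected_length}"
        )

    return problems
```

To use it, read the saved copy in binary mode (for example `open("Errmsg.pdf", "rb").read()`, where the local file name is an assumption) and pass 2660012, the `Content-Length` from the headers above, as `expected_length`. A missing header would point to the server returning something other than the PDF itself (an HTML error page, say); a missing trailer or a length mismatch would point to a truncated or altered download.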

 

-----Original Message-----
From: users-bounces@lists.swish-e.org
[mailto:users-bounces@lists.swish-e.org] On Behalf Of Peter Karman
Sent: Saturday, 23 February 2008 3:45 AM
To: Swish-e Users Discussion List
Subject: Re: [swish-e] Swish-e not indexing doc or PDF files



On 02/20/2008 07:50 PM, Liam Buchanan wrote:
> Hi,
> Thanks for the information.
> I tried to do a trace but it didn't come up with anything unusual.
> 
> Below is my spider.pl file conf
> Please let me know if there is anything in there I am missing or 
> should be taken out. The proxy reference needs to be in there for it
> to work.


I guess I'm not following you. The example you gave works? What is a
'trace'?

--
Peter Karman  .  peter(at)not-real.peknet.com  .  http://peknet.com/

_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users

Received on Wed Feb 27 01:17:37 2008