Re: [swish-e] Searching remote mail archive problem

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Thu Mar 06 2008 - 15:10:13 GMT

Tian Xinchun wrote on 3/6/08 2:16 AM:
> Hi Bill,
> 
> Thanks for your help. See below.
> 
>> ------------------------------
>>
>> Message: 6
>> Date: Wed, 5 Mar 2008 06:11:42 -0800
>> From: Bill Moseley <moseley@hank.org>
>> Subject: Re: [swish-e] Searching remote mail archive problem
>> To: Swish-e Users Discussion List <users@lists.swish-e.org>
>> Message-ID: <20080305141142.GA6428@hank.org>
>> Content-Type: text/plain; charset=utf-8
>>
>> On Wed, Mar 05, 2008 at 08:03:06PM +0800, Tian Xinchun wrote:
>>> Hi Peter,
>>>
>>> I am sorry, but I cannot quite understand what you mean. Take an example:
>>>
>>> $swish-e -c swish.conf -S prog
>>> Indexing Data Source: "External-Program"
>>> Indexing "spider.pl"
>>> External Program found: /usr/local/lib/swish-e/spider.pl
>>> /usr/local/lib/swish-e/spider.pl: Reading parameters from 'spider.conf'
>>> https://www.lbl.gov/lists.archives/theta13-eng.archive/:1: error:
>>> htmlParseStartTag: invalid element name
>>> <?xml version="1.0" encoding="ISO-8859-1"?>
>>>  ^
>>> https://www.lbl.gov/lists.archives/theta13-eng.archive/:2: error: Misplaced
>>> DOCTYPE declaration
>>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>>> ^
>> You have two errors.  That first one above is simply saying you are
>> trying to index an xml document with Libxml's *html* parser.
>> So you need to use the XML* parser type.
>>
> 
> Actually, I have tried using XML*, but I still got the same error messages.


> Thanks for the information. Is there any plan to fix it?
> 

If you can provide us with a small, reproducible test case, then we can
attempt to fix the problem.

An example document and config file are all you should need to send.
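
As a rough illustration only, a config for such a test case might look
something like the sketch below. The index file name and the use of
DefaultContents are assumptions made for the example, not details taken
from the setup quoted above; spider.pl and spider.conf are the names that
appear in the quoted output.

```
# swish.conf: hypothetical minimal test config (a sketch, not a known fix)

# Run the spider as the external program (matches the quoted output)
IndexDir spider.pl

# Have spider.pl read its settings from spider.conf
SwishProgParameters spider.conf

# Assumed name for the index written by the test run
IndexFile ./test.index

# Assumption: ask for the libxml2 XML parser by default
DefaultContents XML*
```

It would be run the same way as in the quoted example, with
swish-e -c swish.conf -S prog, and sent along with one document that
triggers the errors.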
-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Thu Mar 6 10:10:17 2008