Re: [swish-e] multiple Warnings: 'could not be encoded to charset 'ISO-8859-1'

From: Peter Karman
Date: Wed, 14 Mar 2012 08:29:35 -0500
Dr Michael Daly wrote on 3/14/12 7:40 AM:
> Maybe this is related to my previous problem, maybe not:

The .xls file errors are probably related.

> Whereby the content of web_1.conf is:
>  IndexDir spider.pl
>  SwishProgParameters default http://localhost:104
>  StoreDescription TXT 200
>  StoreDescription HTML <body> 200
> 
> invoking this via:
> # swish-e -S prog -c /share/MD0_DATA/swish-e-files/swish-e-conf/web_1.conf
> 
> outputs:
> Indexing Data Source: "External-Program"
> Indexing "spider.pl"
> External Program found: /opt/lib/swish-e/spider.pl
> Missing argument in sprintf at /opt/lib/swish-e/spider.pl line 38.
> Missing argument in sprintf at /opt/lib/swish-e/spider.pl line 38.
> /opt/lib/swish-e/spider.pl: Reading parameters from 'default'
> Warning: document 'http://localhost:104' could not be encoded to charset
> 'ISO-8859-1'

Break it down to one file and see if you can isolate the problem. For example,
fetch http://localhost:104, write its contents to a file, and index that file
directly with swish-e. If that works, you know the problem is in the spider
config. If swish-e can't index the file either, you know the problem is in your
swish-e config and/or your document.
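
If you don't have a fetch tool handy, something like the following would do for
the "write its contents to a file" step. It is only a rough sketch: the
save_page name and the page.html file name are made up for illustration, and it
saves the raw bytes untouched so that whatever encoding the server sent is
exactly what swish-e sees.

```python
import urllib.request
from pathlib import Path


def save_page(url: str, dest: str, timeout: float = 10.0) -> int:
    """Fetch url and write the raw response bytes to dest.

    Returns the number of bytes written. The bytes are not decoded or
    re-encoded, so the saved file keeps the server's original encoding.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read()
    Path(dest).write_bytes(data)
    return len(data)


# Hypothetical usage (file name is an assumption, not from the original post):
#   save_page("http://localhost:104", "page.html")
```

Then point swish-e at the saved file with -i instead of going through
-S prog and spider.pl, and see whether the same warning shows up.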

Encoding problems are common. Make sure your content is either ISO-8859-1 (or
some other single-byte encoding) or UTF-8, and be prepared for swish-e to
convert it to ISO-8859-1 internally when indexing.
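
A quick way to see which characters in a document have no ISO-8859-1
equivalent is a check along these lines. This is a sketch under assumptions: it
assumes the document is UTF-8, the latin1_problems name is hypothetical, and it
only mirrors the conversion described above rather than reproducing what
swish-e does internally.

```python
def latin1_problems(data: bytes) -> list:
    """Return (index, character) pairs that cannot be encoded as ISO-8859-1.

    Assumes data is UTF-8; a UnicodeDecodeError here means the document is
    not valid UTF-8 in the first place. ISO-8859-1 covers exactly the code
    points U+0000 through U+00FF, so anything above that is reported.
    """
    text = data.decode("utf-8")
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFF]


# "é" (U+00E9) fits in ISO-8859-1, the euro sign (U+20AC) does not.
print(latin1_problems("café".encode("utf-8")))      # → []
print(latin1_problems("price €5".encode("utf-8")))  # → [(6, '€')]
```

An empty list means every character fits in ISO-8859-1. Anything it reports
(curly quotes, the euro sign, and so on) is a candidate for the cause of the
warning.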

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Wed Mar 14 2012 - 13:29:37 GMT