Re: [swish-e] Output not readable

From: Saubhagya Srivastava <saubhagya777(at)not-real.gmail.com>
Date: Mon Sep 29 2008 - 09:37:12 GMT

Hi,

Please find the output file and the config file attached.

I'm wondering why the output appears as Japanese characters. I open it in
a Unicode word processor (SC UniPad) to view it, and Japanese is not set
anywhere in my regional settings.

Please let me know the possibilities.

Thanks and Regards,
Saubhagya
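
One possible explanation, not confirmed anywhere in this thread, is that the file is not UTF-16 text but is being decoded as UTF-16 by the editor. When single-byte text or binary data is read two bytes at a time, many of the resulting code points land in the CJK range and look like Japanese or Chinese characters. The sketch below illustrates the effect in Python; the sample bytes and both function names are hypothetical and only stand in for whatever the real file contains.

```python
def decode_as_utf16(raw: bytes) -> str:
    """Decode bytes as little-endian UTF-16, the way an editor might
    if it wrongly assumes the file is UTF-16."""
    return raw.decode("utf-16-le", errors="replace")


def cjk_fraction(text: str) -> float:
    """Return the share of characters in the CJK Unified Ideographs
    block (U+4E00 to U+9FFF)."""
    if not text:
        return 0.0
    cjk = sum(1 for ch in text if 0x4E00 <= ord(ch) <= 0x9FFF)
    return cjk / len(text)


if __name__ == "__main__":
    # Hypothetical sample: 12 bytes of plain ASCII, not real swish-e output.
    raw = b"Swish-e test"
    wrong = decode_as_utf16(raw)
    print(repr(wrong), cjk_fraction(wrong))
```

If this is the cause, opening the file with a single-byte encoding or as UTF-8 would show whether it is readable text at all. A binary file will not be readable in any text encoding.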



On Tue, May 20, 2008 at 4:00 AM, Bill Moseley <moseley@hank.org> wrote:

> On Mon, May 19, 2008 at 03:55:15PM +0530, Saubhagya Srivastava wrote:
> > Hi,
> >
> > I suppose http://validator.w3.org/ validates for XHTML. My HTML files
> > open fine in a browser and the tags are intact.
> > http://validator.w3.org/ gives an error even for many sites that run
> > perfectly.
>
> Browsers are forgiving.  That doesn't mean your markup is right.
>
>
> > 1. Output not readable, do we have some program to read that output?
>
> What output?
>
> > 2. Cannot index website, says : No such file or directory .
>
> That means you are not telling it a valid path.
>
>
> > If possible send me a complete example.
>
> You already have a complete example.  Did you check the examples in
> the documentation?
>
> I believe the INSTALL document has a few examples.  The list archives
> have many, I'm sure.
>
>
> > PROBLEM DESCRIPTION is as follows :
> >
> > C:\SWISH-E\bin>swish-e.exe -c Config_http.txt
> >
> > Indexing Data Source: "File-System"
> >
> > Indexing "http://www.download.com"
> >
> > Warning: Invalid path 'http://www.download.com': No such file or directory
>
> You need to read the documentation again.  You are telling Swish to
> index a file called http://www.download.com.  You don't have a *file*
> called that on your computer.
>
> >
> > Another config.txt is attached that indexes HTML files, but it reports
> > tag errors such as "';' expected" at some places in the HTML file, even
> > though the HTML file is perfect and opens in a browser. Following is the
> > output:
>
> The html file is not perfect because it's giving you those messages:
>
>
> >   Guide.html - Using DEFAULT (HTML2) parser -
> > D:HTML_files/Guide.html:1374:
> > error: htmlParseEntityRef: expecting ';'
> >
> >                 write &lt;&lt SettingsVersion &lt;&lt volume &lt;&lt balance;
>
>   &lt without the ";" is not a valid entity.
>
> But that's not preventing swish from indexing (you will just end up
> with a word spelled "lt" in your index).
>
> I'd recommend validating and fixing your HTML.  If you don't feel able
> to, you can change the logging level to suppress that error.
> See "ParseWarnLevel" in SWISH-CONFIG.
>
> >
> > *In this case the output generated is not readable; it has something like
> > Japanese characters. This is the main problem.*
>
> What output?  Maybe your terminal's charset does not match the output
> from swish?
>
>
>
>
> --
> Bill Moseley
> moseley@hank.org
>
> Unsubscribe from or help with the swish-e list:
>   http://swish-e.org/Discussion/
>
> Help with Swish-e:
>   http://swish-e.org/current/docs
>
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
>
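
The htmlParseEntityRef warning quoted above is triggered by an entity reference such as &lt that lacks its closing ";". As a rough aid for finding such spots before indexing, here is a minimal Python sketch. It assumes only four common entity names (lt, gt, amp, quot) are worth checking. The function name is hypothetical, and this is not how swish-e or its HTML parser detects the problem.

```python
import re

# Assumption: only these common entity names are checked.
_KNOWN = ("lt", "gt", "amp", "quot")

# Match "&name" when it is NOT followed by ";" or by another
# alphanumeric character (which would make it a longer name).
_PATTERN = re.compile(r"&(?:%s)(?![;A-Za-z0-9])" % "|".join(_KNOWN))


def find_unterminated_entities(html):
    """Return (line_number, entity) pairs for entity references
    that are missing the terminating semicolon."""
    hits = []
    for line_number, line in enumerate(html.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    # The line reported in the warning above, joined onto one line.
    sample = "write &lt;&lt SettingsVersion &lt;&lt volume &lt;&lt balance;"
    for line_number, entity in find_unterminated_entities(sample):
        print(line_number, entity)
```

Each reported spot can then be fixed by adding the missing ";", which should also stop the stray word "lt" from ending up in the index.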


Received on Mon Sep 29 09:52:23 2008