Re: [swish-e] Identity-h encoding to regular text question regarding Swish-e

From: Flynn, Peter
Date: Wed, 14 May 2014 11:03:21 +0000

On 13/05/14 22:47, Michael Lopez wrote:
> To Whom It May Concern:
> I am currently working on this project where I am running this program
> called Swish-e. This is used to index files. I have noticed that it is
> only able to index certain PDF files but not PDF files that are Chinese
> for example.
> I am using Nitro PDF3 reader to read my PDF files if that makes any
> difference.
> What I would like to know is what would be the best Linux command to use
> to convert PDF files that are in Identity-h encoding to regular text
> files? Is there even a way to do this?

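AFAIK Identity-H is not a character encoding in the usual sense but one
of the predefined CMaps in Adobe's PDF specification: it maps two-byte
codes directly to character IDs (CIDs) in the embedded font, and is
typically used for languages which have very large numbers of characters
(Chinese, Japanese, Korean, etc). The codes themselves are not Unicode,
so the text is only recoverable if the PDF also carries a ToUnicode
mapping.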

I don't know of any software that will convert such files to a standard
encoding such as UTF-8 or UTF-16.

Peter Flynn | Academic & Collaborative Technologies | University College
Cork IT Services | ☎ +353 21 490 2609
Received on Wed May 14 2014 - 11:03:26 GMT