On 13/05/14 22:47, Michael Lopez wrote:
> To Whom It May Concern:
> I am currently working on this project where I am running this program
> called Swish-e. This is used to index files. I have noticed that it is
> only able to index certain PDF files but not PDF files that are Chinese
> for example.
> I am using Nitro PDF3 reader to read my PDF files if that makes any
> difference.
> What I would like to know is what would be the best Linux command to use
> to convert PDF files that are in Identity-h encoding to regular text
> files? Is there even a way to do this?
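AFAIK Identity-H is not a character encoding in the usual sense. It is a
CMap defined by Adobe for fonts covering languages which have very large
numbers of characters (Chinese, Japanese, Korean, etc), and it maps
two-byte codes straight to character IDs in the font rather than to
Unicode.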
I don't know any software that will convert them to a standard encoding
such as UTF-8 or UTF-16.
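That said, one thing that may be worth trying is pdftotext (from
poppler-utils or Xpdf), which can write UTF-8 output. Whether it recovers
the Chinese text depends on the PDF carrying enough information to map
glyphs back to Unicode, which I am only assuming for your files. A minimal
sketch in Python follows; the function names and file names are made up
for illustration, and only the pdftotext "-enc UTF-8" option is taken from
the tool itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path):
    """Return the pdftotext argument list requesting UTF-8 output."""
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, txt_path=None):
    """Convert one PDF to a UTF-8 text file and return the output path.

    Hypothetical helper: assumes pdftotext is installed and on PATH.
    """
    pdf_path = Path(pdf_path)
    # Default to the same name with a .txt suffix next to the PDF.
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils or xpdf")
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(build_pdftotext_command(pdf_path, txt_path), check=True)
    return txt_path
```

Calling pdf_to_text("report.pdf") would write report.txt beside the PDF,
which could then be indexed as ordinary text. If the output is empty or
garbage, the PDF probably lacks the Unicode mapping and OCR may be the
only remaining route.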
Peter Flynn | Academic & Collaborative Technologies | University College
Cork IT Services | ☎ +353 21 490 2609 | ✉ pflynn(at)not-real.ucc.ie | 🌍 www.ucc.ie
Users mailing list
Received on Wed May 14 2014 - 11:03:26 GMT