View Full Version : Can phpdig index Japanese PDF file???


mynamesucks
02-16-2005, 02:38 AM
Hi,

I converted a PDF file to a TXT file with pdftotext on the Linux command line.
This is the command:
./pdftotext test.pdf test.txt
And then I can get the TXT file.

But when I convert a PDF file written in Japanese, a problem occurs.
Pdftotext only converts the English text and numbers, not the Japanese.
After converting a Japanese PDF file, I just get a blank TXT file.

Then I tried to set the encoding in the command:
./pdftotext -enc Shift-JIS test.pdf test.txt
It displays:
Error: Couldn't find unicodeMap file for the 'Shift-JIS' encoding
Error: Couldn't get text encoding

Can anyone tell me what I should do next?
Thanks indeed.
Waiting for your help!!! :cry: :cry: :cry: :cry: :cry: :cry:

Charter
02-17-2005, 05:27 AM
If you FTP the external binary itself without configuring any options, pdftotext doesn't know what to do with Japanese. See this (www.itmedia.co.jp/help/tips/linux/l0678.html) (in Japanese).
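
For anyone who cannot read the linked page, here is a rough sketch of the kind of configuration involved. It assumes the xpdf build of pdftotext and that the xpdf Japanese language support files are unpacked somewhere on the server. The directory /usr/local/share/xpdf/japanese, the output file name xpdfrc.japanese, and the helper name japanese_xpdfrc are placeholders made up for illustration, and this is not necessarily the exact procedure the linked page describes.

```shell
#!/bin/sh
# Hypothetical helper: print the xpdfrc lines that tell xpdf's pdftotext
# where the Japanese mapping files live.
# $1 = directory holding the unpacked Japanese support files (an assumption;
#      adjust it to wherever the files are on your system).
japanese_xpdfrc() {
    dir=$1
    printf 'cidToUnicode Adobe-Japan1 %s/Adobe-Japan1.cidToUnicode\n' "$dir"
    printf 'unicodeMap ISO-2022-JP %s/ISO-2022-JP.unicodeMap\n' "$dir"
    printf 'unicodeMap EUC-JP %s/EUC-JP.unicodeMap\n' "$dir"
    printf 'unicodeMap Shift-JIS %s/Shift-JIS.unicodeMap\n' "$dir"
    printf 'cMapDir Adobe-Japan1 %s/CMap\n' "$dir"
    printf 'toUnicodeDir %s/CMap\n' "$dir"
}

# Write the config file into the current directory (placeholder file name).
japanese_xpdfrc /usr/local/share/xpdf/japanese > ./xpdfrc.japanese

# Then point pdftotext at it explicitly (left commented out, since it needs
# the binary and a test PDF to be present):
# ./pdftotext -cfg ./xpdfrc.japanese -enc Shift-JIS test.pdf test.txt
```

The "Couldn't find unicodeMap file for the 'Shift-JIS' encoding" error comes up when no unicodeMap line for that encoding name is configured, so the Shift-JIS line is the one that matters for the command in the first post. The cidToUnicode and cMapDir lines are what let pdftotext extract the Japanese characters at all.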

mynamesucks
02-22-2005, 09:59 PM
Thanks Charter