PhpDig.net


jmeyerdo 01-25-2006 10:02 PM

problems with pstotext - path-problem?
 
I installed PhpDig yesterday (after trying several other search engines) for a friend's web project, and we are really impressed by this great tool!

Indexing .html and .doc files works without problems.
Unfortunately, extracting text from PDFs fails, and I could not figure out why after several hours yesterday.
This morning I could reproduce the error, but I still cannot understand this very strange behaviour. Perhaps you have an idea...

The operating system is Debian.
The spider finds the .pdf files without problems, but adds nothing from them to the database.

I checked manually on the command line over SSH.
"pstotext originalpdf.pdf" works without problems.
"pstotext mydirectory/originalpdf.pdf" also works fine.
But if I change the working directory so that the file has to be accessed with "../", the conversion fails.
For example, "pstotext ../mydirectory/originalpdf.pdf" (the path is correct) gives:
Code:

gs -r72 -dNODISPLAY -dFIXEDMEDIA -dDELAYBIND -dWRITESYSTEMDICT  -dNOPAUSE -dSAFER  /tmp/ps2tQYryQK -- '../doctest/Suchmaschinentest2.pdf'
GPL Ghostscript 8.01 (2004-01-30)
Copyright (C) 2004 artofcode LLC, Benicia, CA.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
QI 100 0 0 -100 0 84200
Error: /invalidfileaccess in --.libfile--
Operand stack:
  (../doctest/Suchmaschinentest2.pdf)
Execution stack:
  %interp_exit  .runexec2  --nostringval--  --nostringval--  --nostringval--  2  %stopped_push  --nostringval--  --nostringval--  --nostringval--  false  1  %stopped_push  --nostringval--  1  3  %oparray_pop  --nostringval--  --nostringval--  --nostringval--
Dictionary stack:
  --dict:1051/1123(ro)(G)--  --dict:0/20(G)--  --dict:71/200(L)--
Current allocation mode is local
Last OS error: 2
GPL Ghostscript 8.01: Unrecoverable error, exit code 1

I cannot understand this behaviour.
The same problem occurs when I run pstotext from the command line on the generated (and not deleted) temp files (/usr/bin/pstotext -cork ../admin/temp/49389132.tmp).

Is this a problem with pstotext?
I also tried full paths starting from "/", but got the same error.

Any help or suggestions would be greatly appreciated...
Thank you, kind regards,
Jens
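
The post above reports that pstotext succeeds when it is given a bare file name from inside the PDF's own directory, and fails for "../" paths and for full paths. A minimal Python sketch of a workaround built only on that observation: change into the file's directory and pass just the file name. The helper names `build_invocation` and `extract_text` are hypothetical, and whether the failure is related to the -dSAFER flag visible in the gs command line is an assumption the thread does not confirm.

```python
import os
import subprocess


def build_invocation(pdf_path, tool="pstotext"):
    """Return (argv, cwd) so the tool only ever sees a bare file name.

    Hypothetical helper: it relies on the observation in the post that
    "pstotext originalpdf.pdf" works when run from the file's directory.
    """
    absolute = os.path.abspath(pdf_path)  # also collapses any "../" parts
    directory, name = os.path.split(absolute)
    return [tool, name], directory


def extract_text(pdf_path, tool="pstotext", timeout_s=60):
    """Run the tool from the PDF's directory and return its raw stdout bytes."""
    argv, cwd = build_invocation(pdf_path, tool)
    result = subprocess.run(argv, cwd=cwd, capture_output=True, timeout=timeout_s)
    result.check_returncode()  # raises CalledProcessError on a non-zero exit
    return result.stdout


if __name__ == "__main__":
    # Path taken from the gs command line quoted in the post.
    print(build_invocation("../doctest/Suchmaschinentest2.pdf"))
```

This only covers manual testing from a script or shell; how PhpDig itself builds the pstotext command is not shown in the thread, so applying the same idea there would need a look at its configuration.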

jmeyerdo 01-26-2006 11:30 AM

changing pstotext --> pdftotext - spider hangs up
 
My next steps...

I have now installed pdftotext and changed PhpDig to use this tool.
Parsing PDFs works now, but unfortunately the parser hangs after the first few PDFs. On the next attempt it reads one or two more PDFs and then hangs again.
Unfortunately I cannot see which PDF triggers the error.
The file xxxxxxxx.tmp in admin/temp is 0 bytes.

All PDFs in the web directory can be converted with pdftotext directly.

Hmm, any suggestions?
Kind regards,
Jens
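
Since the post above says it is not visible which PDF makes the spider hang, one way to narrow it down outside PhpDig is to convert each file separately with a time limit and record which ones time out, fail, or produce no text. The following is a rough sketch under assumptions: the function name `check_pdfs`, the directory argument, and the 30-second limit are made up for illustration, and it only relies on pdftotext accepting "-" as the output file to write to stdout.

```python
import subprocess
from pathlib import Path


def check_pdfs(directory, converter=("pdftotext",), timeout_s=30):
    """Convert each PDF on its own and return [(file name, problem)].

    Hypothetical helper: `converter` is the command prefix to run; the PDF
    path and "-" (write text to stdout) are appended for every file.
    """
    problems = []
    for pdf in sorted(Path(directory).glob("*.pdf")):
        argv = [*converter, str(pdf), "-"]
        try:
            result = subprocess.run(argv, capture_output=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The converter did not finish in time; a candidate for the hang.
            problems.append((pdf.name, "timeout"))
            continue
        if result.returncode != 0:
            problems.append((pdf.name, f"exit code {result.returncode}"))
        elif not result.stdout.strip():
            # Comparable to the 0-byte temp file mentioned in the post.
            problems.append((pdf.name, "empty output"))
    return problems


if __name__ == "__main__":
    # "." is a placeholder; point this at the directory that holds the PDFs.
    for name, problem in check_pdfs("."):
        print(f"{name}: {problem}")
```

If every file passes here, the hang is more likely to come from how the spider invokes the converter than from a particular PDF, which matches the observation that all files convert fine with pdftotext directly.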

