indexing pdf


philippeguerind
02-20-2004, 03:46 PM
Hi from France,
Please excuse my English. I can't get phpdig to index PDF files.
I put the following lines into the config.php file.

define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','./pdftotext');
define('PHPDIG_OPTION_PDF','');
define('PHPDIG_PDF_EXTENSION','.txt');

since pdftotext.exe is located at the root. Indexing works perfectly with HTML files, and even with ASCII files, but not with PDF files. My web site is hosted on a Lycos server.
I uploaded pdftotext.exe to the root, then set its permissions to 755. When I run phpdig from the administration panel and ask it to dig a PDF file by its full path, I get a green sign next to it indicating the file is indexed. But when I search for any word inside the PDF file, I get no results.
What could I try? I looked through this forum for weeks before posting, and now I have run out of ideas.
Thanks for helping a novice.
Philippe.

tomas
02-20-2004, 04:04 PM
hi philippe,

is your server running on windows or unix/linux ?

tomas

philippeguerind
02-20-2004, 04:21 PM
Thanks. Lycos servers run Unix. I use phpdig version 1.6.4.
Philippe

Charter
02-20-2004, 04:27 PM
Hi. Does Lycos allow commands such as exec (http://www.php.net/manual/en/function.exec.php) to run on its servers?
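
One way to check this without waiting for the host is a small test script. The following is only a minimal sketch: the helper name is_listed_as_disabled is made up for illustration, and it only looks at the disable_functions setting, so other host restrictions (for example safe mode on older PHP versions) could still block exec even if it reports that exec looks available.

```php
<?php
// Hypothetical helper: true if $name appears in a comma-separated
// disable_functions list such as "system, exec, passthru".
function is_listed_as_disabled($name, $disabledList)
{
    $disabled = array_map('trim', explode(',', $disabledList));
    return in_array($name, $disabled, true);
}

// Read the host's setting; ini_get() may return false, so cast to string.
$list = (string) ini_get('disable_functions');

if (!function_exists('exec') || is_listed_as_disabled('exec', $list)) {
    echo "exec() is disabled on this server\n";
} else {
    echo "exec() looks available\n";
}
```

Upload it next to phpdig, open it in a browser, and delete it afterwards.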

philippeguerind
02-20-2004, 04:36 PM
Hi, I don't know. I just asked their support service by posting a thread. I'm waiting for the answer ...
Philippe

philippeguerind
02-20-2004, 04:49 PM
If running exec is not allowed, is there any way I could run pdftotext on my PC from a shell?
Philippe

Charter
02-20-2004, 05:38 PM
Hi. Perhaps check at the following link for a version that would work with your PC:

http://www.foolabs.com/xpdf/download.html

tomas
02-20-2004, 05:56 PM
hello,

philippe - try this setting:
define('PHPDIG_PDF_EXTENSION','');

run spider and take a look into text_content directory -
are there temp-files? are they empty?

after this test reset to:
define('PHPDIG_PDF_EXTENSION','.txt');
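
If there is no shell or FTP listing handy, the temp-file check above could be done with a short script along these lines. This is a rough sketch: the function name list_file_sizes and the ./text_content path are assumptions, so point $dir at wherever that directory sits in your install.

```php
<?php
// List every regular file in a directory with its size in bytes,
// so that empty temp files (size 0) are easy to spot.
function list_file_sizes($dir)
{
    $sizes = array();
    foreach (scandir($dir) as $name) {
        $path = $dir . '/' . $name;
        if (is_file($path)) {
            $sizes[$name] = filesize($path);
        }
    }
    return $sizes;
}

// Assumed location; adjust to your phpdig install.
$dir = './text_content';

if (is_dir($dir)) {
    foreach (list_file_sizes($dir) as $name => $bytes) {
        echo $name, ': ', $bytes, ' bytes', ($bytes === 0 ? ' (empty)' : ''), "\n";
    }
} else {
    echo "Directory not found: $dir\n";
}
```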

what is your server's php-version?

tomas

Charter
02-20-2004, 06:02 PM
OT: Thanks tomas for helping! :D

tomas
02-20-2004, 06:24 PM
hello again philippe,

in your first post you wrote "pdftotext.exe" -
it seems that you installed the dos-version on a unix-server?

the unix download is:
http://www.foolabs.com/xpdf/download.html
x86, Linux (glibc 2.2, statically linked to Motif, t1lib, and FreeType 2):
xpdf-3.00-linux.tar.gz (4544077 bytes)

tomas

philippeguerind
02-21-2004, 05:58 AM
I wasn't using the unix version of pdftotext.
Now I do. Is the line below correct, given that www is my root? It still doesn't work, but I'll keep trying ...

define('PHPDIG_PARSE_PDF','./usr/local/bin/pdftotext');
Philippe

tomas
02-21-2004, 10:50 AM
hi philippe,

i don't think so -
please try this:
1) upload: pdftotext binary into the same folder where phpdig is
2) set: 755 permissions for pdftotext and admin/temp
3) set: define('PHPDIG_PARSE_PDF','/path/to/pdftotext');
4) set: define('PHPDIG_PDF_EXTENSION','');

run spider and take a look into text_content directory -
are there temp-files? are they empty?
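
Steps 1 to 3 can also be sanity-checked from PHP before running the spider. The sketch below is only an illustration: describe_binary is a made-up helper, /path/to/pdftotext is the same placeholder as in step 3 and has to be replaced with the real absolute path, and the -v switch is assumed to print pdftotext's version information.

```php
<?php
// Hypothetical helper: report whether $path is a file that the
// web server user is allowed to execute.
function describe_binary($path)
{
    if (!is_file($path)) {
        return 'missing';
    }
    if (!is_executable($path)) {
        return 'not executable';
    }
    return 'ok';
}

// Placeholder from step 3; replace with the real absolute path.
$binary = '/path/to/pdftotext';

$status = describe_binary($binary);
echo "pdftotext: $status\n";

if ($status === 'ok') {
    // Assumption: "-v" prints version info. The exit code and output
    // are shown as-is so that any error message becomes visible.
    exec(escapeshellarg($binary) . ' -v 2>&1', $output, $exitCode);
    echo "exit code: $exitCode\n", implode("\n", $output), "\n";
}
```

If this prints "missing" or "not executable", the path in step 3 or the permissions in step 2 are the first thing to look at.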

kind regards
tomas