spider.php blocked when indexing


acti_dev
11-28-2006, 02:24 AM
Hello,
I've installed phpdig v.1.8.8 with EasyPHP on Windows.

I would like to index PDF files.
I've added the three pieces of code described in "read me before..."

When I try to index PDF files, the spider hangs:

SITE : http://192.168.1.28/
Chemins exclus :
- @NONE@


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: //.../phpdig/xpdf/pdftotext.exe
Does parse pdf exist: 1

Thanks for your help

acti_dev
11-30-2006, 02:12 AM
When I comment out this line:
//echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";

I get this result:
SITE : http://192.168.1.28/
Chemins exclus :
- @NONE@


Is result test http an array: 1
What is result test http status: PDF

Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: //.../phpdig/xpdf/pdftotext.exe
Does parse pdf exist: 1

Command is: //.../phpdig/xpdf/pdftotext.exe ../admin/temp/69288482.tmp 2>&1
Result contains: Array ( [0] => Error: Couldn't open file '../admin/temp/69288482.tmp' )
Return value is: 1

1:http://192.168.1.28/espace-dpi/directives/dir117.pdf
(temps : 00:00:01)
Pas de liens dans la table temporaire

Also, the temp file that actually exists is named 69288481.tmp (1 KB), not 69288482.tmp.
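
The "Couldn't open file" error above involves a relative path, so one thing worth checking is how that path resolves from PHP's working directory. The following is only a minimal diagnostic sketch: `describe_path` is a hypothetical helper name, not part of phpdig, and the path passed to it is simply the one from the failing command.

```php
<?php
// Hypothetical helper (not part of phpdig): report how a path resolves
// from PHP's current working directory.
function describe_path($path)
{
    return array(
        'cwd'      => getcwd(),          // directory relative paths start from
        'given'    => $path,             // the path exactly as passed in
        'resolved' => realpath($path),   // absolute path, or false if missing
        'exists'   => file_exists($path),
    );
}

// Example: the temp file named in the failing command.
print_r(describe_path('../admin/temp/69288482.tmp'));
```

If 'resolved' comes back as false, the file does not exist at that location relative to 'cwd', which would be consistent with the error reported by pdftotext.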

Charter
12-02-2006, 06:20 AM
What did you set in the config file for the following?

PHPDIG_INDEX_PDF
PHPDIG_PARSE_PDF
PHPDIG_OPTION_PDF
PHPDIG_PDF_EXTENSION

acti_dev
12-04-2006, 05:17 AM
define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','\\\\..\\..\\phpdig\\xpdf\\pdftotext.exe');
define('PHPDIG_OPTION_PDF','');
define('PHPDIG_PDF_EXTENSION','.txt');

Charter
12-04-2006, 06:05 AM
Did "Is parse pdf executable" come out as zero or blank or one? If it was zero or blank, try setting the PHPDIG_PARSE_PDF constant in the config file to the full server path instead of using a relative path. Also if you are not using PHP5, set the USE_IS_EXECUTABLE_COMMAND constant in the config file to the number zero.
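
The advice above can be sketched as a guarded check. This is an illustration only: `parser_looks_usable` is a hypothetical helper, not phpdig's own code, and it assumes that is_executable() may be missing on the platform (as far as I know, the PHP manual lists it as available on Windows only from PHP 5.0.0).

```php
<?php
// Hypothetical helper (not part of phpdig): check an external parser
// without calling is_executable() where it may be unavailable.
function parser_looks_usable($path, $useIsExecutable)
{
    if (!is_file($path)) {
        return false; // nothing at that path
    }
    if ($useIsExecutable && function_exists('is_executable')) {
        return is_executable($path);
    }
    return true; // fall back to the existence check only
}

// Example: this script file exists, and the executable check is skipped.
var_dump(parser_looks_usable(__FILE__, false));
```

Passing a false value as the second argument mirrors setting USE_IS_EXECUTABLE_COMMAND to zero: only the existence of the file is tested.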

acti_dev
12-04-2006, 07:59 AM
I'm not using PHP5, so I set define('USE_IS_EXECUTABLE_COMMAND','0'); and "Is parse pdf executable" comes out blank.
I get the output shown in my second post when I comment out this line: //echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
PHPDIG_PARSE_PDF is already a full server path (I'm not working on the server machine itself).

Charter
12-04-2006, 04:39 PM
Try running pdftotext.exe dir117.pdf from command prompt. Does it work? Are you able to index non-PDF files/HTML pages?

acti_dev
12-06-2006, 12:05 AM
pdftotext.exe runs fine from the DOS command prompt; it only fails when called from PHP. When I run a .bat file from PHP, a DOS window opens and closes, but no .txt file is created.

acti_dev
12-07-2006, 12:04 AM
It also fails for DOC and XLS files with catdoc or antiword. Only HTML pages are indexed correctly.

Charter
12-09-2006, 06:29 AM
If HTML pages are indexed, but not DOC, PDF, PPT, or XLS files, then it seems that EasyPHP might not be allowing the PHP exec function:
exec($command,$result,$retval);
I'm not familiar with EasyPHP, but perhaps the user comments on this (http://www.php.net/manual/en/ref.exec.php) page might help. Also, try to get the following script to run in EasyPHP:

<?php
echo exec('whoami');
?>
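
If the script above prints nothing, a slightly more verbose variant may show why. This is a sketch under assumptions: `try_exec` is a hypothetical helper, not part of phpdig or EasyPHP, and it only covers the case where exec is missing or listed in the disable_functions setting.

```php
<?php
// Hypothetical helper (not part of phpdig or EasyPHP): run a command via
// exec() and return the output lines and exit status for inspection.
function try_exec($command)
{
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    if (!function_exists('exec') || in_array('exec', $disabled)) {
        return array('ran' => false, 'output' => array(), 'status' => null);
    }
    $output = array();
    $status = null;
    exec($command . ' 2>&1', $output, $status); // 2>&1 also captures error text
    return array('ran' => true, 'output' => $output, 'status' => $status);
}

print_r(try_exec('whoami'));
```

A 'ran' value of false would point to exec being unavailable in the PHP configuration; a non-zero 'status' with text in 'output' would point to the command itself failing.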