pdf indexing blocks when spidering


sepult
06-19-2006, 11:32 AM
Hi,

First, congratulations on PhpDig!

I've installed PhpDig v.1.8.9 RC1 on my localhost, and everything works.
I would like to index PDF files.
I've added the 3 pieces of code from "read me before..."

When I try to index PDF files, it hangs at the "echo is_executable" line. Here is the debug output:

Is result test http an array: 1
What is result test http status: PDF

Is result test an array : 1
What is result test status : PDF
Use is executable is set to : 0
Index the pdf is set to : 1
Parse the pdf is set to : D:\phpdig\ext\pdftotext.exe
Does parse pdf exist :

It stops here, with no result.

I think the is_executable function doesn't work. Because this runs on a Windows server, I tried setting
define('USE_IS_EXECUTABLE_COMMAND','0'); in the config file, without success.

Could you please help me?
Thanks

sepult
06-19-2006, 12:18 PM
Config:

Server running Windows
PHP Version 4.4.0
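
For what it's worth, the PHP manual's changelog for is_executable() says the function only became available on Windows in PHP 5.0.0, so on PHP 4.4.0 a call to it may simply abort the script, which would match the output stopping at that line. A minimal sketch of a guard, assuming a hypothetical helper named tool_is_usable (it is not part of PhpDig):

```php
<?php
// Hypothetical helper, not a PhpDig function: decide whether an external
// tool such as pdftotext.exe can be used, without assuming that
// is_executable() exists on this PHP build.
function tool_is_usable($path, $use_is_executable)
{
    if (!is_file($path)) {
        return false; // nothing at that path
    }
    if ($use_is_executable && function_exists('is_executable')) {
        return is_executable($path);
    }
    // Assumption: when the check is disabled or unavailable,
    // an existing file is treated as usable.
    return true;
}
```

The second argument plays the role of the USE_IS_EXECUTABLE_COMMAND setting; how PhpDig itself wires that constant in is not shown here.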

sepult
06-20-2006, 03:45 AM
I saw that pascalp had the same problem.

How did you resolve it?

sepult
06-29-2006, 05:24 AM
Finally, I resolved my problem.
Here is my solution.

I'm running a Windows server with Apache 2 and can't use the "is_executable" function, nor the "cat" or "type" shell commands.

So, in the robot functions, after these lines:
if ($usetool) {
rename($tempfile1,$tempfile2);
exec($command,$result,$retval);
unlink($tempfile2);

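add:
// Read the converted text file directly instead of shelling out to "type"
$f_handler = fopen($tempfile2.$ext,'r');
if ($f_handler) {
// Stop when fgets() returns false, so no empty entry is appended at end of file
while (($line = fgets($f_handler,8192)) !== false) {
$result[] = utf8_encode($line);
}
fclose($f_handler);
}
unlink($tempfile2.$ext);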

and comment out the lines that follow:

/*if (!empty($ext)) {
$command = 'type '.$tempfile2.$ext;
exec($command,$result,$retval);
unlink($tempfile2.$ext);
}*/

It works fine for me now.
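
The same idea, reading the converted file with fopen/fgets rather than running "type" or "cat" through exec(), can be sketched as a standalone function. This is only an illustration: the name read_lines_like_exec is made up and is not a PhpDig function. It trims trailing whitespace from each line because exec() does the same for the lines it collects.

```php
<?php
// Hypothetical helper (not part of PhpDig): return the lines of a text file
// as an array, similar to what exec('type file', $result) would collect.
function read_lines_like_exec($path)
{
    $lines = array();
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        return $lines; // unreadable or missing file: return no lines
    }
    while (($line = fgets($handle, 8192)) !== false) {
        // exec() drops trailing whitespace such as "\r\n"; do the same here
        $lines[] = rtrim($line);
    }
    fclose($handle);
    return $lines;
}

// Usage with a throwaway file standing in for the pdftotext output
$tmp = tempnam(sys_get_temp_dir(), 'dig');
file_put_contents($tmp, "first line\r\nsecond line\n");
print_r(read_lines_like_exec($tmp));
unlink($tmp);
```

The encoding step from my fix (utf8_encode on each line) is left out of this sketch; add it back if the extracted text is Latin-1. The usage lines need PHP 5 or later, while the function itself only relies on calls that also exist in PHP 4.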
Bye