Win 98 + Easyphp & binary problem


sofos
12-01-2004, 02:29 AM
Hello,
Has anyone managed to set up PDF indexing with Win 98 + EasyPHP (PHP Version 4.3.3) + the latest phpdig + the "pdftotext" binary?
It is difficult to work through the recommended "checklist", because the PHP function "is_executable" does not seem to work on PHP 4.3.3 (?)

I get this:


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1

Fatal error: Call to undefined function: is_executable() in c:\web\www\phpdig\admin\robot_functions.php on line 963

:bang:
And if I remove this line (to see what happens next), I get:


Is result test http an array: 1
What is result test http status: HTML

Is result test an array: 1
What is result test status: HTML
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1
Doublon avec un document existant
1:http://localhost/Documentation/Revues/
(temps : 00:00:06)
+
niveau 1...


Is result test http an array: 1
What is result test http status: PDF

Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 1
Index the pdf is set to: 1
Parse the pdf is set to: C:\Web\cgi-bin\pdftotext.exe
Does parse pdf exist: 1

Command is: C:\Web\cgi-bin\pdftotext.exe -cork ../admin/temp/62981782.tmp2>&1
Result contains: Array ( )
Return value is: 0

2:http://localhost/Documentation/Revues/Lisezmoi.pdf
(temps : 00:00:18)

Pas de liens dans la table temporaire


Thanks for your help.
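
Note on the fatal error above: as far as I know, the PHP manual lists is_executable() as available on Windows only from PHP 5.0.0, which would explain the "undefined function" message on PHP 4.3.3. A minimal sketch of a guard follows. The helper name binary_is_usable() is hypothetical and is not part of phpdig, and the fallback check (regular, readable file) is an assumption about what is "good enough" on such a build.

```php
<?php
// Hypothetical helper, NOT part of phpdig: checks whether a converter
// binary looks usable without assuming is_executable() is defined.
function binary_is_usable($path)
{
    if (function_exists('is_executable')) {
        // Normal case: let PHP decide.
        return is_executable($path);
    }
    // Assumed fallback for builds that lack is_executable():
    // accept any regular file that can be read.
    return is_file($path) && is_readable($path);
}

// Example call with a path that does not exist.
var_dump(binary_is_usable(__FILE__ . '.missing'));
```

The point of the guard is only to avoid the fatal error; it does not prove that the binary will actually run.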

sofos
12-03-2004, 12:12 AM
OK, after some more reading, I have solved the PDF problem (apparently I wasn't the only one, and it was just a problem with the option line passed to the binary) :)

I am now stuck on a similar problem with Word documents (I use Doc2txt for the text conversion). Whenever I try to index those documents, nothing gets indexed, but two files are left behind:
a 98183891.tmp file in the admin/temp directory and a 98183892.txt file in the admin directory. By the way, the latter is the correct text conversion of the original Word document. The one in the temp directory contains only one line saying something like:
C:Web\www\phpdig\admin\temp\98183891.tmp --> "" "" \admin\98183892.txt

Has anyone experienced this?

Thanks

Charter
12-03-2004, 04:43 AM
Try setting define('PHPDIG_MSWORD_EXTENSION',''); to define('PHPDIG_MSWORD_EXTENSION','.txt'); in the config file, making sure to have the period on the .txt extension.

sofos
12-03-2004, 06:34 AM
Hi Charter, that was already set like this (actually, I duplicated the 'pdf' settings, except for the name of the binary, of course).

Charter
12-03-2004, 06:46 AM
Try running Doc2Txt from the shell and verify that filename.doc is output to filename.txt. Maybe Doc2Txt outputs to filename.doc.txt or something else?

sofos
12-03-2004, 07:10 AM
I'll try this.
Just to make sure I have been clear enough: the text conversion seems to work just fine, since the 98183892.txt file in admin (which, I guess, was previously in admin/temp) is the correct text extraction of the right Word doc.
Anyway, I'll try your advice and get back to you on Monday.
Have a good weekend

Charter
12-03-2004, 10:18 PM
OIC, so try changing:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2;

to the following:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';

and see what it says when you try to index a *.doc file.
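
For context, the change above appends 2>&1 so that anything the converter writes to standard error ends up in the captured output. A minimal sketch of that idea using PHP's exec() follows. The function name run_and_capture() is hypothetical and not phpdig code, and whether the shell honors 2>&1 depends on the platform.

```php
<?php
// Hypothetical helper, NOT phpdig code: runs a command and returns
// its output lines together with its exit status.
function run_and_capture($command)
{
    $output = array();
    $status = 0;
    // 2>&1 asks the shell to merge stderr into stdout, so error
    // messages from the binary show up in $output as well.
    exec($command . ' 2>&1', $output, $status);
    return array('output' => $output, 'status' => $status);
}

// Example with a harmless command; substitute the real converter call.
$r = run_and_capture('echo hello');
print_r($r);
```

An empty output array together with a return value of 0, as in the PDF log earlier in the thread, means the binary ran but printed nothing.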

sofos
12-06-2004, 01:33 AM
:) Hi, it works now. I'll explain it in case someone else uses Doc2txt on a Windows + EasyPHP configuration:
When using Doc2txt, it is necessary to specify three flags in the options ( " Doc2txt -q -o /admin/temp -E txt filename.doc" ):
"-q" to make it quiet,
"-o /admin/temp" to force the generated text file into the right directory,
"-E txt" to force the right extension. This was the main problem: phpdig expects that a "filename.tmp" (the local copy of the given "filename.doc") will be converted by Doc2txt into a "filename.tmp.txt". But if the -E flag is omitted, Doc2txt generates "filename.txt" instead of "filename.tmp.txt".
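
Put together, the settings described above might look like the sketch below. The three constant names and the $command line come from this thread; the location of the Doc2txt binary is a placeholder, and the temp file name is just the example from the earlier post. The last lines only illustrate which file name is expected, they are not how phpdig itself is written.

```php
<?php
// Config sketch based on this thread. The binary path is a placeholder.
define('PHPDIG_PARSE_MSWORD', 'C:\\Web\\cgi-bin\\doc2txt.exe'); // hypothetical location
define('PHPDIG_OPTION_MSWORD', '-q -o /admin/temp -E txt');     // the three flags above
define('PHPDIG_MSWORD_EXTENSION', '.txt');                      // note the leading period

// Example temp file name taken from the earlier post.
$tempfile2 = '../admin/temp/98183891.tmp';

// Command line as built in the line Charter quoted.
$command = PHPDIG_PARSE_MSWORD . ' ' . PHPDIG_OPTION_MSWORD . ' ' . $tempfile2;

// File the indexer is then expected to look for (per the explanation above).
$expected = $tempfile2 . PHPDIG_MSWORD_EXTENSION;

echo $command, "\n";
echo $expected, "\n";
```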

So it's working and I really appreciate using phpdig !
Now I'll try to set up cron jobs to schedule the indexing process: I hope Windows will let me do that....

Thanks for your help,