PhpDig.net

Kylord 04-01-2004 10:58 PM

no indexing with catdoc and xls2csv
 
Hello,

Well, I have a problem with catdoc and xls2csv (on a Linux system).
I have set the paths correctly and changed the variables in config.php like this:

define('PHPDIG_INDEX_MSWORD',true);
define('PHPDIG_PARSE_MSWORD','/usr/local/httpd/cgi-bin/catdoc');
define('PHPDIG_OPTION_MSWORD','-s 8859-1');

define('PHPDIG_INDEX_MSEXCEL',true);
define('PHPDIG_PARSE_MSEXCEL','/usr/local/httpd/cgi-bin/xls2csv');
define('PHPDIG_OPTION_MSEXCEL','');

But PhpDig doesn't index these files.
It's strange, because with PDF files (using pdftotext) I have no problem.

I wonder if it is because catdoc and xls2csv write their output to STDOUT,
whereas pdftotext writes to a .txt file.

The paths to catdoc and xls2csv are symbolic links, but I don't think that's the problem. By the way, when I run catdoc from the command line, it works normally.

What can I do to solve this problem?
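
A command that works from a login shell can still fail when PHP runs it, because the web server usually runs as a different user with different permissions. A minimal sketch for checking how PHP itself sees the two symlinked binaries is below. The helper name describe_binary is hypothetical and not part of PhpDig; the paths are the ones from the config.php lines quoted above. To be meaningful it would need to run under the same user as the spider (for example through the web server), which is an assumption about the setup.

```php
<?php
// Hypothetical helper, not part of PhpDig: reports how PHP sees a binary path.
function describe_binary($path)
{
    clearstatcache();
    // realpath() follows symbolic links; it returns false when the path
    // or the target of a link does not exist.
    $real = realpath($path);
    return array(
        'path'       => $path,
        'is_link'    => is_link($path),
        'real_path'  => $real,
        'exists'     => $real !== false,
        'executable' => $real !== false && is_executable($real),
    );
}

// Paths copied from the PHPDIG_PARSE_MSWORD / PHPDIG_PARSE_MSEXCEL defines.
$binaries = array(
    '/usr/local/httpd/cgi-bin/catdoc',
    '/usr/local/httpd/cgi-bin/xls2csv',
);
foreach ($binaries as $bin) {
    print_r(describe_binary($bin));
}
```

If 'exists' or 'executable' comes back false for either entry, the symbolic link or its target is not usable by the user PHP runs as, even though the same path works from the command line.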

Kylord 04-05-2004 01:23 AM

Well, I've added some lines to robot_functions.php to find the problem. Here they are:

echo $command . "<br>"; // try running this from shell in admin dir
print_r($result); // holds the output sent to STDOUT
echo "<br>" . $retval; // is zero if command succeeded

And when the spider comes across a PDF, DOC, or XLS file, I see this:
/usr/opt/www/juju/catdoc/bin/catdoc -s 8859-1 ../admin/temp/53771962.tmp
Array ( )

So it appears that the command returns nothing, even for PDF files (I had believed that it worked for PDF files because of the green quote, but actually it seems it doesn't).
It's very strange. Maybe the temp files aren't created?
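
An empty Array ( ) only shows that nothing arrived on STDOUT; any error message the program wrote to STDERR is not captured by exec() unless it is redirected. A rough sketch of a fuller check is below: it tests whether the temp file exists and then runs the command with STDERR folded into the captured output. The helper name run_and_report is hypothetical and not part of PhpDig; the command and temp file name are taken from the spider output quoted above, and the relative path assumes the script runs from the same directory as the spider.

```php
<?php
// Hypothetical helper, not part of PhpDig: runs a command on one input file
// and keeps everything needed to see why it produced no output.
function run_and_report($command, $input_file)
{
    clearstatcache();
    // Check the temp file first, since a missing file is one suspect.
    $exists = file_exists($input_file);
    $size   = $exists ? filesize($input_file) : 0;

    $output = array();
    $retval = -1;
    // "2>&1" sends STDERR to STDOUT so exec() captures error messages too.
    exec($command . ' ' . escapeshellarg($input_file) . ' 2>&1', $output, $retval);

    return array(
        'input_exists' => $exists,
        'input_size'   => $size,
        'output'       => $output,  // one array entry per line printed
        'retval'       => $retval,  // zero if the command succeeded
    );
}

// Command and temp file name as shown in the spider output.
print_r(run_and_report(
    '/usr/opt/www/juju/catdoc/bin/catdoc -s 8859-1',
    '../admin/temp/53771962.tmp'
));
```

A non-zero 'retval' together with a message in 'output' would point at the external program or its permissions, while 'input_exists' being false would point at the temp file never being written.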

Charter 04-09-2004 07:19 AM

Hi. What version of PHP? Perhaps try this thread.

