PhpDig.net (http://www.phpdig.net/forum/index.php)

 Charter 04-09-2004 09:24 AM

External Binaries Problem Checklist

This checklist covers most external binary issues in PhpDig version 1.6.4+ but is not meant to be exhaustive. If you are experiencing a problem with an external binary, read through this checklist first.
• If receiving a "call to undefined function: is_executable" error or using PHP < 5.0.0 on a Win system, set define('USE_IS_EXECUTABLE_COMMAND','0'); in the config file.
• Check that the directories leading to the external binary, and the external binary itself, are set to 755 permissions if applicable.
• Check that the following directories are set to 777 permissions if applicable:
- [PHPDIG_DIR]/text_content
- [PHPDIG_DIR]/includes (can be set to 755 after connect.php is configured)
• If using for example pdftotext, make sure define('PHPDIG_PDF_EXTENSION','.txt'); includes the period in the .txt extension.
• If using for example pstotext, make sure Ghostscript is installed correctly, version 3.33+ for PS files or version 3.51+ for PDF files.
• Set the correct path, for example define('PHPDIG_PARSE_PDF','/path/to/pdftotext'); on *nix or define('PHPDIG_PARSE_PDF','C:\\path\\to\\pdftotext'); on Win (may need .exe extension on Win).
• If not sure of the path, run the external binary from command line first and try that path.
• Use a path that does not include spaces, periods, or other 'special' characters.
• Check to make sure that safe_mode is set to off and allow_url_fopen is set to on.
• If an open_basedir restriction is in place, make sure the files are located in a directory that the restriction allows.
• If indexing from command line, make sure register_argc_argv is on or check this thread.
• If not sure about safe_mode, allow_url_fopen, open_basedir, or register_argc_argv, check your phpinfo page.
• Set define('LIMIT_DAYS',0); to allow for immediate reindex or check this thread.
• Contact the authors of the external binaries if you have trouble compiling and/or installing those programs.
• Still having problems...

Try the code below, modifying it for other binaries if necessary, reindex, and post the results in your own thread:

First try the following and then reindex.

In robot_functions.php, find the appropriate $command variable:

PHP Code:

// it can have _PDF or _MSWORD or _MSEXCEL depending on binary
$command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2;

And change to the following to see if the issue is displayed upon reindex:

PHP Code:

// it can have _PDF or _MSWORD or _MSEXCEL depending on binary
$command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2.' 2>&1';

If that didn't help, then try the following and reindex.

In spider.php, add the following echo statements:

PHP Code:

// sets $tempfile and $tempfilesize
/*****/
echo "<br><br>Is result test http an array: " . is_array($result_test_http) . "<br>";
echo "What is result test http status: " . $result_test_http['status'] . "<br>";
/*****/
extract(phpdigTempFile($url_indexing,$result_test_http,$relative_script_path.'/admin/temp/'));

In robot_functions.php, add the following echo statements:

PHP Code:

function phpdigTempFile($uri,$result_test,$prefix='temp/',$suffix1='1.tmp',$suffix2='2.tmp') {
/*****/
echo "<br>Is result test an array: " . is_array($result_test) . "<br>";
echo "What is result test status: " . $result_test['status'] . "<br>";
echo "Use is executable is set to: " . USE_IS_EXECUTABLE_COMMAND . "<br>";
// in the next four lines change _PDF to either _MSWORD or _MSEXCEL for those binaries
echo "Index the pdf is set to: " . PHPDIG_INDEX_PDF . "<br>";
echo "Parse the pdf is set to: " . PHPDIG_PARSE_PDF . "<br>";
echo "Does parse pdf exist: " . file_exists(PHPDIG_PARSE_PDF) . "<br>";
echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
/*****/
//$temp_filename = md5(time()+getmypid()).$suffix;

Also in robot_functions.php, add the following echo/print statements:

PHP Code:

exec($command,$result,$retval);
/*****/
echo "<br>Command is: " . $command . "<br>";
echo "Result contains: ";
print_r($result);
echo "<br>Return value is: " . $retval . "<br><br>";
/*****/
unlink($tempfile2);
Remember to remove any "word" wrapping in the above code.
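
If you would rather check the PHP settings and the binary path from the checklist before editing any PhpDig files, something like the standalone script below may help. It is only a rough sketch and not part of PhpDig: the function names checkIniSettings and checkBinary are made up for this example, and /path/to/pdftotext is a placeholder that should be replaced with whatever you set in PHPDIG_PARSE_PDF.

```php
<?php
// Hypothetical standalone check script; NOT part of PhpDig.

// Collect the php.ini settings named in the checklist.
function checkIniSettings() {
    $names = array('safe_mode', 'allow_url_fopen', 'open_basedir', 'register_argc_argv');
    $report = array();
    foreach ($names as $name) {
        $value = ini_get($name);
        // ini_get() returns false when the setting does not exist
        // in the running PHP version (safe_mode was removed in PHP 5.4).
        $report[$name] = ($value === false) ? '(not available)' : (string) $value;
    }
    return $report;
}

// Check that the binary exists and, where PHP supports it, is executable.
function checkBinary($path) {
    $report = array('exists' => file_exists($path));
    // is_executable() may be undefined on old Windows builds of PHP,
    // so fall back to null instead of triggering a fatal error.
    $report['executable'] = function_exists('is_executable') ? is_executable($path) : null;
    return $report;
}

// Assumption: placeholder path, replace with your own.
$binary = '/path/to/pdftotext';

foreach (checkIniSettings() as $name => $value) {
    echo $name . ' = ' . ($value === '' ? '(empty)' : $value) . "\n";
}

$result = checkBinary($binary);
echo 'binary exists: ' . var_export($result['exists'], true) . "\n";
echo 'binary executable: ' . var_export($result['executable'], true) . "\n";
```

Run it both from the command line and through the web server if you index both ways, since the two can use different php.ini files. The script prints with var_export() so that a false value shows up as the word false rather than as an empty string.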
