PhpDig.net

Charter 04-09-2004 09:24 AM

README before posting
 
External Binaries Problem Checklist

This checklist covers most issues related to external binaries in PhpDig version 1.6.4+, but it is not meant to be exhaustive. If you are experiencing an external binaries problem, read through this checklist first.
  • If receiving a "call to undefined function: is_executable" error or using PHP < 5.0.0 on a Win system, set define('USE_IS_EXECUTABLE_COMMAND','0'); in the config file.
  • Check that the directories to the external binary and the external binary itself are set to 755 permissions if applicable.
  • Check that the following directories are set to 777 permissions if applicable:
    - [PHPDIG_DIR]/text_content
    - [PHPDIG_DIR]/includes (can be set to 755 after connect.php is configured)
    - [PHPDIG_DIR]/admin/temp
  • If using PHP version 4.2.2/3, check this thread or upgrade your PHP.
  • If using, for example, pdftotext, make sure define('PHPDIG_PDF_EXTENSION','.txt'); includes the period in the .txt extension.
  • If using, for example, pstotext, make sure Ghostscript is installed correctly: version 3.33+ for PS files or version 3.51+ for PDF files.
  • Set the correct path, for example define('PHPDIG_PARSE_PDF','/path/to/pdftotext'); on *nix or define('PHPDIG_PARSE_PDF','C:\\path\\to\\pdftotext'); on Win (may need .exe extension on Win).
  • If not sure of the path, run the external binary from command line first and try that path.
  • Use a path that does not include spaces, periods, or other 'special' characters.
  • Check to make sure that safe_mode is set to off and allow_url_fopen is set to on.
  • If an open_basedir restriction is in place, make sure the files are placed in a directory that the restriction allows.
  • If indexing from command line, make sure register_argc_argv is on or check this thread.
  • If not sure about safe_mode, allow_url_fopen, open_basedir, or register_argc_argv, check your phpinfo page.
  • Set define('LIMIT_DAYS',0); to allow for immediate reindex or check this thread.
  • Contact the authors of the external binaries if you have trouble compiling and/or installing those programs.
  • Still having problems...

    Try the below code, modifying the code for other binaries if necessary, do another index, and post the results in your own thread:

    First try the following and then reindex.

    In robot_functions.php, find the appropriate $command variable:
    PHP Code:

    // it can have _PDF or _MSWORD or _MSEXCEL depending on binary
    $command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2;

    And change to the following to see if the issue is displayed upon reindex:
    PHP Code:

    // it can have _PDF or _MSWORD or _MSEXCEL depending on binary
    $command = PHPDIG_PARSE_PDF.' '.PHPDIG_OPTION_PDF.' '.$tempfile2.' 2>&1';

    If that didn't help, then try the following and reindex.

    In spider.php, add the following echo statements:
    PHP Code:

    // sets $tempfile and $tempfilesize

    /*****/
    echo "<br><br>Is result test http an array: " . is_array($result_test_http) . "<br>";
    echo "What is result test http status: " . $result_test_http['status'] . "<br>";
    /*****/

    extract(phpdigTempFile($url_indexing,$result_test_http,$relative_script_path.'/admin/temp/')); 

    In robot_functions.php, add the following echo statements:
    PHP Code:

    function phpdigTempFile($uri,$result_test,$prefix='temp/',$suffix1='1.tmp',$suffix2='2.tmp') {

    /*****/
    echo "<br>Is result test an array: " . is_array($result_test) . "<br>";
    echo "What is result test status: " . $result_test['status'] . "<br>";
    echo "Use is executable is set to: " . USE_IS_EXECUTABLE_COMMAND . "<br>";
    // in the next four lines change _PDF to either _MSWORD or _MSEXCEL for those binaries
    echo "Index the pdf is set to: " . PHPDIG_INDEX_PDF . "<br>";
    echo "Parse the pdf is set to: " . PHPDIG_PARSE_PDF . "<br>";
    echo "Does parse pdf exist: " . file_exists(PHPDIG_PARSE_PDF) . "<br>";
    echo "Is parse pdf executable: " . is_executable(PHPDIG_PARSE_PDF) . "<br>";
    /*****/

    // $temp_filename = md5(time()+getmypid()).$suffix; 

    Also in robot_functions.php, add the following echo/print statements:
    PHP Code:

    exec($command,$result,$retval);

    /*****/
    echo "<br>Command is: " . $command . "<br>";
    echo "Result contains: ";
    print_r($result);
    echo "<br>Return value is: " . $retval . "<br><br>";
    /*****/

    unlink($tempfile2); 

    Remember to remove any "word" wrapping in the above code.
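
The permission, php.ini, and path items in the checklist can also be checked in one pass before editing spider.php or robot_functions.php. The following is only a minimal sketch of a standalone script and is not part of PhpDig: the function name phpdigPreflight and the two placeholder paths are assumptions made for illustration. It relies only on standard PHP functions (is_writable, ini_get, file_exists, is_executable).

```php
<?php
// Hypothetical standalone pre-flight check; NOT part of PhpDig.
// Both paths below are placeholders (assumptions): adjust them to your install.
$phpdigDir = '/path/to/phpdig';
$binary    = '/path/to/pdftotext';

// Returns one report line per check so the output can be pasted into a thread.
function phpdigPreflight($phpdigDir, $binary) {
    $lines = array();

    // Directories the checklist says must be writable by the web server.
    foreach (array('text_content', 'includes', 'admin/temp') as $sub) {
        $dir = $phpdigDir . '/' . $sub;
        $lines[] = $dir . ' writable: ' . (is_writable($dir) ? 'yes' : 'no');
    }

    // php.ini settings named in the checklist. ini_get() returns false
    // for a directive that the running PHP version does not know.
    foreach (array('safe_mode', 'allow_url_fopen', 'open_basedir', 'register_argc_argv') as $key) {
        $lines[] = $key . ' = ' . var_export(ini_get($key), true);
    }

    // The external binary itself.
    $lines[] = $binary . ' exists: ' . (file_exists($binary) ? 'yes' : 'no');
    if (function_exists('is_executable')) {
        $lines[] = $binary . ' executable: ' . (is_executable($binary) ? 'yes' : 'no');
    }

    return $lines;
}

echo implode("\n", phpdigPreflight($phpdigDir, $binary)) . "\n";
```

Run it from the same environment that does the indexing (browser or command line), since the two can load different php.ini settings. Unlike a bare echo of a boolean, which prints nothing for false, each line here spells out yes or no.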

