PhpDig.net

Old 09-13-2003, 01:35 PM   #1
Charter
Head Mole
 
 
Join Date: May 2003
Posts: 2,539
1.6.2 fix to crawl binary files

This is a 1.6.2 temporary fix to crawl binary files. This fix is not included in the 1.6.2 download but will be improved upon and included in the next release.

First, make a backup of the robot_functions.php file. Then, in robot_functions.php, go to the function phpdigTempFile and find the following:
PHP Code:
return array('tempfile'=>$tempfile,'tempfilesize'=>$tempfilesize); 
and replace with the following:
PHP Code:
    switch ($result_test['status']) {
        case 'MSWORD':
            $my_new_tempfile = $tempfile;
            break;

        //case 'MSEXCEL':
        //    $my_new_tempfile = "<fill in>";
        //    break;

        case 'PDF':
            // suffix of the file the PDF converter wrote; check the temp directory
            $my_new_tempfile = $tempfile."2.txt";
            break;

        default:
            $my_new_tempfile = $tempfile;
    }

    return array('tempfile'=>$my_new_tempfile,'tempfilesize'=>$tempfilesize);
It seems that $my_new_tempfile can differ depending on the external binary's defaults, which is something to work on for the next release. In the meantime, after crawling a binary file, go to the temp directory and look at the extension of the file that was written, modifying the above as necessary.
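
Instead of browsing the temp directory by hand, one option is to list every file whose name starts with the temp file's name. This is only a rough sketch: the function phpdig_list_converter_outputs is hypothetical and not part of PhpDig, and it assumes the external binary writes its output next to the temp file.

```php
<?php
// Hypothetical helper, NOT part of PhpDig.
// Lists files in the temp file's directory whose names begin with the temp
// file's name, e.g. "/tmp/abc" would match "/tmp/abc2.txt".
// Assumption: the external binary writes its output beside the temp file.
function phpdig_list_converter_outputs($tempfile)
{
    $outputs = array();
    $dir  = dirname($tempfile);
    $base = basename($tempfile);

    $entries = scandir($dir);
    if ($entries === false) {
        return $outputs; // directory could not be read
    }

    foreach ($entries as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        // Skip the temp file itself; keep regular files sharing its prefix.
        if ($entry !== $base && strpos($entry, $base) === 0 && is_file($path)) {
            $outputs[] = $path;
        }
    }

    sort($outputs);
    return $outputs;
}
```

Whatever follows the temp file's name in a returned path (for example "2.txt") is the suffix to put in the matching case of the switch statement.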
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 09-16-2003, 06:52 PM   #2
Charter
Here's an example of what's going on with external binaries.
  1. catdoc spits output to stdout so $result contains output
  2. pdftotext spits output to filename.txt so $result is empty
This means that if the external binary that you are using outputs to stdout, then there is no need to add the switch statement given in the previous post, as $result contains the necessary info for indexing the document.

However, if the external binary does not output to stdout but rather outputs to a file, and the document is not indexed, then check the file extension in the temp directory, modifying the switch statement as necessary.
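
The two cases above can be sketched as a single fallback: use stdout when it has content, otherwise read the file the binary wrote. This is an illustration, not PhpDig's code. The function phpdig_converter_text is hypothetical, and it assumes $result is an array of output lines (as exec() fills its second argument) and that the output file path is already known.

```php
<?php
// Hypothetical helper, NOT part of PhpDig.
// $result:  lines captured from the binary's stdout (assumed to be an array,
//           as exec() would fill its second argument)
// $outfile: path where a file-writing binary is expected to leave its text
function phpdig_converter_text($result, $outfile)
{
    // Case 1: catdoc-style binary, text arrived on stdout.
    $stdout = trim(implode("\n", $result));
    if ($stdout !== '') {
        return $stdout;
    }

    // Case 2: pdftotext-style binary, stdout is empty and text is in a file.
    if (is_file($outfile) && is_readable($outfile)) {
        $text = file_get_contents($outfile);
        return $text === false ? '' : $text;
    }

    // Nothing to index.
    return '';
}
```

An empty string back from this sketch corresponds to the "document is not indexed" situation, which is the cue to check the file extension in the temp directory.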

EDIT: external binary process modified in version 1.6.4.