PhpDig.net

10-23-2003, 05:02 AM   #1
Tanasja
Green Mole
 
Join Date: Oct 2003
Location: Amsterdam
Posts: 9
catdoc

Hi,

I know that you don't give support on catdoc, but...
I am having trouble getting it installed and am looking for help.

I searched all the relevant sites with Google, also
http://www.45.free.net/~vitus/ice/catdoc/
but they give only brief information.

Where can I find a good manual or support?

Thanx,
Tanasja

10-24-2003, 04:17 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Just download the package that contains the executable, FTP the executable over to your site in binary mode, and then set define('PHPDIG_PARSE_MSWORD','/full/path/to/catdoc'); in the config file.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.

10-25-2003, 06:20 AM   #3
Tanasja
Green Mole
 
Join Date: Oct 2003
Location: Amsterdam
Posts: 9
Hi Charter,

Can I put catdoc in any directory?
I ask because I have no access to the suggested /usr/local/bin directory.

And what does "in binary mode" mean?

Thnx. Tanasja

10-25-2003, 06:40 AM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. FTP can allow files to be transferred in ASCII mode (e.g., for text files like HTML files) or BINARY mode (e.g., for graphic files like JPG files). Just FTP the executable catdoc like you would a graphics file.
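
To illustrate why the transfer mode matters, here is a minimal PHP sketch. It assumes, for illustration only, that an ASCII-mode transfer rewrites every LF byte as CR LF; the function names and the sample bytes are made up for this example and are not part of PhpDig or catdoc.

```php
<?php
// Simulate what an ASCII-mode transfer may do: rewrite line endings.
// Assumption: each LF byte becomes CR LF, as can happen when "text" is
// sent from a Unix-style system to a Windows-style one.
function simulateAsciiTransfer(string $bytes): string
{
    // Normalise first so existing CR LF pairs are not doubled.
    $normalised = str_replace("\r\n", "\n", $bytes);
    return str_replace("\n", "\r\n", $normalised);
}

// A binary-mode transfer copies the bytes untouched.
function simulateBinaryTransfer(string $bytes): string
{
    return $bytes;
}

// Hypothetical stand-in for a few bytes of an executable; 0x0A is LF.
$sample = "\x7F\x45\x4C\x46\x0A\x00\x01\x0A";

$ascii  = simulateAsciiTransfer($sample);
$binary = simulateBinaryTransfer($sample);

echo 'original: ', strlen($sample), " bytes\n";
echo 'binary:   ', strlen($binary), ' bytes, identical: ', var_export($binary === $sample, true), "\n";
echo 'ascii:    ', strlen($ascii), ' bytes, identical: ', var_export($ascii === $sample, true), "\n";
```

In this sketch the binary copy stays byte-for-byte identical, while the ASCII copy grows by one byte for every LF it contained. An executable altered like that may no longer run.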

Assuming that your host allows the execution of catdoc, you can put catdoc in any of your directories and call it from there using define('PHPDIG_PARSE_MSWORD','/full/path/to/catdoc'); in the config file.
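
Before putting a path into define('PHPDIG_PARSE_MSWORD', ...), it may help to check what PHP itself thinks of that path. The following is only a sketch: describeBinaryPath is a hypothetical helper, not a PhpDig function, and the path is the placeholder from this post.

```php
<?php
// Hypothetical helper, not part of PhpDig: reports why a configured
// binary path would or would not pass file_exists() / is_executable().
function describeBinaryPath(string $path): string
{
    if (!file_exists($path)) {
        return 'missing';
    }
    if (!is_file($path)) {
        return 'not a file';
    }
    if (!is_executable($path)) {
        return 'not executable';
    }
    return 'ok';
}

// Placeholder path; replace it with the real location of catdoc.
$catdocPath = '/full/path/to/catdoc';
echo $catdocPath, ': ', describeBinaryPath($catdocPath), "\n";
```

Run from the same PHP setup that runs the spider, a result other than ok would suggest fixing the path or the file permissions before looking any further.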

10-28-2003, 01:58 AM   #5
Tanasja
Green Mole
 
Join Date: Oct 2003
Location: Amsterdam
Posts: 9
Hi Charter,

PhpDig works OK, and I did as you told me, but catdoc is still not working.

This is what happens:
- When I re-index, no errors are given, only the message: no link in temporary table.
- When I change PHPDIG_INDEX_MSWORD from false to true, the spider also shows the doc-files. (conclusion: the spider finds and recognizes doc-files)
- When I change the catdoc directory name to a non-existing one, no error is given. (conclusion: the spider does not call catdoc, so something goes wrong here)

I tried it on my external host and locally.

What can it be?

greetx, T

10-28-2003, 07:07 PM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Are there any files in the temp directory? If so, what's the extension?

10-29-2003, 05:13 AM   #7
Tanasja
Green Mole
 
Join Date: Oct 2003
Location: Amsterdam
Posts: 9
Hi Charter,

Yes, I can see temp files with the extension .tmp2 being created and unlinked in admin/temp.

Here is more information:

I run PhpDig locally on my PC.

In config.php I changed
define('PHPDIG_PARSE_MSWORD','c://apache/htdocs/catdoc/catdoc');
into
define('PHPDIG_PARSE_MSWORD','c://apache/htdocs/catdoc/catdoc/catdoc.exe');
This was necessary to make these conditions in the function phpdigTempFile in robot_functions.php true:
&& file_exists(PHPDIG_PARSE_MSWORD)
&& is_executable(PHPDIG_PARSE_MSWORD)
... but I am not sure whether that causes problems elsewhere...

I also changed
return array('tempfile'=>$tempfile,'tempfilesize'=>$tempfilesize);
as suggested in the 1.6.2 fix to crawl binary files.

At the end of the function phpdigTempFile there is this code:
rename($tempfile,$tempfile.'2');
exec($command,$result,$retval);
unlink($tempfile.'2');
if (!$retval)
I can see that rename and unlink work OK.
In admin/temp, files are created and unlinked like:
9883dacfac81cb7b7830b3d1b09ea72c.tmp2
These files contain the words from the doc-file.
So catdoc seems to work fine.
But the condition in if (!$retval) is false, so exec() does not seem to work.

Thanx again so far,
T
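
The exec() behaviour described in this post can be looked at in isolation. Below is a minimal sketch, assuming nothing about the command PhpDig actually builds: runCommand is a hypothetical wrapper and "echo hello" is only a stand-in command. In PHP's exec(), the third argument receives the exit status of the command, and by convention 0 means success, so a false if (!$retval) means the status was non-zero.

```php
<?php
// Hypothetical wrapper around PHP's exec(): collects the output lines and
// the exit status so that a failing command can be inspected.
function runCommand(string $command): array
{
    $output = [];
    $status = -1;
    exec($command, $output, $status);
    return [
        'output'  => $output,
        'status'  => $status,
        // By convention an exit status of 0 means success, which is why
        // a check such as if (!$retval) treats 0 as "it worked".
        'success' => ($status === 0),
    ];
}

// "echo hello" is only a stand-in command for the demonstration.
$run = runCommand('echo hello');
echo 'status: ', $run['status'], "\n";
echo 'output: ', implode("\n", $run['output']), "\n";
```

A non-zero status does not necessarily mean that exec() itself failed; it can also mean that the shell or the called program reported an error. Printing the status and the output for the real command is one way to narrow that down.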

11-07-2003, 01:55 PM   #8
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. The catdoc.exe binary should create files with txt extensions. The tmp2 extensions are from the rename command.

From your post came the below two items:
  1. define('PHPDIG_PARSE_MSWORD','c://apache/htdocs/catdoc/catdoc');
  2. define('PHPDIG_PARSE_MSWORD','c://apache/htdocs/catdoc/catdoc/catdoc.exe');
The first item says that catdoc.exe is located in the c://apache/htdocs/catdoc/ directory whereas the second item says that catdoc.exe is in the c://apache/htdocs/catdoc/catdoc/ directory. The path to catdoc.exe should be like this: c://apache/htdocs/catdoc/catdoc/catdoc (assuming the first two catdocs are directories and the last catdoc is for the catdoc.exe file).
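
Which spelling of the path PHP can actually see is something that can be tested rather than guessed. The sketch below assumes only that is_file() checks the literal file name it is given; firstExistingFile is a hypothetical helper, not a PhpDig function, and the base path is the one quoted in this thread.

```php
<?php
// Hypothetical helper, not part of PhpDig: returns the first candidate
// that exists as a regular file, or null when none does.
function firstExistingFile(array $candidates): ?string
{
    foreach ($candidates as $candidate) {
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Base path taken from the posts above; both spellings are tried.
$base  = 'c://apache/htdocs/catdoc/catdoc/catdoc';
$found = firstExistingFile([$base, $base . '.exe']);
echo $found === null ? "no catdoc binary found\n" : "using $found\n";
```

Whichever spelling is reported would then be the one to try in define('PHPDIG_PARSE_MSWORD', ...).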

All times are GMT -8.