PhpDig.net

09-07-2006, 06:29 AM   #1
SABsearch2
Awaiting Email
 
Join Date: Jul 2006
Posts: 3
Use Antiword instead of catdoc on Wintel

I've been integrating PhpDig on a Windows 2003 server.

There's a problem with catdoc on this platform:
- The official catdoc distribution supports DOS, not Windows.
- The unofficial Windows port of catdoc is based on an older version of the program.

I don't know about the new version, but here are the problems with the old one:
- Images within a document are not skipped. They are converted into text, so the result is a huge text output and incorrect indexing.
- Performance is bad because the program only writes to standard output and offers no file output. PhpDig runs faster when using file output.
- Combine the two points above and you get a never-ending indexing process (plus your server climbs to 99% CPU and you lose contact with it).

I've tried antiword instead. Images are correctly skipped and performance is 10 times faster.
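
For anyone who wants to try the converter outside PhpDig first, here is a minimal sketch (in Python, not PhpDig's own code) of the two ideas above: send the converter's output straight into a file, and put a time limit on it so a bad document cannot hang the indexing run. The helper names `extract_text` and `antiword_command` are made up for this example, and it assumes an `antiword` executable is on the PATH and is called as `antiword file.doc` with the text written to standard output.

```python
import subprocess


def extract_text(command, out_path, timeout_s=60):
    """Run a converter command and write its stdout directly into out_path.

    Returns True if the command exited with status 0, and False if it
    failed or exceeded timeout_s seconds (the child is killed on timeout).
    """
    with open(out_path, "wb") as out:
        try:
            result = subprocess.run(
                command,
                stdout=out,                 # the OS writes straight to the file
                stderr=subprocess.DEVNULL,  # discard converter warnings
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
    return result.returncode == 0


def antiword_command(doc_path, exe="antiword"):
    """Build the argument list for a plain antiword call (assumed on PATH)."""
    return [exe, str(doc_path)]
```

A call such as `extract_text(antiword_command("report.doc"), "report.txt", timeout_s=30)` would leave the extracted text in `report.txt`, or return False if the conversion fails or takes longer than 30 seconds. The file names here are placeholders. If the executable is missing, `subprocess.run` raises an `OSError`, which this sketch does not catch.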
10-04-2006, 01:24 AM   #2
Dave A
Purple Mole
 
Join Date: Aug 2004
Location: North Island New Zealand
Posts: 170
Thanks for pointing that out. I shall look at using it to see how well it performs.
It is always good to read about other implementations.