Use Antiword instead of catdoc on Wintel


SABsearch2
09-07-2006, 07:29 AM
I've been integrating phpdig on a Windows 2003 server.

There's a problem with catdoc on this platform.
- The official catdoc distribution does not support Windows, only DOS.
- The unofficial Windows build of catdoc is based on an older version of the product.

I don't know about the new version, but here are the problems with the old one:
- Images within a document are not skipped. They are converted into text, so the result is a huge block of text and incorrect indexing.
- Performance is poor because the program writes only to standard output and offers no file output. phpdig runs faster when using file output.
- Combine the two points above and you get a never-ending indexing process (plus your server climbs to 99% CPU and you lose contact with it).

I've tried antiword instead. Images are correctly skipped and it runs 10 times faster.
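
For anyone who wants to try the "file output" approach outside phpdig, here is a rough Python sketch, not phpdig's own code. It runs a converter, sends its standard output straight into a temporary file instead of a pipe, and applies a timeout so a runaway conversion cannot hang the indexer. The function names are made up for illustration, and it assumes antiword is on the PATH and can be called as `antiword <file>`, printing the text to standard output.

```python
import subprocess
import tempfile
from pathlib import Path


def antiword_command(doc_path):
    """Build the command line (assumption: antiword is on the PATH)."""
    return ["antiword", str(doc_path)]


def convert_to_text(command, timeout_seconds=60):
    """Run a converter and return its output, captured through a temp file.

    The converter's stdout is redirected to a file rather than read from a
    pipe, and the timeout stops a conversion that never finishes.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = Path(tmp_dir) / "converted.txt"
        with open(out_path, "wb") as out_file:
            subprocess.run(
                command,
                stdout=out_file,            # file output instead of a pipe
                stderr=subprocess.DEVNULL,  # ignore converter warnings
                timeout=timeout_seconds,    # raises TimeoutExpired if stuck
                check=True,                 # raises if the converter fails
            )
        # Decode leniently: the output encoding depends on the converter.
        return out_path.read_bytes().decode("utf-8", errors="replace")
```

Usage would look like `text = convert_to_text(antiword_command("report.doc"))`, where `report.doc` is a placeholder file name. A `subprocess.TimeoutExpired` exception signals a document worth skipping rather than retrying.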

Dave A
10-04-2006, 02:24 AM
Thanks for pointing that out. I shall look at using it to see how well it performs.
It is always good to read about other implementations.