SABsearch2
09-07-2006, 06:29 AM
I've been integrating phpdig on a Windows 2003 server.
There's a problem with catdoc on this platform.
- The official catdoc distribution does not support Windows, only DOS.
- The unofficial Windows build of catdoc is based on an older version of the product.
I don't know about the new version, but with the old one here are the problems:
- Images within documents are not skipped. They are converted into text, so the result is a huge amount of text and incorrect indexing.
- Performance is bad because the program only writes to standard output and does not offer file output. phpdig is faster when using file output.
- Put the two points above together and you get a never-ending indexing process (plus your server climbs to 99% CPU and you lose contact with it).
I've tried antiword instead. Images are correctly skipped and it runs 10 times faster.
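For anyone who wants to test a converter outside phpdig first, here is a minimal sketch (in Python, not phpdig's own code) of running a converter with its standard output sent straight into a file, plus a time limit so a runaway conversion cannot hang the server. The function name `convert_to_text_file`, the 60-second default, and the example paths are my own assumptions; the only thing assumed about antiword is that `antiword file.doc` prints the extracted text to standard output.

```python
import subprocess


def convert_to_text_file(doc_path, txt_path, command=("antiword",), timeout=60):
    """Run a converter on doc_path and write its stdout directly to txt_path.

    Returns True if the converter exited with status 0, False if it failed,
    could not be started, or exceeded the time limit.
    `command` is the converter executable plus any leading arguments
    (hypothetical default: antiword found on PATH).
    """
    try:
        with open(txt_path, "wb") as out:
            result = subprocess.run(
                [*command, str(doc_path)],
                stdout=out,              # stream output to the file, not into memory
                stderr=subprocess.PIPE,  # keep error messages off the console
                timeout=timeout,         # kill the converter if it runs too long
            )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


# Example call (paths are placeholders):
# ok = convert_to_text_file(r"C:\docs\report.doc", r"C:\temp\report.txt")
```

The time limit is the part that matters for the 99% CPU problem: if a document makes the converter run forever, the call gives up and the indexer can move on to the next file.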