Use Antiword instead of catdoc on Wintel
I've been integrating phpdig on a Windows 2003 server.
There's a problem with catdoc on this platform:
- The official provider of catdoc does not support Windows, only DOS.
- The unofficial provider of catdoc for Windows ships an older version of the product.

I don't know about the new version, but with the old one here are the problems:
- Images within a document are not skipped. They are converted into text, so the result is a huge block of text and wrong indexing.
- Performance is bad, since the program only writes to standard output and does not offer file output. phpdig runs faster when using file output.
- Put the two points above together and you get a never-ending indexing process (plus your server climbs to 99% CPU and you lose contact with it).

I've tried antiword instead. The images are correctly skipped and it runs 10 times faster.
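For anyone who wants to guard against the runaway conversion described above, here is a minimal sketch of calling a converter with its output sent to a file and a time limit. It is not phpdig's own code. The function names `antiword_command` and `convert_to_file` are hypothetical, and it assumes `antiword` is on the PATH and is called with just the document path, printing the extracted text to standard output.

```python
import subprocess
import sys


def antiword_command(doc_path, exe="antiword"):
    """Build the argument list (assumption: antiword is called with only the file path)."""
    return [exe, str(doc_path)]


def convert_to_file(command, out_path, timeout=60):
    """Run a converter command and write its standard output to out_path.

    Returns True on success, False if the tool is missing, exits with an
    error, or runs longer than `timeout` seconds (the child is then killed).
    """
    try:
        with open(out_path, "wb") as out:
            result = subprocess.run(
                command,
                stdout=out,                  # send the extracted text straight to the file
                stderr=subprocess.DEVNULL,   # ignore converter warnings
                timeout=timeout,             # stop a conversion that never finishes
            )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python convert.py input.doc output.txt
    ok = convert_to_file(antiword_command(sys.argv[1]), sys.argv[2])
    print("converted" if ok else "conversion failed")
```

The timeout is the main point here: a document that makes the converter hang or produce endless output fails after a fixed delay instead of tying up the CPU.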
Thanks for pointing that out. I shall look at using it to see how well it performs.
It is always good to read about other implementations.