Indexing MS Word docs under Windows
I've had no success trying to index MS Word (.DOC) documents under Windows. I have:
Code:
define('PHPDIG_INDEX_MSWORD',true);
Any help appreciated
Phil
|
Hi. Is USE_IS_EXECUTABLE_COMMAND set to true (one) or false (zero) in the config file?
|
USE_IS_EXECUTABLE_COMMAND is set at the default value of 1. But things have got worse ... :(
I decided to can 1.6.2 and try 1.6.5, so I removed all code and the DB tables and re-installed 1.6.5. The install seemed to go OK, but now I can't get past here:
Code:
Spidering in progress...
Phil
|
|
That seemed to work - thanks!
|
We-ell, we're improving ... now it spiders OK without giving errors, but it still isn't indexing the contents of the .doc files. I tried pointing the spider directly at the URL of a .doc file I knew existed:
Code:
Spidering in progress...
Any more help welcome.
|
Hi. From the command line what does the following produce?
C:\Program Files\EasyPHP1-7\www\k3\catdoc -s 8859-1 change-me-4.doc
|
"Cannot load charset cp1251 - file not found"
|
OK, I've sorted out the charset paths. catdoc now seems to extract text OK from the command line, but still not via the web interface ... :(
|
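When something works from the command line but not via the web interface, one thing that may help is running the same command from a PHP script under the web server and looking at what comes back. A minimal sketch of that idea follows; `run_and_capture` and `describe_helper_path` are made-up names for illustration and are not part of PhpDig.

```php
<?php
// Hypothetical helper, not part of PhpDig: run a command the way a web
// script would and return everything it printed plus its exit code.
function run_and_capture(string $command): array
{
    $lines = [];
    $exitCode = -1;
    // "2>&1" folds error messages (such as a missing charset file)
    // into the captured output instead of losing them.
    exec($command . ' 2>&1', $lines, $exitCode);

    return [
        'exit_code' => $exitCode,
        'output'    => implode("\n", $lines),
    ];
}

// Hypothetical helper: basic facts about the path to an external converter.
function describe_helper_path(string $path): array
{
    return [
        'exists'     => file_exists($path),               // is the path there at all?
        'executable' => is_executable($path),             // does PHP consider it runnable?
        'has_space'  => (bool) preg_match('/\s/', $path), // whitespace anywhere in the path?
    ];
}
```

Calling `print_r(run_and_capture(...))` with the exact command line that works in a console, and `print_r(describe_helper_path(...))` with the path to the converter, shows whether the web server's PHP sees the same program and the same output as the console does.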
OK, all working; it seems that it didn't like the path name having a space in it, as in C:\Program Files\.......
Once I moved catdoc (and its config subdirectories) to a path without a space (C:\ for instance), all was well.
Many thanks for your help, guys. (Though I'm sure I'll be back with more dopey questions :)
BTW, my own requirement is for index searching on just one local directory full of MS Word files. To facilitate this I have a file, index.php, which gives the spider a link to every Word file in the directory:
Code:
<HTML>
All the best
Phil
|
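Only the first line of that index.php survives in this thread, so the following is not Phil's code. It is a minimal sketch of the same idea, assuming the Word files sit in a single directory and all end in .doc; the function name `build_doc_index` is made up for illustration.

```php
<?php
// Hypothetical sketch, not the original index.php: build a plain HTML page
// that links to every .doc file in one directory so a spider can follow them.
function build_doc_index(string $dir): string
{
    $links = [];
    $entries = scandir($dir);   // returns names sorted alphabetically, or false on failure
    if ($entries === false) {
        $entries = [];
    }

    foreach ($entries as $name) {
        $isFile = is_file($dir . DIRECTORY_SEPARATOR . $name);
        $isDoc  = strtolower(pathinfo($name, PATHINFO_EXTENSION)) === 'doc';
        if ($isFile && $isDoc) {
            $href  = rawurlencode($name);      // a space becomes %20 in the link
            $label = htmlspecialchars($name);  // safe to show as link text
            $links[] = '<a href="' . $href . '">' . $label . '</a><br>';
        }
    }

    return "<html>\n<body>\n" . implode("\n", $links) . "\n</body>\n</html>\n";
}
```

An index.php placed in the same directory as the documents could then consist of this function plus `echo build_doc_index(__DIR__);`, and the spider would be pointed at that page's URL. Whether the spider copes with file names that contain spaces is worth checking separately.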