04-07-2004, 06:39 AM   #1
greener_02445
catdoc and xls2csv not indexing

Can anyone help me? I have been trying to get Word documents and Excel files to index. I am using Apache on a Windows XP system. Indexing works for text files only. This is how my config settings look:


define('USE_IS_EXECUTABLE_COMMAND','0'); //use is_executable for external binaries
// if set to true, full path to external binary required
define('PHPDIG_INDEX_MSWORD',true);
define('PHPDIG_PARSE_MSWORD','C:\catdoc\catdoc');
define('PHPDIG_OPTION_MSWORD','-s 8859-1');
define('PHPDIG_INDEX_PDF',true);
define('PHPDIG_PARSE_PDF','/usr/local/bin/pstotext');
define('PHPDIG_OPTION_PDF','-cork');
define('PHPDIG_INDEX_MSEXCEL',true);
define('PHPDIG_PARSE_MSEXCEL','C:\catdoc\xls2csv');
define('PHPDIG_OPTION_MSEXCEL','');

//---------EXTERNAL TOOLS EXTENSIONS
// if external binary is not STDOUT or different extension is needed
// for example, use '.txt' if external binary writes to filename.txt
define('PHPDIG_MSWORD_EXTENSION','');
define('PHPDIG_PDF_EXTENSION','');
define('PHPDIG_MSEXCEL_EXTENSION','');
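
One thing that might be worth checking with settings like these is whether PHP, running under Apache, can see the binaries at the configured paths at all. The following is only a minimal sketch: `check_external_tool` is a hypothetical helper, not part of PhpDig, and it assumes the tools live at the paths from the config above, possibly with an `.exe` suffix on disk. It only tests that the files are visible to PHP; it does not show how PhpDig itself calls them.

```php
<?php
// Hypothetical helper (not part of PhpDig): reports whether PHP can see a
// configured external binary. On Windows the file on disk often carries an
// .exe suffix, so that variant is checked as well.
function check_external_tool($path)
{
    $candidates = array($path, $path . '.exe');
    foreach ($candidates as $candidate) {
        if (is_file($candidate)) {
            return array('found' => true, 'path' => $candidate);
        }
    }
    return array('found' => false, 'path' => null);
}

// Paths copied from the config settings above (assumed install locations).
$tools = array(
    'catdoc'  => 'C:\catdoc\catdoc',
    'xls2csv' => 'C:\catdoc\xls2csv',
);

foreach ($tools as $name => $path) {
    $result = check_external_tool($path);
    $status = $result['found'] ? 'found at ' . $result['path'] : 'not found';
    echo $name . ': ' . $status . "\n";
}
```

Saved as a page under the web root and requested through Apache, this would run with the same user and environment as the indexer, which can differ from what the command prompt sees.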

I have tried the xls2csv and catdoc programs from the MS-DOS prompt and they work fine. When I try to submit a URI with a .doc or .xls file, this is what I get:

SITE : http://localhost/
Exclude paths :
- @NONE@
No link in temporary table

--------------------------------------------------------------------------------

links found : 0
...Was recently indexed
Optimizing tables...
Indexing complete !

Any advice would be much appreciated.

-Rich