no msword to txt parsing
hello
(I have versions 1.8.1 and 1.8.0 on my site.) I made a simple test page containing <a href="http://quito.citipo.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br> and indexed it. A temporary file is created in admin/temp/xxxx.tmp for this .doc, but it seems that phpdig does not parse this file as text. I don't know why. Thanks. |
no msword indexing
hello
I am continuing my tests. I put an echo at line 461 of the spider.php script. The page I am indexing is test.php:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Sans titre</title>
</head>
<body>
<a href="http://quito.citipro.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br>
</body>
</html>
The result is:
SITE : http://quito.citipro.fr/
Exclude paths :
- @NONE@
Resource id #5**../admin/temp/81475511.tmp**245**15******** test.php**HTML**20040709211142**20040709211125**Array**
1:http://quito.citipro.fr/test.php
(time : 00:00:22)
+ level 1...
Resource id #5**0**0**15******modules/documents/rep2/** DocUtil.doc**MSWORD**20040709211152**20040708082318****
2:http://quito.citipro.fr/modules/docu...p2/DocUtil.doc
(time : 00:00:32)
No link in temporary table
There is no temporary file for the MSWORD document. Thanks. |
Hi. There is a checklist here to help with troubleshooting.
|
still catdoc
hello
Thank you for the reply. I went through your checklist and everything on it checks out, but when I index my .doc the response is:
Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/44148632.tmp
Result contains: Array ( )
Return value is: 127
and nothing is recorded in the database. When I run catdoc from the command line on my Linux system, it handles my MSWORD file fine. What is happening? Are there any French users on this forum? |
Hi. In robot_functions.php find:
PHP Code:
PHP Code:
|
hi (23:44 in france)
Here is the response with the code modification:
Command is: /home/mutualiseweb/catdoc-0.93.3 -s 8859-1 ../admin/temp/38346732.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3: is a directory )
Return value is: 126
Strange: when I run /home/mutualiseweb/catdoc -s 8859-1 mymsword.doc on the command line, catdoc runs. But when I replace define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3'); with define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc');, phpdig does not recognize my MSWORD file. |
Hi. Does this work?
PHP Code:
|
LOL, I tried this before your post
No, it doesn't work. |
Hi. What does
PHP Code:
PHP Code:
|
Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/39511712.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3/catdoc: No such file or directory )
Return value is: 127 |
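A note on the return values seen so far: 126 and 127 come from the shell, not from catdoc. By POSIX convention, 127 means the command was not found, and 126 means it was found but could not be executed (here, because the path is a directory). A minimal sketch to reproduce both, assuming a POSIX sh and mktemp; the helper name exit_status_of and the scratch paths are made up for illustration:

```shell
#!/bin/sh
# exit_status_of is a hypothetical helper: run a command line through sh
# and print its exit status, similar to the return value PHP's exec() reports.
exit_status_of() {
    status=0
    sh -c "$1" >/dev/null 2>&1 || status=$?
    echo "$status"
}

# Scratch directory standing in for the catdoc-0.93.3 source directory.
scratch=$(mktemp -d)

# The path is a directory: found, but it cannot be executed as a command.
echo "directory as command: $(exit_status_of "$scratch")"

# The path does not exist: the shell reports command not found.
echo "missing file as command: $(exit_status_of "$scratch/catdoc")"

rm -rf "$scratch"
```

So an empty result with 127 usually points at the configured path rather than at the .doc file itself.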
Hi. What does
PHP Code:
PHP Code:
|
OK !!
It is all my fault: my catdoc is under /home/mutualiseweb/catdoc-0.93.3/src/. MY GOD. A little question about .pdf files: is it necessary to install GHOST? :))) Sorry. |
THANKS A LOT
|
LOL, paths and permissions. ;)
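A quick way to catch this before editing the config, as a rough sketch: classify whatever path you put in PHPDIG_PARSE_MSWORD as missing, a directory, not executable, or ok. The function name check_parser_path is made up for illustration, and the default path below is only a guess at where the binary sits, based on the src/ directory mentioned above.

```shell
#!/bin/sh
# check_parser_path is a hypothetical helper: it prints one word describing
# whether the given path could be run as an external parser command.
check_parser_path() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing"            # the shell would return 127
    elif [ -d "$path" ]; then
        echo "directory"          # the shell would return 126
    elif [ ! -x "$path" ]; then
        echo "not executable"     # the shell would return 126
    else
        echo "ok"
    fi
}

# Assumed location of the binary; pass your own path as the first argument.
check_parser_path "${1:-/home/mutualiseweb/catdoc-0.93.3/src/catdoc}"
```

Run it as the same user the web server runs as, since permissions there can differ from your login shell.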
For PDFs perhaps try getting pdftotext already compiled. Directions are in this post. |