catdoc problem with WinXP
Hi all
I am using phpdig 1.8.4 on WinXP (Windows NT SERVER 5.1 build 2600) with EasyPHP 1.7 (PHP Version 4.3.3). I am trying to index .doc files (to start with) with the spider, but so far no luck.

When I run catdoc on the command line, I get this:

---
catdoc ./test.doc
Banane Fruit Abricot
---

Those are the words in my doc file, so I guess catdoc.exe is working. But when I try to index the file using phpdig, here is what I get:

---
SITE : http://server/
Chemins exclus :
- @NONE@
1:http://server/moteur/catdoc/test.doc (temps : 00:00:07)
Pas de liens dans la table temporaire
liens trouvés : 1
http://server/moteur/catdoc/test.doc
Optimizing tables...
Indexation terminée !
---

It looks like it is not indexing that file. Here is my config file:
PHP Code:
PHP INFO:
Safe_mode OFF
allow_url_fopen ON
---
robot_functions.php:
PHP Code:
Thanks for your help... |
Post the info that gets printed from this thread.
|
Thanks, Charter, for replying.
I have already tried all the code changes, and here is what I get now when trying to index a PDF file:

---
SITE : http://10.1.0.181/
Chemins exclus :
- @NONE@
Is result test http an array: 1
What is result test http status: PDF
Is result test an array: 1
What is result test status: PDF
Use is executable is set to: 0
Index the pdf is set to: 1
Parse the pdf is set to: d:\serveur\www\moteur\xpdf\pdftotext.exe
Does parse pdf exist: 1
---

And it stops there; nothing happens after that line. But when I try it on the command line, it works and I get the txt file. |
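When a tool works on the command line but the indexer stalls right after confirming that the executable exists, it can help to run the same command from a script and look at its exit code and error stream. The following is only a minimal Python sketch of that check, not phpdig code: `run_and_report` is a hypothetical helper, the stand-in command merely simulates a failing tool, and the pdftotext arguments in the last comment are an assumption.

```python
import subprocess
import sys


def run_and_report(command, timeout_seconds=30):
    """Run `command` and return (exit_code, stdout_text, stderr_text).

    A timeout is used so that a hanging tool does not block the caller forever;
    subprocess.TimeoutExpired is raised if the tool runs longer than that.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout_seconds,
    )
    return (
        completed.returncode,
        completed.stdout.decode(errors="replace"),
        completed.stderr.decode(errors="replace"),
    )


if __name__ == "__main__":
    # Stand-in for an external converter: writes to stderr and exits with status 3.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('boom'); sys.exit(3)"]
    code, out, err = run_and_report(failing)
    print("exit code:", code)
    print("stderr:", err)
    # With Xpdf installed, the call might look like this (assumed arguments):
    # run_and_report([r"d:\serveur\www\moteur\xpdf\pdftotext.exe", "in.pdf", "out.txt"])
```

A non-zero exit code or any text on stderr usually points to a path or permission problem that the web server process has and the interactive command line does not.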
Maybe one of the following links might help?
http://www.phpdig.net/forum/showthread.php?t=1407
http://www.phpdig.net/forum/showthread.php?t=534 |
Hi again
I have been to http://www.phpdig.net/forum/showthread.php?t=1407 and made the same change, and still no luck.

@Charter: is there any way for me to contact mleray via the forum? She has exactly the same config as mine (EasyPHP 1.7, WinXP), and it looks like she found the solution. I can try to write a reply to her post, but the last time she came around was in October 2004 (2 months ago). |
When you use the following, what does it print out?
PHP Code:
* Just a general comment, not directed to anyone in particular: This bump is the exception, not the rule, so don't expect me to bump old threads even if asked. Thanks. |
Here is where I stand for now:
PDF files are indexing, but Word and XLS files are not.

For those waiting for an answer, my config is:
WinXP SP2, EasyPHP 1.7 (PHP 4.3.3)
EasyPHP is installed in 'd:\serveur'
Phpdig is installed in 'd:\serveur\www\moteur'

My config file for phpdig:
PHP Code:
I am using Xpdf/pdftotext, available here: ftp://ftp.foolabs.com/pub/xpdf/ (http://www.foolabs.com/xpdf/download.html). Get 'xpdf-3.00-win32.zip' (1.08 MB).

Warning: it cannot index PDF files that are password protected! Also, shut down ALL firewalls on your machine before indexing.

As soon as I have the answer for doc and xls files, I will post it. Hope this helps.

Xperienss |
Change:
PHP Code:
PHP Code:
|
I tried that already and no change:

(checked) 431:http://xxx/budgetTresorerie.pdf (temps : 00:58:01)
(not checked) 432:http://xxx/budgetTreso.doc (temps : 00:58:07)

Still not indexing .doc and .xls files, but I am not giving up and I will find the solution sooner or later ;) |
Okay, I see what's wrong now.

When I try to index a .doc file, catdoc.exe seems to see the file but does not create the output file or store it in the right directory. The same happens when I run catdoc in MS-DOS: catdoc reads the text from the doc file and I can see it in my MS-DOS window, but no file is created.

Does anyone have any idea what command we need to use? catdoc manual: http://www.45.free.net/~vitus/ice/ca...atdoc.man.html

I tried:

-------
catdoc -s 8859-1 -f ascii ../../test/test.doc
Test Fichier Word
--------

It reads the text from the doc file but does not create any file. |
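As the post observes, catdoc prints the extracted text to the console instead of creating a file, so the output has to be captured or redirected by whoever calls it (on the command line, something like `catdoc ./test.doc > test.txt`). Below is a minimal Python sketch of the same idea. The helper name `save_command_output` and the file names are hypothetical, and the commented catdoc call assumes catdoc is on the PATH; this is not how phpdig itself does it.

```python
import subprocess
import sys
from pathlib import Path


def save_command_output(command, output_path):
    """Run `command` and write whatever it prints on stdout to `output_path`.

    Returns the number of bytes written.
    """
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    Path(output_path).write_bytes(result.stdout)
    return len(result.stdout)


if __name__ == "__main__":
    # Stand-in for a converter: a tiny Python process that only prints text,
    # the way catdoc prints the words of a .doc file to the console.
    stand_in = [sys.executable, "-c", "print('Banane Fruit Abricot')"]
    save_command_output(stand_in, "test.txt")
    print(Path("test.txt").read_text())
    # With catdoc installed, the call might look like this (untested assumption):
    # save_command_output(["catdoc", "./test.doc"], "test.txt")
```

The point of the sketch is only that the caller, not the converter, decides where the text ends up; a zero-byte result would suggest the converter produced no output at all.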