PhpDig.net

no msword to txt parsing (http://www.phpdig.net/forum/showthread.php?t=1055)

lolodev 07-09-2004 10:22 AM

no msword to txt parsing
 
Hello,

(I have versions 1.8.1 and 1.8.0 on my site.)

I made a simple test page containing:

<a href="http://quito.citipo.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br>

When I index it, a temporary file is created in admin/temp/xxxx.tmp for this .doc, but it seems that PhpDig does not parse this file as text.

I don't know why.

Thanks

lolodev 07-09-2004 11:15 AM

no msword indexing
 
Hello,

I continued my tests.

I put an echo at line 461 of the spider.php script.

The page I am indexing is test.php:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Sans titre</title>
</head>
<body>
<a href="http://quito.citipro.fr/modules/documents/rep2/DocUtil.doc">Docutilisateur</a><br>
</body>
</html>


The result is:

SITE : http://quito.citipro.fr/
Exclude paths :
- @NONE@
Resource id #5**../admin/temp/81475511.tmp**245**15********
test.php**HTML**20040709211142**20040709211125**Array**
1:http://quito.citipro.fr/test.php
(time : 00:00:22)
+
level 1...
Resource id #5**0**0**15******modules/documents/rep2/**
DocUtil.doc**MSWORD**20040709211152**20040708082318****
2:http://quito.citipro.fr/modules/docu...p2/DocUtil.doc
(time : 00:00:32)

No link in temporary table

No temporary file is created for the MS Word document.

Thanks

Charter 07-09-2004 11:20 AM

Hi. There is a checklist here to help with troubleshooting.

lolodev 07-10-2004 01:23 PM

always catdoc
 
Hello,

Thank you for the link to the thread. I went through your checklist and everything on it checks out, but ...

When I index my .doc, the response is:

Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/44148632.tmp
Result contains: Array ( )
Return value is: 127

but nothing is recorded in the database.

When I run catdoc from the command line on my Linux system, it converts my MS Word file fine.

What is happening?

Are there any French users on this forum?

Charter 07-10-2004 01:33 PM

Hi. In robot_functions.php find:
PHP Code:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2;

and replace with:
PHP Code:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';

to see what issue occurs.
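
A minimal sketch of why appending 2>&1 helps, assuming a POSIX shell: PHP's exec() collects only standard output, so the shell's error message is lost unless stderr is redirected into stdout. The helper run_and_report() and the path /nonexistent/catdoc are hypothetical and not part of PhpDig; the labels simply mirror the debug lines quoted in this thread.

```php
<?php
// Hypothetical helper, not part of PhpDig: runs a shell command and
// returns what exec() captured together with the exit status.
function run_and_report(string $command): array
{
    $full = $command . ' 2>&1'; // fold stderr into stdout so errors are captured
    $output = [];
    $status = 0;
    exec($full, $output, $status); // $output receives one array entry per line
    return ['command' => $full, 'output' => $output, 'status' => $status];
}

// A deliberately wrong path, standing in for a misconfigured PHPDIG_PARSE_MSWORD.
$report = run_and_report('/nonexistent/catdoc -s 8859-1 file.tmp');
echo 'Command is: ', $report['command'], "\n";
echo 'Result contains: ', print_r($report['output'], true), "\n";
echo 'Return value is: ', $report['status'], "\n";
```

Without the redirection the output array stays empty, which matches the earlier "Result contains: Array ( )" response.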

lolodev 07-10-2004 01:44 PM

Hi (23:44 in France),

Here is the response with the code modification:

Command is: /home/mutualiseweb/catdoc-0.93.3 -s 8859-1 ../admin/temp/38346732.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3: is a directory )
Return value is: 126

Strange: when I run /home/mutualiseweb/catdoc -s 8859-1 mymsword.doc from the command line, catdoc runs. But when I change

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3');

to

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc');

PhpDig does not recognize my MS Word file.

Charter 07-10-2004 01:47 PM

Hi. Does this work?
PHP Code:

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3/catdoc'); 


lolodev 07-10-2004 01:49 PM

LOL, I tried this before your post.

No! It doesn't work.

Charter 07-10-2004 01:51 PM

Hi. What does
PHP Code:

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc-0.93.3/catdoc'); 

give you when you use
PHP Code:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';


lolodev 07-10-2004 01:53 PM

Command is: /home/mutualiseweb/catdoc-0.93.3/catdoc -s 8859-1 ../admin/temp/39511712.tmp 2>&1
Result contains: Array ( [0] => sh: line 1: /home/mutualiseweb/catdoc-0.93.3/catdoc: No such file or directory )
Return value is: 127
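
The two return values seen so far follow the usual shell convention: 126 means the path was found but could not be executed (here, because it is a directory), and 127 means nothing was found at that path. A small sketch of that mapping, where explain_exit_status() is a hypothetical helper and not part of PhpDig:

```php
<?php
// Hypothetical helper: translates the shell exit statuses that show up
// when a command cannot be started at all.
function explain_exit_status(int $status): string
{
    switch ($status) {
        case 0:
            return 'command ran and reported success';
        case 126:
            return 'path found but cannot be executed (a directory, or no execute permission)';
        case 127:
            return 'command not found at that path';
        default:
            return 'command exited with status ' . $status;
    }
}

echo '126: ', explain_exit_status(126), "\n";
echo '127: ', explain_exit_status(127), "\n";
```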

Charter 07-10-2004 01:56 PM

Hi. What does
PHP Code:

define('PHPDIG_PARSE_MSWORD','/home/mutualiseweb/catdoc'); 

give you when you use
PHP Code:

$command = PHPDIG_PARSE_MSWORD.' '.PHPDIG_OPTION_MSWORD.' '.$tempfile2.' 2>&1';

Also, does catdoc have 755 permissions?
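
Both questions (is the path right, and is the file executable) can be checked from PHP before indexing. The following is only a sketch: check_binary() is a hypothetical helper, not part of PhpDig, and the path is the one tried in this thread.

```php
<?php
// Hypothetical pre-flight check, not part of PhpDig: reports why a
// configured binary path could not be run by the web server user.
function check_binary(string $path): string
{
    if (!file_exists($path)) {
        return 'missing';        // the shell would return 127
    }
    if (is_dir($path)) {
        return 'directory';      // the shell would return 126
    }
    if (!is_executable($path)) {
        return 'not executable'; // e.g. permissions are not 755
    }
    return 'ok';
}

// Path taken from the thread; adjust it to wherever the binary really lives.
$path = '/home/mutualiseweb/catdoc-0.93.3/catdoc';
echo $path, ': ', check_binary($path), "\n";
```

The directory test comes before is_executable() because a directory with search permission can also be reported as executable.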

lolodev 07-10-2004 02:02 PM

OK!!
It is all my fault.

My catdoc is under /home/mutualiseweb/catdoc-0.93.3/src/ MY GOD

A small question about .pdf files: is it necessary to install GHOST?

:))) Sorry

lolodev 07-10-2004 02:03 PM

THANKS A LOT

Charter 07-10-2004 02:11 PM

LOL, paths and permissions. ;)

For PDFs, perhaps try getting a precompiled pdftotext. Directions are in this post.


All times are GMT -8.