PhpDig.net

can't index pdf using pdftotext (http://www.phpdig.net/forum/showthread.php?t=1158)

rom 08-12-2004 12:26 PM

Hi Charter,

I read through the memory thread and looked up my memory_limit, which is 10M. I also tried this memory_get_usage example from the php.net site:

<?php
// This is only an example, the numbers below will
// differ depending on your system
echo memory_get_usage() . "\n"; // 36640
$a = str_repeat("Hello", 4242);
echo memory_get_usage() . "\n"; // 57960
unset($a);
echo memory_get_usage() . "\n"; // 36744
?>

My server returned this:
16704 38000 16784
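
To put readings like these next to the configured limit, here is a minimal sketch that converts a php.ini shorthand value such as 10M into bytes and reports how much headroom is left. The helper names `shorthand_to_bytes` and `memory_headroom` are made up for illustration and are not part of PhpDig or PHP itself.

```php
<?php
// Hypothetical helper (not part of PhpDig): convert php.ini shorthand
// such as "10M" or "512K" into a byte count. Returns -1 for "no limit".
function shorthand_to_bytes($value)
{
    $value = trim((string) $value);
    if ($value === '' || $value === '-1') {
        return -1; // php.ini uses -1 to mean unlimited
    }
    $unit = strtoupper(substr($value, -1));
    $number = (int) $value; // the cast keeps the leading digits only
    switch ($unit) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number; // plain byte count, no suffix
    }
}

// Hypothetical helper: bytes still available under the limit,
// or null when the limit is unlimited.
function memory_headroom($limitShorthand, $usedBytes)
{
    $limit = shorthand_to_bytes($limitShorthand);
    if ($limit < 0) {
        return null;
    }
    return $limit - $usedBytes;
}

$limit = ini_get('memory_limit');
$headroom = memory_headroom($limit, memory_get_usage());
echo "memory_limit: $limit\n";
echo 'headroom: ' . ($headroom === null ? 'unlimited' : $headroom . ' bytes') . "\n";
```

With a 10M limit that is about 10.4 million bytes in total, so a 4.6 MB file that gets read into memory and then copied or converted once could plausibly use most of it. That is a rough estimate, not a measurement of what PhpDig actually allocates.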

Based on where the spidering ends, I now know that it is getting hung up on one 4.6 MB PDF.

From the memory thread, I wasn't sure what else to do, because at the end of the thread Tomas says nothing worked. Is there something that can be done to skip over that one file?

Thanks again.

Rom

Charter 08-15-2004 03:15 PM

Hi. Did you try something like in this post?

rom 08-25-2004 04:05 PM

hi charter,

tried your suggestion above. the indexing just stops partway through, apparently when it reaches the 4.6 MB file. it doesn't skip over it.

thanks,

rom

Charter 08-25-2004 07:10 PM

Hi. Did you try this too?

rom 08-26-2004 09:02 AM

Hi Charter,

Tried that also, but again it stops partway through indexing, when it reaches the 4.6 MB file.

Thanks,

Rom

Charter 08-26-2004 10:21 AM

Assuming you are using 1.8.3, try moving this code:
PHP Code:

if (memory_get_usage() + 1000000 > 3000000) {
    return array('tempfile'=>0,'tempfilesize'=>0);
}


to be right after the following in the robot_functions.php file:
PHP Code:

// $file_content = @file($uri); ///////////////////////////////////////////////// 
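
As far as can be told from the snippet, the point of that check is to bail out before a file is read when memory use is already close to a ceiling. Below is a standalone sketch of that pattern. The function names `should_skip_for_memory` and `fetch_or_skip` are hypothetical, the greater-than comparison is an assumption, the 1000000 and 3000000 figures are taken from the snippet, and none of this is PhpDig's actual code.

```php
<?php
// Hypothetical stand-alone guard, not PhpDig's own code.
// Assumption: skip when used memory plus a reserve exceeds the ceiling.
function should_skip_for_memory($usedBytes, $reserveBytes = 1000000, $ceilingBytes = 3000000)
{
    return $usedBytes + $reserveBytes > $ceilingBytes;
}

// Return the same "empty" result shape as the snippet when the guard trips;
// otherwise report the path and its size without reading the contents.
function fetch_or_skip($path, $usedBytes)
{
    if (should_skip_for_memory($usedBytes)) {
        return array('tempfile' => 0, 'tempfilesize' => 0);
    }
    return array('tempfile' => $path, 'tempfilesize' => filesize($path));
}

// Demo against a small temporary file.
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, str_repeat('x', 1024));

$result = fetch_or_skip($tmp, memory_get_usage());
if ($result['tempfile'] === 0) {
    echo "skipped: memory too close to the ceiling\n";
} else {
    echo "ok: {$result['tempfilesize']} bytes\n";
}

unlink($tmp);
```

The guard only helps if it runs before the expensive read, which is why its position relative to the file-reading line matters.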


rom 08-26-2004 01:55 PM

i'm using 1.8.0. should i upgrade first?

rom 08-27-2004 04:11 PM

I tried moving the lines as directed. Still stops indexing part way through at the same spot.

