
Memory allocation error


olivier
02-11-2005, 11:53 AM
Hello,

I am trying to index ppt, xls, doc, and pdf files...

I am getting the following error with phpdig 1.8.7:
PHP Fatal error: Allowed memory size of 16777216 bytes exhausted (tried to allocate 8193 bytes) in /usr/local/www/data-dist/phpdig/admin/robot_functions.php on line 1026

I tried raising the limit in php.ini to 16M, as you can see in this error message, but without result (a previous error message showed 8M).
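The line I changed in php.ini (16777216 bytes in the error is exactly 16M, so the new value did take effect; it just still wasn't enough):

```ini
; php.ini: per-script memory cap for PHP
memory_limit = 16M
```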

I have followed the Readme file from the forum up to the configuration changes in the phpdig files (which I would like to avoid if I can).

I have seen in another thread that this can be linked to pstotext?

Does anyone have any ideas about how I can troubleshoot this?

thanks,
Olivier.

Charter
02-11-2005, 12:05 PM
Perhaps the following thread might help...

http://www.phpdig.net/forum/showthread.php?t=534

olivier
02-11-2005, 12:52 PM
Yes, that is the thread I was referring to... It seems I have the same problem. Sadly, there is no resolution at the end of that thread...

I tried removing the pdf processing, but I get the same result.

I will try your script...

Do you know what a reasonable value for the memory limit in php.ini would be? The swap size? The RAM size? Half of it?

Thanks,
Olivier.

Charter
02-11-2005, 02:35 PM
Perhaps check the binary program homepages to see if there are any options to, for example, grab text from a subset of pages instead of the whole document. That might require less memory, depending on how it works. Basically, the (lack of a) solution posted in the other thread is a means to skip over larger documents. PHP itself is running out of memory when trying to index those larger documents in one fell swoop. If you are able, try 32MB or 64MB, but other than that, I don't have a way to keep PhpDig going when PHP runs out of memory.
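For example, if pdftotext is the binary handling PDFs, it has -f (first page) and -l (last page) options, so a wrapper could extract only the first few pages. This is only a sketch; the helper name and the 10-page cap are made up:

```php
<?php
// Hypothetical helper: build a pdftotext command that converts only
// the first $max_pages pages of a PDF to text. -f/-l are pdftotext's
// first/last page flags; extracting a page subset keeps the amount of
// text PHP has to hold in memory small.
function build_pdftotext_cmd($pdf_path, $txt_path, $max_pages) {
    return sprintf('pdftotext -f 1 -l %d %s %s',
        $max_pages,
        escapeshellarg($pdf_path),
        escapeshellarg($txt_path));
}

// Prints something like: pdftotext -f 1 -l 10 'big.pdf' 'big.txt'
echo build_pdftotext_cmd('big.pdf', 'big.txt', 10), "\n";
```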

olivier
02-13-2005, 03:48 PM
Is there a way to automatically ignore large files with PhpDig?

Because I can't remove all those files manually; there are too many of them...

thanks,
Olivier.

Charter
02-13-2005, 04:56 PM
You could try the code in the other thread. If that doesn't work, you could look in robot_functions.php and find:

if (in_array($result_test['status'],array('MSWORD','MSEXCEL','PDF','MSPOWERPOINT'))) {
    $bin_file = 1;
    $file_content = array();
    $fp = fopen($uri,"rb");
    while (!feof($fp)) {
        $file_content[] = fread($fp,8192);
    }
    fclose($fp);
}
else {
    $bin_file = 0;
    $file_content = phpdigGetUrl($uri,$result_test['cookies']);
}

And replace with:

if (in_array($result_test['status'],array('MSWORD','MSEXCEL','PDF','MSPOWERPOINT'))) {
    $bin_file = 1;
    $file_content = array();
    $fp = fopen($uri,"rb");
    $oh_stop_me = 0;
    // bail out of the read loop after XXXXX chunks of 8192 bytes
    while (!feof($fp) && $oh_stop_me < XXXXX) {
        $file_content[] = fread($fp,8192);
        $oh_stop_me++;
    }
    fclose($fp);
}
else {
    $bin_file = 0;
    $file_content = phpdigGetUrl($uri,$result_test['cookies']);
}

Where XXXXX is the number of 8192-byte iterations you want to allow. Alternatively, you could modify the code to sum the bytes read at each while-step and limit on total size instead.
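A sketch of that size-based alternative (the function name and the cap value are made up; in the real code the loop body would replace the while loop inline rather than live in a function):

```php
<?php
// Sketch: stop reading once the accumulated byte count reaches a cap,
// instead of counting loop iterations. The 8192-byte chunk size
// matches the fread() call in robot_functions.php.
function phpdig_read_capped($uri, $max_bytes) {
    $file_content = array();
    $bytes_read = 0;
    $fp = fopen($uri, "rb");
    while (!feof($fp) && $bytes_read < $max_bytes) {
        $chunk = fread($fp, 8192);
        $file_content[] = $chunk;
        $bytes_read += strlen($chunk);
    }
    fclose($fp);
    return $file_content;
}
```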

If you do this, you might run into a problem should a binary file get truncated in an inappropriate location, so you'd need to figure out how to define a 'correct' cut point.

I don't have any other ideas at the moment.

olivier
02-17-2005, 03:37 AM
Thanks Charter,

I will try all this.