PhpDig.net


marco 03-23-2007 07:14 AM

Crawler speed improvement (although affects limit)
 
I had the problem that phpdigExplore() returns too many duplicate links. This caused the spider to check hundreds of duplicate URLs, which slowed it down, and the 1000-page limit was hit quite fast.

Finally, I added the following code at the end of phpdigExplore():
PHP Code:

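// Keep a per-session list of links that have already been returned.
if (!isset($_SESSION["links"])) $_SESSION["links"] = array();
$resultlinks = array();
foreach ($links as $link) {
    // Compare against false explicitly: array_search() returns index 0 for
    // the first stored link, which a plain !array_search() treats as "not found".
    if (array_search($link, $_SESSION["links"]) === false) {
        $_SESSION["links"][] = $link;
        $resultlinks[] = $link;
    }
}
return $resultlinks;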

I don't know whether this modification is useful or harms other components. But for the moment, it works.
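
If the session list grows large, scanning it with array_search() for every link gets slow in its own right. A possible variant, only a minimal sketch and not part of PhpDig, keeps the seen links as array keys so each check is a single isset() lookup. The helper name filterNewLinks() and the $seen parameter are made up for illustration; $seen stands in for $_SESSION["links"]. Note that keying on serialize() is a stricter comparison than array_search(), so it may treat values as different that a loose comparison would consider equal.

```php
<?php
// Hypothetical helper, not part of PhpDig: returns only the links that are
// not yet recorded in $seen, and records them. $seen stands in for
// $_SESSION["links"] and is passed by reference so it persists across calls.
function filterNewLinks(array $links, array &$seen)
{
    $result = array();
    foreach ($links as $link) {
        // serialize() yields a usable key whether a link is a string or an array.
        $key = md5(serialize($link));
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $result[] = $link;
        }
    }
    return $result;
}

// Usage with plain strings as example links.
$seen = array();
$first = filterNewLinks(array("a.html", "b.html", "a.html"), $seen);
$second = filterNewLinks(array("b.html", "c.html"), $seen);
```

Here $first holds a.html and b.html, and $second holds only c.html, because b.html was already recorded by the first call.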


All times are GMT -8.
