11-30-2004, 08:24 AM   #1
indeh
Green Mole
 
Join Date: Oct 2004
Posts: 3
I was experiencing the same problem as Ensim, namely an attempt to spider from the command line would always return "No link in temporary table". I took the time to trace spider.php, and found that setting LIMIT_TO_DIRECTORY to false in config.php solved the problem.
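
For anyone wanting to try the same thing, here is a minimal sketch of that change. It assumes config.php sets the option with a plain define() call; the exact line (and any comment after it) may look different in your copy.

```php
<?php
// config.php (sketch): assumes the option is set via define().
// The surrounding lines in a real config.php are omitted here.
define('LIMIT_TO_DIRECTORY', false);  // previously true
```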

The code in question is at or about line 270 of spider.php (my line numbers may be a little off since I tidied the code up a bit to aid in reading). Specifically:

PHP Code:
if (!(LIMIT_TO_DIRECTORY)) {
    if ($links_per_lev == 0) {
        // Copy every spider row for this site into tempspider
        $query_tempspider = "INSERT INTO ".PHPDIG_DB_PREFIX."tempspider (site_id,file,path) SELECT site_id,file,path FROM ".PHPDIG_DB_PREFIX."spider WHERE site_id=$site_id $andmore_tempspider";
        mysql_query($query_tempspider,$id_connect);
    }
    else {
        // Count the level-0 rows already in tempspider for this site
        $query_count_lev = mysql_query("SELECT COUNT(*) as cnt FROM ".PHPDIG_DB_PREFIX."tempspider WHERE site_id = $site_id and level = 0",$id_connect);
        $query_count_arr = mysql_fetch_array($query_count_lev);
        $query_count_num = $query_count_arr['cnt'];
        if ($query_count_num > $links_per_lev) {
            // Too many rows: delete the excess
            $level_lim = $query_count_num - $links_per_lev;
            $query_tempspider = "DELETE FROM ".PHPDIG_DB_PREFIX."tempspider WHERE level = 0 LIMIT $level_lim";
            mysql_query($query_tempspider,$id_connect);
            $flag_for_inserts_check1 = 1;
        }
        elseif (($links_per_lev > $query_count_num) &&
                ($flag_for_inserts_check1 == 0)) {
            // Too few rows: insert only as many as are missing
            $level_lim = $links_per_lev - $query_count_num;
            $query_tempspider = "INSERT INTO ".PHPDIG_DB_PREFIX."tempspider (site_id,file,path) SELECT site_id,file,path FROM ".PHPDIG_DB_PREFIX."spider WHERE site_id=$site_id $andmore_tempspider LIMIT $level_lim";
            mysql_query($query_tempspider,$id_connect);
        }
    }
}

It seems that if LIMIT_TO_DIRECTORY is set to true, the tempspider table is never populated and spidering never begins. Please correct me if I'm wrong, though, since I only studied it enough to get it working for me.
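
To make the branch logic easier to follow, here is a rough sketch that mirrors the quoted block without touching the database. The function tempspider_action() and its return values are hypothetical and not part of phpDig, and the comparisons are as I read them in spider.php. It only models this one block, so it cannot show whether some other part of spider.php fills tempspider when LIMIT_TO_DIRECTORY is true.

```php
<?php
// Hypothetical helper, not part of phpDig: reports which query the quoted
// block would run, as array(action, row_limit), instead of running it.
function tempspider_action($limit_to_directory, $links_per_lev, $count_level0, $inserts_flag)
{
    if ($limit_to_directory) {
        return array('none', 0);        // the whole block is skipped
    }
    if ($links_per_lev == 0) {
        return array('insert_all', 0);  // INSERT ... SELECT with no LIMIT
    }
    if ($count_level0 > $links_per_lev) {
        // DELETE ... LIMIT (count - links_per_lev)
        return array('delete', $count_level0 - $links_per_lev);
    }
    if ($links_per_lev > $count_level0 && $inserts_flag == 0) {
        // INSERT ... SELECT ... LIMIT (links_per_lev - count)
        return array('insert', $links_per_lev - $count_level0);
    }
    return array('none', 0);            // counts already match, or flag is set
}

echo implode(' ', tempspider_action(true, 0, 0, 0)), "\n";   // none 0
echo implode(' ', tempspider_action(false, 0, 0, 0)), "\n";  // insert_all 0
echo implode(' ', tempspider_action(false, 5, 8, 0)), "\n";  // delete 3
echo implode(' ', tempspider_action(false, 5, 2, 0)), "\n";  // insert 3
```

With the first argument set to true, the sketch returns 'none' whatever the other values are, which matches what I saw: nothing in this block writes to tempspider in that case.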

For the record, I'm running spider.php as follows:
Code:
php -f /path/to/my/site/dig/admin/spider.php all
I have a single site in the database, with the site_url formatted as 'http://www.domain.com/' (with the trailing slash).