04-15-2004, 01:32 PM   #9
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Part of this could be solved by adding DISTINCT to the following query in the spider.php file (or by rewriting it as a join query):
PHP Code:
$query = "SELECT DISTINCT(".PHPDIG_DB_PREFIX."sites.site_id),".PHPDIG_DB_PREFIX."sites.site_url,"
.PHPDIG_DB_PREFIX."sites.username as user,".PHPDIG_DB_PREFIX."sites.password as pass,"
.PHPDIG_DB_PREFIX."sites.port FROM ".PHPDIG_DB_PREFIX."sites,".PHPDIG_DB_PREFIX."tempspider WHERE "
.PHPDIG_DB_PREFIX."sites.site_id = ".PHPDIG_DB_PREFIX."tempspider.site_id";
This should make it so that if file1 contains domainA and domainB, the bot1 array will contain only one instance of each domain. I say partly solved because once bot1 has run on domainA and domainB, there will be rows in the tempspider table, so when bot2 runs file2 containing domainC and domainD, the bot2 array will be domainA, domainB, domainC, domainD.

I suppose AND ".PHPDIG_DB_PREFIX."sites.locked = 0" could be added to the WHERE part of the above query, but that still doesn't guarantee unique arrays across bots unless each bot gets a chance to lock its sites before the next bot is started, and the next bot is started before the earlier bots unlock their sites. Even then, the tempspider table would need to be emptied after all bots are done.
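For illustration, here is a rough sketch of what the query might look like with the locked condition added, plus a cleanup query for the tempspider table. This is not tested against PhpDig itself: the $sites, $temp and $cleanup variables are names made up for this example, and PHPDIG_DB_PREFIX is defined as an empty string only so the snippet can run on its own.

```php
<?php
// Assumption: PhpDig's config normally defines PHPDIG_DB_PREFIX;
// it is defined here only so this sketch runs standalone.
if (!defined('PHPDIG_DB_PREFIX')) {
    define('PHPDIG_DB_PREFIX', '');
}

// Hypothetical helper variables, used only to shorten the query text.
$sites = PHPDIG_DB_PREFIX . 'sites';
$temp  = PHPDIG_DB_PREFIX . 'tempspider';

// Same select as above, with DISTINCT and the extra "locked = 0" filter.
$query = "SELECT DISTINCT $sites.site_id, $sites.site_url, "
       . "$sites.username AS user, $sites.password AS pass, $sites.port "
       . "FROM $sites, $temp "
       . "WHERE $sites.site_id = $temp.site_id "
       . "AND $sites.locked = 0";

// Cleanup to run only once every bot has finished.
// How to detect that all bots are done is left out of this sketch.
$cleanup = "DELETE FROM $temp";

echo $query, "\n", $cleanup, "\n";
```

The locked filter only narrows the window for overlap; as noted above, it still depends on the order in which the bots lock and unlock their sites.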