PhpDig.net


Wayne McBryde 01-27-2004 12:50 PM

Command Line Spider spiders all sites
 
I’m still working to install 1.8.0. I’m building a new database and have a LOT of sites to spider. I created 9 text files with domain names, url_list_1.txt through url_list_9.txt.
When I entered “php -f spider.php url_list_1.txt”, the spider spidered the sites in that text file. When I enter “php -f spider.php url_list_2.txt”, the spider spiders the sites in list 2 and then respiders the sites from list 1. Is this normal, or am I doing something wrong?

Charter 01-27-2004 01:57 PM

Hi. If you are still using version 1.6.5, then PhpDig will spider that way. Once you upgrade to 1.8.0, only the URLs in each file will be crawled.

Wayne McBryde 01-27-2004 04:37 PM

It is 1.8.0 that I am having this problem with.

Charter 01-27-2004 05:15 PM

Hi. Between runs, check that the tempspider table is empty. If it's not empty, then empty it. You can do this by clicking the delete button in the admin panel without selecting a site, or by running the following query:
Code:

DELETE FROM tempspider;
Sometimes things can get left in the tempspider table when there is no error but the corresponding page hasn't been indexed. This can happen if the spidering process is terminated prematurely.
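
For a batch of list files like the nine described above, the cleanup could be scripted so it happens before every run. This is only a rough sketch, not something shipped with PhpDig. It assumes the mysql command-line client is installed, that credentials come from ~/.my.cnf, that the database is called "phpdig" (a made-up name), and that the tempspider table has no prefix. The run and spider_lists helpers and the DRY_RUN switch are hypothetical names invented for this example.

```shell
#!/bin/sh
# Sketch: empty tempspider before spidering each URL list.
# Assumptions (not from the thread): mysql client available, credentials
# in ~/.my.cnf, database named "phpdig", table has no prefix.
DB_NAME="${DB_NAME:-phpdig}"   # hypothetical database name
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "$@"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

spider_lists() {
    # For each list file: clear leftovers, then spider that list only.
    for list in "$@"; do
        run mysql -e "DELETE FROM tempspider;" "$DB_NAME"
        run php -f spider.php "$list"
    done
}
```

Calling spider_lists url_list_1.txt url_list_2.txt with the default DRY_RUN=1 only prints the commands it would run, which makes it easy to check them first. Setting DRY_RUN=0 executes them from the directory that contains spider.php.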


All times are GMT -8.