Old 11-25-2003, 09:17 AM   #2
Charter
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. You might try adding a robots.txt file in the web root with the following, assuming it's the index.html of the main site that you don't want crawled:

User-agent: PhpDig
Disallow: /index.html

Note that Disallow paths should start with a slash and are relative to the web root.

To remove the '-' index links that were already crawled, go to the admin panel, click a site, click the update button, click a blue arrow, and then, on the right side, click the red X next to each link you want to delete.

Another option, if you have shell access, is to crawl via the command line using a text file that contains only the links you want crawled, one per line. Three options in the config file (SPIDER_MAX_LIMIT, SPIDER_DEFAULT_LIMIT, RESPIDER_LIMIT) can be set to limit the number of levels crawled when indexing from the shell.
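As a rough sketch, assuming the PhpDig spider script lives at admin/spider.php in your install and accepts the path to a URL list as its argument (check your version's docs, since the script path and argument form here are assumptions), the shell workflow might look like this:

```shell
# Build a text file containing only the links you want crawled, one per line.
# The URLs below are placeholders; substitute your own.
cat > links.txt <<'EOF'
http://www.example.com/docs/page1.html
http://www.example.com/docs/page2.html
EOF

# Then point the PhpDig spider at that file, e.g.:
#   php -f admin/spider.php links.txt
# (script location and invocation are assumptions; adjust to your install)

# Show what will be crawled.
cat links.txt
```

To keep the crawl shallow, you could also lower the depth limits in the config file before running, e.g. setting SPIDER_DEFAULT_LIMIT to 1 so only the listed pages themselves are indexed (the exact values to use depend on your site layout).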
__________________
Responses are offered on a voluntary, as-time-is-available basis, with no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email. Thank you for your understanding.