02-06-2005, 01:25 AM   #1
WebSpider
How to index linked pages but not go any farther from them?

I'll try to explain it as clearly as possible:

I spider www.domainA.com, which has links to www.domainX.com, www.domainY.com and www.domainZ.com.

How do I set up the digger to spider ALL links on domainA.com (domainX, domainY and domainZ), PLUS enter and spider each of those linked sites, but not go beyond the links found on them?

So:

www.domainA.com
|
|\- www.domainX.com: grab links to domain1, domain2 and domain3.com
|
|\- www.domainY.com: grab links to domain4.com
|
\-- www.domainZ.com: grab links to domain5 and domain6.com

In this figure, my DB would contain:

domainA, domainX, domainY, domainZ, and domain1 to domain6, but nothing beyond domain1 to domain6.
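
To make the intent concrete: what is being asked for amounts to a crawl limited to two link hops from the start site. Below is a rough Python sketch of that behaviour only, not the digger's actual configuration. The `LINKS` table, the `crawl` function and the extra `domain7.com` entry are all made up for illustration; a real spider would fetch pages and extract links instead of reading a dictionary.

```python
from collections import deque

# Hypothetical link graph mirroring the figure above (illustration only).
LINKS = {
    "www.domainA.com": ["www.domainX.com", "www.domainY.com", "www.domainZ.com"],
    "www.domainX.com": ["domain1.com", "domain2.com", "domain3.com"],
    "www.domainY.com": ["domain4.com"],
    "www.domainZ.com": ["domain5.com", "domain6.com"],
    "domain1.com": ["domain7.com"],  # invented extra link that must NOT be followed
}


def crawl(start, max_depth):
    """Breadth-first walk recording every host within max_depth link hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        host, depth = queue.popleft()
        if depth == max_depth:
            continue  # host is recorded, but its own links are not followed
        for target in LINKS.get(host, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


if __name__ == "__main__":
    for host in sorted(crawl("www.domainA.com", max_depth=2)):
        print(host)
```

With `max_depth=2`, the walk records domainA, the three sites it links to, and domain1 to domain6, and stops there: `domain7.com` is never added because domain1 sits at the depth limit.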

Is it clear enough?