Searching external domains/links


kenazo
03-13-2004, 12:01 PM
Hi! I'm brand shiny new to search engines and am not clear on how this engine searches external domains relative to 'my' domain.

Does it follow links from 'my' domain (let's say www.mine.com) to external domains (let's say www.outside.com)? That is, if I have a link to an external domain, will it follow that link and index those pages as well? If so, can I set the depth it searches on those domains?

Related to this, can I simply set it to search only www.mine.com and not follow external links? Or can I give it a list of 20 domains and have it index only those domains?

Phew...hope that is clear!

Thanks.

Charter
03-13-2004, 01:25 PM
From the admin panel, PhpDig crawls links within one site at a time. When indexing from the shell (http://www.phpdig.net/navigation.php?action=doc#toc8), a list of URLs can be specified, one per line, in a text file. To crawl links from one site to an external site, set PHPDIG_IN_DOMAIN to true in the config file and apply the code change in this (http://www.phpdig.net/showthread.php?threadid=177) thread.
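
To illustrate the general idea behind "stay on one site" versus "index only a fixed list of domains", here is a minimal Python sketch. It is not PhpDig code and does not reproduce phpdigCompareDomains; the function names host_of and should_follow and the allowed_hosts parameter are hypothetical, made up for this example.

```python
from urllib.parse import urljoin, urlparse


def host_of(url):
    """Return the lowercased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def should_follow(page_url, href, allowed_hosts=None):
    """Decide whether a link found on page_url should be crawled.

    With allowed_hosts=None, only links on the same host as the page are
    followed. With a set of host names, any link whose host is in that
    set is followed, which models "index only these domains".
    """
    target = urljoin(page_url, href)  # resolve relative links first
    target_host = host_of(target)
    if allowed_hosts is not None:
        return target_host in allowed_hosts
    return target_host == host_of(page_url)
```

With no allow-list, a link from www.mine.com to www.outside.com is rejected; passing both host names in allowed_hosts lets it through.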

kenazo
03-13-2004, 01:37 PM
Thanks for the quick reply :)

So if I set PHPDIG_IN_DOMAIN to true, can I then specify the depth to which it will dig those external links? (If not, it would obviously get out of control!) Is that depth counted as part of the depth set for the main search, or does it start from scratch when it hits a new domain?

Charter
03-14-2004, 02:55 PM
Hi. If you (a) set PHPDIG_IN_DOMAIN to true in the config.php file and (b) set the else part of the phpdigCompareDomains function to true in the robot_functions.php file, then the spider can wind up in a loop. To avoid this loop, use the files in the attached ZIP file below. The files in the ZIP already apply point (b) above and are for use with version 1.8.0.

As for search depth: with the files from the attached ZIP in place (to avoid the loop mentioned above), the search depth is applied separately to each different (sub)domain found. So in theory it would be possible to index site to linked site to linked site, etcetera, with the specified search depth applied to each site in turn.
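
The per-site depth behaviour described above can be sketched roughly as follows. This is a toy Python model over an in-memory link graph, not the PhpDig spider or the code in the ZIP. The function name crawl, the links dictionary, and the choice to restart the depth counter at 0 on the first page of a newly reached host are all assumptions made for illustration.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, links, max_depth):
    """Breadth-first crawl where depth is counted per host.

    links maps each URL to the URLs it links to (a stand-in for fetching
    and parsing pages). Following a link to a different host restarts the
    depth counter at 0 (an assumption), so max_depth applies to each site
    separately. The seen set prevents revisiting pages, which is what
    keeps cross-site links from causing an endless loop.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            same_host = urlparse(nxt).hostname == urlparse(url).hostname
            nxt_depth = depth + 1 if same_host else 0
            if nxt_depth > max_depth:
                continue  # too deep within this host
            seen.add(nxt)
            queue.append((nxt, nxt_depth))
    return order
```

In this model, a depth of 1 on www.mine.com still allows one level of links on www.outside.com once a link to it is found, which is why an unrestricted cross-site crawl can grow quickly.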