PhpDig.net

Old 03-13-2004, 12:01 PM   #1
kenazo
Green Mole
 
Join Date: Mar 2004
Posts: 5
Searching external domains/links

Hi! I'm brand shiny new to search engines and am not clear on how this engine searches external domains relative to 'my' domain.

Does it follow links from 'my' domain (let's say www.mine.com) to external domains (let's say www.outside.com)? That is, if I link to an external domain, will it follow that link and index those pages too? If so, can I set the depth it searches on those domains?

Relatedly, can I set it to search only www.mine.com and not follow external links? Or can I give it a list of 20 domains and have it index only those domains?

Phew...hope that is clear!

Thanks.
Old 03-13-2004, 01:25 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
From the admin panel, PhpDig crawls links within a single site. When indexing from the shell, you can instead supply a list of URLs, one per line in a text file. To crawl links from a site to an external site, set PHPDIG_IN_DOMAIN to true in the config file and apply the code change in this thread.
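For the "list of 20 domains" case, the one-URL-per-line text file could be generated rather than typed by hand. This is only a rough sketch in Python, not part of PhpDig: the helper name build_url_list is made up, and the http:// prefix and trailing slash are assumptions about what the start URLs should look like.

```python
def build_url_list(domains):
    """Return text with one start URL per line for the given domains.

    Hypothetical helper, not a PhpDig function.
    """
    lines = []
    for domain in domains:
        domain = domain.strip()
        if not domain:
            continue  # skip blank entries
        # Assumption: plain http:// start URLs with a trailing slash.
        lines.append(f"http://{domain}/")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file (any name you like) gives a list in the one-per-line shape described above.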
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 03-13-2004, 01:37 PM   #3
kenazo
Green Mole
 
Join Date: Mar 2004
Posts: 5
Thanks for the quick reply

So if I set PHPDIG_IN_DOMAIN to true, can I then specify the depth to which it will dig those external links? (If not, it would obviously get out of control!) Does that depth count as part of the depth set for the main search, or does it start from scratch when it hits a new domain?
Old 03-14-2004, 02:55 PM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. If you (a) set PHPDIG_IN_DOMAIN to true in the config.php file and (b) set the else part of the phpdigCompareDomains function to true in the robot_functions.php file, the spider can wind up in a loop. To avoid this loop, use the files in the attached ZIP file below. Those files already apply point (b) above and are for use with version 1.8.0.

As for search depth: with the files from the attached ZIP in place, the search depth is applied separately to each different (sub)domain found. In theory, then, it would be possible to index a site, then a site it links to, then a site that one links to, and so on, with the specified search depth applied afresh to each site.
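To picture the per-site depth rule described above, here is a small Python sketch. It is not PhpDig's code: the function name, the in-memory link graph standing in for fetched pages, and the use of a visited set to stop loops are all assumptions made for illustration.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_with_per_site_depth(start_url, links, max_depth):
    """Breadth-first walk where the depth counter restarts on each new host.

    `links` is a hypothetical link graph (URL -> list of linked URLs)
    used in place of real page fetching.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append((url, depth))
        host = urlparse(url).netloc
        for target in links.get(url, []):
            if target in seen:
                continue  # already queued or visited; this stops loops
            same_site = urlparse(target).netloc == host
            # Depth grows within a site and resets to 0 on a new site.
            next_depth = depth + 1 if same_site else 0
            if next_depth > max_depth:
                continue
            seen.add(target)
            queue.append((target, next_depth))
    return order
```

With a depth of 1, a start page on www.mine.com that links to www.outside.com would be followed one level deep on each site. Because the counter resets per site, the total number of sites reached is not bounded by the depth setting, which is the "site to linked site to linked site" effect mentioned above.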
Attached Files
File Type: zip files.zip (19.3 KB, 71 views)
All times are GMT -8.

