URL Bot


RaGe
04-29-2004, 05:59 AM
So far I've seen the spider functions deal only with spidering a particular site and returning results from within the spidered URL. An option that let the admin ignore the base URL and return only links to external URLs would allow spidering a link farm or links page and harvesting those links back into PhpDig. For example:

I built a CGI engine and have tons of links indexed on it. If I use PhpDig to spider the links from that engine, it returns links to my own URL instead of ignoring the base URL and spidering the external links at a depth of 1. What I'm after is a URL harvester spider rather than just a site spider.
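
To make the request concrete, here is a rough sketch of the "ignore the base URL, keep only external links" step at a depth of 1. It is written in Python purely for illustration and is not PhpDig code or the CGI engine mentioned above; the names `harvest_external_links` and `_AnchorCollector` are made up for this example, and it assumes the page has already been fetched into a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _AnchorCollector(HTMLParser):
    """Collects the raw href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def harvest_external_links(html, base_url):
    """Return absolute http(s) links whose host differs from base_url's host.

    Assumption: "external" means a different host name, so www.example.com
    and example.com are treated as different sites here.
    """
    base_host = urlparse(base_url).netloc.lower()
    collector = _AnchorCollector()
    collector.feed(html)

    external = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(base_url, href)
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parts.netloc.lower() == base_host:
            continue  # skip links back into the spidered site
        if absolute not in external:
            external.append(absolute)  # keep first occurrence only
    return external
```

A spider with this option would feed the returned list into its index instead of following the internal links, which is the harvester behaviour described above.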

My CGI engine does this with ease: I can spider a particular DMOZ directory and bring back only the links and their relative URLs. If someone out there (in the Mole Squad) is proficient in both PHP and CGI, I'd be willing to make my engine available, and perhaps we can port its spider functions into PhpDig and save everyone some raw coding time.

It also has admin features for visitor-added URLs, which can be edited directly rather than just spidered. At the moment I see no way of editing spidered or user-submitted URLs in PhpDig without doing so at the SQL level, so that might also be a useful PhpDig function to consider.