Hi. PhpDig looks for links that match the following regex and then processes those links via the phpdigRewriteUrl function.
PHP Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\\"]refresh['\\"] *content=['"][0-9]+;url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\\'\\"]?((([[a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\\\,._a-zA-Z0-9\|+-]*))(#[.a-zA-Z0-9-]*)?[\\'\\" ]?",$eval,$regs)) {
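In its current form, PhpDig does not take a BASE HREF tag into account when it resolves relative links. For example, when PhpDig crawls the page below from a dir1 directory, it follows dir1/index2.html rather than crawling http://www.domain.com/dir2/index2.html, which is where the link actually points.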
Code:
<HTML>
<HEAD>
<BASE HREF="http://www.domain.com/dir2/index1.html">
</HEAD>
<BODY>
<A HREF="index2.html">test</A>
</BODY>
</HTML>
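To make the expected behaviour concrete, here is a minimal sketch of resolving a relative link against the BASE HREF when one is present. This is not PhpDig code: findBaseHref and resolveAgainstBase are hypothetical helper names, the dir1 page URL is an assumed example, and the sketch needs PHP 7.1 or later. It only handles absolute URLs, root-relative paths and plain relative paths; "../" segments and malformed base URLs are left out. How it would hook into phpdigRewriteUrl is not shown here.

```php
<?php
// Hypothetical helpers for illustration only; not part of PhpDig.

// Return the URL from the first <base href="..."> tag, or null if there is none.
function findBaseHref(string $html): ?string
{
    if (preg_match('/<base\b[^>]*\bhref\s*=\s*["\']?([^"\'\s>]+)/i', $html, $m)) {
        return $m[1];
    }
    return null;
}

// Resolve $link against $base (assumed to be a well-formed absolute URL).
function resolveAgainstBase(string $base, string $link): string
{
    // Link already has a scheme: nothing to resolve.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }
    $parts = parse_url($base);
    $root = $parts['scheme'] . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Root-relative link: keep only scheme, host and port from the base.
    if (substr($link, 0, 1) === '/') {
        return $root . $link;
    }
    // Plain relative link: replace everything after the last "/" of the base path.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

$html = '<HTML><HEAD><BASE HREF="http://www.domain.com/dir2/index1.html"></HEAD>'
      . '<BODY><A HREF="index2.html">test</A></BODY></HTML>';

// Assumed URL of the page being crawled; used only when no BASE tag is found.
$pageUrl = 'http://www.domain.com/dir1/index1.html';

$base = findBaseHref($html) ?? $pageUrl;
echo resolveAgainstBase($base, 'index2.html'), "\n";
// prints http://www.domain.com/dir2/index2.html
```

Without the BASE tag, the same call falls back to the page URL and gives http://www.domain.com/dir1/index2.html, which is what PhpDig does today in both cases.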