If the link were as follows:
Code:
http://www.domain.com/dir/0,,contentMDK:20295425~pagePK:64156298~piPK:64152276~theSitePK:489784,00.html
Then the request sent to the server is as follows:
Code:
127.0.0.1 - - [09/Jan/2005:10:00:30 -0800] "HEAD /20295425~pagePK:64156298~piPK:64152276~theSitePK:489784,00.html HTTP/1.1" 404 0 "-" "PhpDig/1.8.6 (+http://www.phpdig.net/robot.php)"
See how at the first ":" it busts?
There are actually two spots in robot_functions.php to edit:
- One
Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;[[:blank:]]*url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?",$eval,$regs)) {
- Two
Code:
while(eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9 ()~-]*))[#\'\" ]?)",$line,$regs)) {
I don't have the time now to further diagnose, but maybe this tidbit will help you edit the two regexs.
Oh, and vB inserts space if there are too many chars in a row without space, so take that into account when considering the code posted herein.