Old 01-09-2005, 09:15 AM   #5
Charter
If the link were as follows:
Code:
http://www.domain.com/dir/0,,contentMDK:20295425~pagePK:64156298~piPK:64152276~theSitePK:489784,00.html
Then the request sent to the server is as follows:
Code:
127.0.0.1 - - [09/Jan/2005:10:00:30 -0800] "HEAD /20295425~pagePK:64156298~piPK:64152276~theSitePK:489784,00.html HTTP/1.1" 404 0 "-" "PhpDig/1.8.6 (+http://www.phpdig.net/robot.php)"
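See how it busts at the first ":"? The requested path only starts after that colon, so "dir/0,,contentMDK:" is lost and the server returns a 404.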

There are actually two spots in robot_functions.php to edit:

- One
Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;[[:blank:]]*url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?",$eval,$regs)) {
- Two
Code:
while(eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9 ()~-]*))[#\'\" ]?)",$line,$regs)) {
I don't have time right now to diagnose this further, but maybe this tidbit will help you edit the two regexes.

Oh, and vB inserts a space when there are too many characters in a row without one, so take that into account when reading the code posted here.
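
For anyone who wants to poke at the two regexes, here is a rough sketch for trying the sample link against the URL part of the second pattern. It is not PhpDig code. Assumptions: extract_href() is a made-up helper name, preg_match() stands in for eregi() (so the pattern is only a PCRE approximation of the POSIX one), and the backslash from the original character class is left out.

```php
<?php
// Sketch only: approximates the href-matching part of the second regex above.
// extract_href() is a hypothetical helper, not a PhpDig function.
function extract_href(string $html): ?string
{
    // Optional scheme://host[:port], then the same path characters as the
    // posted class (minus the backslash), including ":" "," and "~".
    $pattern = '#<a[^>]*href[[:blank:]]*=[[:blank:]]*[\'"]?'
             . '((?:[a-z]{3,5}://[.a-zA-Z0-9-]+(?::[0-9]+)?)?'
             . '[%/?=&;:,._a-zA-Z0-9 ()~-]*)#i';

    // Group 1 holds the captured link; return null when nothing matches.
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// The link from the top of this post, wrapped in an anchor tag.
$sample = '<a href="http://www.domain.com/dir/0,,contentMDK:20295425'
        . '~pagePK:64156298~piPK:64152276~theSitePK:489784,00.html">x</a>';

echo extract_href($sample), "\n";
```

In this approximation the whole link is captured, colons included. Whether the original eregi() calls behave the same way is untested here, so treat it as a starting point for comparing what the regex captures with what ends up in the request.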