Thread: SPACE IN url
02-06-2004, 08:44 AM   #7
Charter
Hi. There are two functions to edit in robot_functions.php.

First, in phpdigExplore find:
PHP Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\\\\'\"]?((([[a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\\\,._a-zA-Z0-9\\|+-]*))(#[.a-zA-Z0-9-]*)?[\\\\'\" ]?",$eval,$regs)) { 
and replace with:
PHP Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\\\\'\"]?((([[a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\\\,._a-zA-Z0-9\\|+ ()-]*))(#[.a-zA-Z0-9-]*)?[\\\\'\" ]?",$eval,$regs)) { 
Second, in phpdigIndexFile find:
PHP Code:
while (eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\\\\'\"]?(((http://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\\\,._a-zA-Z0-9-]*))[#\\\\'\" ]?)",$line,$regs)) { 
and replace with:
PHP Code:
while (eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\\\\'\"]?(((http://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\\\,._a-zA-Z0-9\\|+ ()-]*))[#\\\\'\" ]?)",$line,$regs)) { 
Now try another reindex. What are the results?

Remember to remove any line wrapping the forum may have added to the code above; each while statement belongs on a single line.
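
To see what the change does: both replacements widen the character class that captures the link path so that it also accepts a space and parentheses (the phpdigIndexFile one also gains | and +). Below is a rough sketch of that effect, not the real phpDig code. The patterns are cut-down stand-ins for the path portion only, the sample link is made up, and it uses preg_match because eregi was removed in PHP 7.
PHP Code:
```php
<?php
// Simplified stand-ins for the path character class only.
// These are NOT the full phpDig patterns from robot_functions.php.
// $before mirrors the original class; $after adds space and parentheses.
$before = '~href\s*=\s*[\'"]?([:%/?=&;\\\\,._a-zA-Z0-9|+-]*)~i';
$after  = '~href\s*=\s*[\'"]?([:%/?=&;\\\\,._a-zA-Z0-9|+ ()-]*)~i';

// Hypothetical link whose file name contains a space and parentheses.
$html = '<a href="docs/my file (v2).html">Manual</a>';

preg_match($before, $html, $m_before);
preg_match($after, $html, $m_after);

echo $m_before[1], "\n"; // capture stops at the first space
echo $m_after[1], "\n";  // capture runs up to the closing quote
```
With the sample link, the first pattern captures only docs/my and the second captures docs/my file (v2).html. One thing to watch for, at least in this cut-down form: because the class now accepts a space, an unquoted link such as href=page.html target=_blank would be captured together with the attribute that follows it.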