not correct link collecting
On my site, links look like this:
/index.php?razdel=about&mach[2]=20
But the spider gets only
/index.php?razdel=about&mach
How can I fix it?
|
There are two regexes in robot_functions.php to edit:
- One
Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;[[:blank:]]*url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?",$eval,$regs)) {
- Two
Code:
while(eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9 ()~-]*))[#\'\" ]?)",$line,$regs)) {
The character classes in question are:
- One
Code:
[:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]
- Two
Code:
[:%/?=&;\\,._a-zA-Z0-9 ()~-]
|
THX, man!
- Two
Code:
[:%/?=&;\\,._a-zA-Z0-9 ()~-]
|
Not working as in it throws an error?
|
no, spider gets only /index.php?razdel=about&mach without [] symbols
|
Okay, I see. The right bracket doesn't like being in a character class (in a POSIX class it is only taken literally when it comes first).
To get PhpDig to accept [ and ] in links, incorporate the following:
Code:
[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]
Code:
($no_brackets\[?$no_brackets\]?$no_brackets)
Code:
([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*)
Code:
([:%/?=&;\\,._a-zA-Z0-9 ()~-]*)
|
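For readers who want to see the effect of the `$no_brackets\[?$no_brackets\]?$no_brackets` idea in isolation, here is a minimal sketch. It is not PhpDig's code: it uses `preg_match` rather than `eregi`, and the value given to `$no_brackets` (the bracket-free class from the thread, reordered, with a trailing `*`) is an assumption made for illustration.

```php
<?php
// Assumption: $no_brackets is the thread's bracket-free character class
// (same characters, reordered) repeated zero or more times.
$no_brackets = '[a-zA-Z0-9:%/?=&;\\\\,._|+ ()~-]*';

$link = '/index.php?razdel=about&mach[2]=20';

// The class on its own stops at the first "[".
preg_match('#^' . $no_brackets . '#', $link, $before);

// Allowing one optional "[" and one optional "]" between runs of ordinary characters.
preg_match('#^(' . $no_brackets . '\[?' . $no_brackets . '\]?' . $no_brackets . ')#', $link, $after);

echo $before[0], "\n"; // /index.php?razdel=about&mach
echo $after[0], "\n";  // /index.php?razdel=about&mach[2]=20
```

The brackets are matched outside the character class, which sidesteps the question of how to escape `]` inside one.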
thx man for excellent support
|
It's working for links like
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20
so what if the links are like
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach[2]=20&mach[2]=20&mach[2]=20
PHP Code:
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach
|
($allowed_link_chars\[?$allowed_link_chars\]?$allowed_link_chars)+
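
The difference the trailing `+` makes can be checked with a small sketch. Again this is only an illustration under assumptions: it uses `preg_match` instead of PhpDig's `eregi` calls, the value of `$allowed_link_chars` is assumed (the bracket-free class from this thread, reordered, with a trailing `*`), and `link_prefix` is a hypothetical helper.

```php
<?php
// Assumption: the bracket-free class from the thread, repeated zero or more times.
$allowed_link_chars = '[a-zA-Z0-9:%/?=&;\\\\,._|+ ()~-]*';

// One group allows at most one "[" and one "]".
$group = '(' . $allowed_link_chars . '\[?' . $allowed_link_chars . '\]?' . $allowed_link_chars . ')';
$single   = '#^' . $group . '#';
$repeated = '#^' . $group . '+#'; // the group may now repeat

// Hypothetical helper: returns the part of $url the pattern accepts from the start.
function link_prefix(string $pattern, string $url): string
{
    return preg_match($pattern, $url, $m) ? $m[0] : '';
}

$url = 'http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach[2]=20';

echo link_prefix($single, $url), "\n";
// http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach
echo link_prefix($repeated, $url), "\n";
// http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach[2]=20
```

Each repetition of the group consumes one more bracketed index, so any number of `mach[...]` parameters is accepted.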
|
I'll make a small example
|
PHP Code:
|
Another problem:
PhpDig gets links from this code:
PHP Code: