#1 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Incorrect link collection
The links on my site look like this:
/index.php?razdel=about&mach[2]=20
But the spider only picks up /index.php?razdel=about&mach
How can I fix this? |
#2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
There are two regexes in robot_functions.php to edit:
- One
Code:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;[[:blank:]]*url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?",$eval,$regs)) {
Code:
while(eregi("<a([^>]*href[[:blank:]]*=[[:blank:]]*[\'\"]?((([a-z]{3,5}://)+(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9 ()~-]*))[#\'\" ]?)",$line,$regs)) {
- One
Code:
[:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]
Code:
[:%/?=&;\\,._a-zA-Z0-9 ()~-]
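To see why the spider stops at `&mach`, here is a minimal sketch in Python's `re` module (not PHP's `eregi`, so treat it as an illustration of the character class only). The class is copied from the second regex with a `*` after it; the names `ORIGINAL_CLASS` and `collect` are made up for this example and are not part of PhpDig.

```python
import re

# Character class from the second regex in robot_functions.php, followed by "*".
# It lists letters, digits and some punctuation, but neither "[" nor "]".
ORIGINAL_CLASS = r"[:%/?=&;\\,._a-zA-Z0-9 ()~-]*"


def collect(url: str) -> str:
    """Return the leading part of url made up only of characters in the class."""
    # The "*" quantifier always matches (possibly an empty string),
    # so re.match never returns None here.
    return re.match(ORIGINAL_CLASS, url).group(0)


link = "/index.php?razdel=about&mach[2]=20"
print(collect(link))  # /index.php?razdel=about&mach
```

The match ends at the first `[`, because that character is not in the class, which is the truncated link reported in the first post.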
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
#3 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Thanks, man!
- TWO
Code:
[:%/?=&;\\,._a-zA-Z0-9 ()~-]
PHP Code:
PHP Code:
PHP Code:
Last edited by zaartix; 01-13-2005 at 11:48 PM. |
#4 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
By "not working", do you mean it throws an error?
#5 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
No, the spider still gets only /index.php?razdel=about&mach, without the [ and ] characters.
|
#6 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Okay, I see. The right bracket doesn't like being in a character class.
To get PhpDig to accept [ and ] in links, incorporate the following:
PHP Code:
Code:
[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]
PHP Code:
Code:
($no_brackets\[?$no_brackets\]?$no_brackets)
Code:
([:%/?=&;\\,._a-zA-Z0-9\|+ ()~-]*)
Code:
([:%/?=&;\\,._a-zA-Z0-9 ()~-]*)
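A rough sketch of how the `($no_brackets\[?$no_brackets\]?$no_brackets)` sub-pattern behaves, written with Python's `re` rather than PHP's `eregi`. It assumes `$no_brackets` holds the bracket-free character class followed by `*`; the post does not show that definition, so this is a guess. `NO_BRACKETS`, `ONE_PAIR` and `collect_one_pair` are illustrative names only.

```python
import re

# Assumption: $no_brackets is the bracket-free class followed by "*".
NO_BRACKETS = r"[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]*"

# Brackets are matched outside the class: an optional "[" and an optional "]",
# each surrounded by runs of ordinary link characters.
ONE_PAIR = re.compile(
    "(" + NO_BRACKETS + r"\[?" + NO_BRACKETS + r"\]?" + NO_BRACKETS + ")"
)


def collect_one_pair(url: str) -> str:
    """Return the leading part of url accepted by the sub-pattern."""
    # Every piece of the pattern may match nothing, so a match always exists.
    return ONE_PAIR.match(url).group(1)


print(collect_one_pair("/index.php?razdel=about&mach[2]=20"))
# /index.php?razdel=about&mach[2]=20
```

Keeping `[` and `]` out of the class and matching them as separate optional characters sidesteps the question of how to place a right bracket inside a bracket expression.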
#7 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Thanks, man, for the excellent support.
|
#8 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
It works for links like
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20
But what if the links look like this?
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach[2]=20&mach[2]=20&mach[2]=20
PHP Code:
http://www.domain.com/dir/index.php?razdel=about&mach[2]=20&mach
Last edited by zaartix; 01-18-2005 at 08:15 PM. |
#9 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
($allowed_link_chars\[?$allowed_link_chars\]?$allowed_link_chars)+
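The effect of the trailing `+` can be sketched as follows, again in Python's `re` rather than PHP's `eregi`. It assumes `$allowed_link_chars` is the bracket-free class followed by `*`, which is not shown in the thread; `ALLOWED`, `UNIT`, `ONCE` and `REPEATED` are names invented for this example.

```python
import re

# Assumption: $allowed_link_chars is the bracket-free class followed by "*".
ALLOWED = r"[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]*"

# One unit: link characters around at most one "[" and one "]".
UNIT = ALLOWED + r"\[?" + ALLOWED + r"\]?" + ALLOWED

ONCE = re.compile("(?:" + UNIT + ")")       # the unit applied a single time
REPEATED = re.compile("(?:" + UNIT + ")+")  # the unit with the trailing "+"

link = "/dir/index.php?razdel=about&mach[2]=20&mach[3]=30"

print(ONCE.match(link).group(0))
# /dir/index.php?razdel=about&mach[2]=20&mach
print(REPEATED.match(link).group(0))
# /dir/index.php?razdel=about&mach[2]=20&mach[3]=30
```

Without the `+`, the match stops at the second `[`, which is the truncation described in post #8. With it, each further bracket pair is consumed by another pass through the group.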
#10 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Please delete this post.
Last edited by zaartix; 01-19-2005 at 02:16 AM. |
#11 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Please delete this post.
Last edited by zaartix; 01-19-2005 at 02:16 AM. |
#12 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
I'll make a small example.
|
#13 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
PHP Code:
|
#14 |
Orange Mole
Join Date: May 2004
Location: russia, samara
Posts: 56
|
Another problem:
PhpDig picks up links from this code:
PHP Code:
PHP Code:
Last edited by zaartix; 01-19-2005 at 02:21 AM. |
#15 |
Head Mole
Join Date: May 2003
Posts: 2,539
|