Quote:
It works only if the link contains just one pair of [] :( |
The first regexp isn't needed because the site doesn't have frames.
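For readers without the earlier posts, here is a rough sketch of a link-extraction pattern that does not care how many [ ] pairs a link contains. It is only an illustration in Python, not the regexp discussed in this thread and not PhpDig's own code; `HREF_RE` and `extract_links` are made-up names.

```python
import re

# Hypothetical pattern (not PhpDig's regexp): the href value is everything up
# to the matching closing quote, so "[" and "]" may appear any number of times.
HREF_RE = re.compile(
    r'<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

def extract_links(html):
    """Return the quoted href values of all <a> tags in the given HTML."""
    return [match.group(2) for match in HREF_RE.finditer(html)]

sample = (
    '<a href="index.php-razdel=about&mach[2]=news&mach[3]=79.htm">news</a> '
    "<A class=\"n\" HREF='index.htm'>home</A>"
)
links = extract_links(sample)
```

Because the value is matched lazily up to the closing quote, the pattern never has to count brackets at all.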
|
>> It works only if the link contains just one pair of []
So it works in the example but not with PhpDig? What's a link to a page containing multiple [ ] in its links?
>> The first regexp isn't needed because the site doesn't have frames
Other people might have frames though. ;)
RFC 2732 states in part:
Quote:
You might want to consider encoding your URIs according to this rather than using literal square brackets in your links. |
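As a rough illustration of the encoding suggestion above, the sketch below percent-encodes square brackets in a URL's path and query using Python's standard library. The helper name `encode_brackets` is made up for this example, and the set of characters left unescaped is an assumption chosen to suit the links in this thread.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def encode_brackets(url):
    """Percent-encode characters such as "[" and "]" in the path and query."""
    parts = urlsplit(url)
    # "%" stays in the safe set so already-encoded URLs are not encoded twice.
    path = quote(parts.path, safe="/=&%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

encoded = encode_brackets(
    "http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm"
)
```

Here "[" becomes %5B and "]" becomes %5D, so the link no longer carries literal square brackets.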
>> So it works in the example but not with PhpDig? What's a link to a page containing multiple [ ] in its links?
Yep. Just try to dig this page: http://zaartix.ru/krit
Sorry for the Russian on that page |
That page contains tons of links to 404 pages.
|
They all lead to 404 :)
So PhpDig does not extract all the links from the main page |
I haven't uploaded the other pages, only the one page.
What are the other pages needed for? If PhpDig finds all the links that are on that page and all of them are correct, then the extracting regexp is working right. Is that so? |
PhpDig tests links, and if PhpDig gets a 404 from a link, then PhpDig does not index that link. The + works in example, so maybe try setting up an online demo with a few links.
|
So when PhpDig parses a page, it tries to open each link? On the first step? I thought PhpDig extracts all the links and puts them in the tempspider table, and then at the next step tries to open each of them.
Am I wrong? |
Nope, that is not how it works. PhpDig does not insert server response 404s in the tempspider table. With all the links currently returning 404s, the only thing inserted into the tempspider table is the zaartix.ru/krit/ page.
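To make the order of operations concrete, here is a minimal sketch of "test first, queue second". It is not PhpDig's actual code: `queue_links`, `get_status` and the stubbed status table are hypothetical, and the real spider may check more than the 404 status.

```python
def queue_links(links, get_status):
    """Return only the links that should go into the temporary spider table.

    get_status is any callable mapping a URL to an HTTP status code; it is
    passed in so the sketch can run without network access.
    """
    tempspider = []
    for url in links:
        if get_status(url) == 404:
            continue  # a link answering 404 is never queued
        tempspider.append(url)
    return tempspider

# Stubbed responses standing in for real HTTP requests (illustrative values).
fake_status = {
    "http://zaartix.ru/krit/": 200,
    "http://zaartix.ru/krit/missing.htm": 404,
}
queued = queue_links(list(fake_status), fake_status.get)
```

With every extracted link answering 404, `queued` would hold only the start page, which matches the behaviour described above.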
|
You can try to dig http://zaartix.ru/krit now.
Please help me solve this problem. |
There are no regular links with more than one set of [ ] square brackets in them. :confused:
|
There are many levels of pages. Just try to dig all the available pages; there are many different types of links :)
http://zaartix.ru/krit |
Here's a one-page test...
Spider: http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm
Results:
Spidering in progress... [Stop spider]
SITE : http://zaartix.ru/
Exclude paths :
- @NONE@
1:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm (time : 00:00:09)
No link in temporary table
links found : 1
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm
Optimizing tables...
Indexing complete !
[Back] to admin interface. |
Here's a multi-page test...
Spider: http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news.htm
Results:
Spidering in progress... [Stop spider]
SITE : http://zaartix.ru/
Exclude paths :
- @NONE@
1:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news.htm (time : 00:00:10)
+ + + + + + + + + + + + + + + + + + + + + +
level 1...
2:http://zaartix.ru/krit/index.php-razdel=price&mach[2]=23.htm (time : 00:00:34)
3:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=24.htm (time : 00:00:46)
4:http://zaartix.ru/krit/index.php-razdel=price.htm (time : 00:01:04)
5:http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=34.htm (time : 00:01:13)
6:http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=19.htm (time : 00:01:23)
Duplicate of an existing document
7:http://zaartix.ru/krit/index.php-razdel=price&mach[2]=view.htm (time : 00:01:40)
8:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=22.htm (time : 00:01:50)
9:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=21.htm (time : 00:01:59)
10:http://zaartix.ru/krit/index.htm (time : 00:02:08)
11:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=20.htm (time : 00:02:17)
12:http://zaartix.ru/krit/index.php-razdel=price&mach[2]=ost.htm (time : 00:02:25)
13:http://zaartix.ru/krit/index.php-razdel=price&mach[2]=tech.htm (time : 00:02:34)
14:http://zaartix.ru/krit/index.php-razdel=price&mach[2]=sert.htm (time : 00:02:43)
15:http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=27.htm (time : 00:02:51)
16:http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=32.htm (time : 00:03:00)
17:http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=33.htm (time : 00:03:09)
18:http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=16.htm (time : 00:03:17)
19:http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=17.htm (time : 00:03:26)
20:http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=vacancies.htm (time : 00:03:35)
21:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm (time : 00:03:43)
22:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=78.htm (time : 00:03:51)
23:http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=2.htm (time : 00:04:01)
No link in temporary table
links found : 23
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news.htm
http://zaartix.ru/krit/index.php-razdel=price&mach[2]=23.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=24.htm
http://zaartix.ru/krit/index.php-razdel=price.htm
http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=34.htm
http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=19.htm
http://zaartix.ru/krit/index.php-razdel=price&mach[2]=view.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=22.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=21.htm
http://zaartix.ru/krit/index.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=20.htm
http://zaartix.ru/krit/index.php-razdel=price&mach[2]=ost.htm
http://zaartix.ru/krit/index.php-razdel=price&mach[2]=tech.htm
http://zaartix.ru/krit/index.php-razdel=price&mach[2]=sert.htm
http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=27.htm
http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=32.htm
http://zaartix.ru/krit/index.php-razdel=quality&mach[2]=33.htm
http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=16.htm
http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=17.htm
http://zaartix.ru/krit/index.php-razdel=contact&mach[2]=vacancies.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=79.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=78.htm
http://zaartix.ru/krit/index.php-razdel=about&mach[2]=news&mach[3]=2.htm
Optimizing tables...
Indexing complete !
[Back] to admin interface. |