PhpDig.net


JÿGius³ 12-16-2003 07:28 AM

pages number limited indexing
 
Hi people.

When I index a web site, I'd like to limit the maximum number of pages indexed per site.
For example, I would index only 20 pages on site A, 100 on site B, and so on.
This could be useful for limiting the indexing of huge web sites. Do you agree?

Best regards.

JÿGius³:)

Rolandks 12-16-2003 09:56 AM

Sorry, I don't agree with this. What for? If a user searches for a word that is on page 21, that page isn't indexed.

Why would you index only part of a site, limited by number of pages?

-Roland-

Charter 12-16-2003 10:14 AM

Hi. I haven't tested the below but what it should do is limit the number of links found per page to a max of 20, where each indexed page will only have a max of 20 links to follow. This is a per page rather than per site adjustment, so if you want to have a max of 100 links for one site, then you'll need to adjust the below added line and/or set a different search depth level.

In spider.php find the following:
PHP Code:

if (isset($urls) && is_array($urls)) { 

and right after it place the following:
PHP Code:

$my_spider_limit = 20;
if (count($urls) > $my_spider_limit) {
    $urls = array_slice($urls, 0, $my_spider_limit);
}


You might be able to achieve similar results without making the above change by setting the search depth level to one. When the search depth level is one, only the page and links from that page are indexed. Of course this depends on how many links are in the page, so if you use the above code, you should be able to limit the links found on any given page to the first $my_spider_limit links.
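
As a standalone illustration of the truncation step above, here is a minimal sketch. The function name limit_links and the sample URLs are hypothetical and not part of PhpDig; only count() and array_slice() are standard PHP.

```php
<?php
// Hypothetical helper, not part of PhpDig: keep only the first $max links
// found on a page, mirroring the array_slice() change described above.
function limit_links(array $urls, int $max): array
{
    if (count($urls) > $max) {
        // array_slice(array, offset, length) returns the first $max entries.
        $urls = array_slice($urls, 0, $max);
    }
    return $urls;
}

// Made-up sample data: 25 links discovered on one page.
$found = array();
for ($i = 1; $i <= 25; $i++) {
    $found[] = "page$i.html";
}

$kept = limit_links($found, 20);
echo count($kept), "\n"; // 20
```

Because the slice keeps the first entries in the order they were found, links that appear later in a page are the ones dropped.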

bloodjelly 01-08-2004 05:56 PM

Another thing that seems to be working for me, and that limits the total number of linked pages written to the database, is to find the line:
PHP Code:

while($level <= $limit) {

(line 250) and change it to
PHP Code:

while($level <= $limit && count($links_found) <= 200) {

where 200 is the number of links you want written. The site might stay locked this way, though, and you'd need to move the unlock code (line 588).

bloodjelly 01-13-2004 11:17 AM

Whoops, this works for finding 200 links in total, but if you want 200 links per site you have to reset the $links_found array. So, after this:
PHP Code:

if (!$n_links && $delay_message) {
    print $delay_message;
}

add this:
PHP Code:

unset($links_found); 
$links_found = array(); 

Also, the site doesn't stay locked; that was just a glitch in my particular case. And the <= should be changed to a <, or you'll get 201 pages found.
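
To make the off-by-one and the per-site reset concrete, here is a rough simulation. It is not PhpDig code: collect_links, the site names, and the small cap of 3 are made up for illustration, and it assumes each pass of the loop adds exactly one link (in the real spider a pass may add several, so the overshoot could differ).

```php
<?php
// Hypothetical simulation, not PhpDig code: count how many links a capped
// loop collects when each pass adds exactly one link (an assumption).
function collect_links(int $cap, bool $inclusive): int
{
    $links_found = array();
    while ($inclusive ? count($links_found) <= $cap : count($links_found) < $cap) {
        $links_found[] = 'link' . count($links_found);
    }
    return count($links_found);
}

echo collect_links(200, true), "\n";  // 201: "<=" lets one extra pass run
echo collect_links(200, false), "\n"; // 200: "<" stops at the cap

// Resetting the array before each site makes the cap apply per site
// rather than to the running total across all sites.
$totals = array();
foreach (array('siteA', 'siteB') as $site) {
    $links_found = array();           // per-site reset, as in the post above
    while (count($links_found) < 3) { // small cap for illustration
        $links_found[] = "$site/page" . count($links_found);
    }
    $totals[$site] = count($links_found);
}
print_r($totals); // each site collects 3 links
```

Without the reset inside the foreach, the second site would start with the array already at the cap and collect nothing.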


All times are GMT -8.