"Update One Page" Updating The Whole Database


vinyl-junkie
08-08-2004, 09:11 AM
I made some changes to one page on my website today, so I went through the procedure to re-index just the one page.

From the admin page, I clicked on the website link, then Update Form. From there I clicked the right arrow to get a list of the files indexed for that site. I found the file I wanted to update and clicked the green check mark beside it, which should re-index just that one page. Instead, the spider is currently crawling the whole site, not just the one page! :(

Charter
08-08-2004, 11:04 AM
Hi. I'm not sure whether this qualifies as a bug or just unexpected behavior.

If you change one page, should the spider just index that one page but not follow links, even if the links changed? I guess I can see it both ways. Maybe it should only reindex the one page regardless of links.

Anyway, if you want to crawl just one page, go to the main page of the admin panel, type the URI in the text box, and set search depth to zero and links per to one.

If search depth and/or links per are not set, as happens in the admin panel subpages or from the shell, then the following constants are considered:

define('SPIDER_MAX_LIMIT',20);
define('RESPIDER_LIMIT',5);
define('LINKS_MAX_LIMIT',20);
define('RELINKS_LIMIT',5);

If there is no 'Links' value set for the site on the 'Update sites' page of the admin control panel, SPIDER_MAX_LIMIT, RESPIDER_LIMIT, LINKS_MAX_LIMIT, and RELINKS_LIMIT could be set to zero, zero, one, and one respectively to crawl only one page.

If there is a 'Links' value set for the site on the 'Update sites' page of the admin control panel, and that value is one, the same four constants could be set to zero, zero, whatever, and whatever respectively to crawl only one page, since the two links limits no longer matter in that case.
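As a rough illustration of the first case (no 'Links' value set for the site), the four defines in config.php might end up looking like the fragment below. The values are simply the "zero, zero, one, one" from above mapped onto the constants in the order they were listed; the comments are only a reading of the constant names, not documented behavior.

```php
// Sketch only: values follow the "zero, zero, one, one" suggestion above.
// Comments are an interpretation of the constant names (an assumption).
define('SPIDER_MAX_LIMIT',0);   // depth limit: do not go below the starting page
define('RESPIDER_LIMIT',0);     // depth limit presumably applied on re-index
define('LINKS_MAX_LIMIT',1);    // links-per limit: one
define('RELINKS_LIMIT',1);      // links-per limit presumably applied on re-index
```

In the second case, where 'Links' is already set to one for the site, only the first two defines would need to be zero and the last two could stay at whatever they were.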

If you want to stop the spider in a nice way, open a new browser window and keep clicking the delete button, without selecting a site, until the spider stops. Once the tempspider table stays empty, the spider should stop.

vinyl-junkie
08-08-2004, 01:23 PM
The first thing you suggested, setting search depth to zero and links-per to one, did just what I wanted it to do: index just the one page without trying to crawl anything else.

Always wanting to play devil's advocate (and maybe this doesn't work exactly the same way), I changed the config.php file as you suggested, then went through my original process to get to the one page to index. That still tried to crawl the whole site, so I don't think that would work. I don't have shell access so I can't try it that way.

A somewhat related question though - When I click the right arrow to get the list of indexed pages, will clicking the red "x" beside one of those pages exclude it from future indexing?

Incidentally, the reason I posted this whole thing under the Bug Tracker sub-forum is that under 1.8.0 (if I'm remembering the correct version number of phpdig), you didn't have to do anything special to index just one page. Maybe the reason for the change is because we have a links-per selection now, where 1.8.0 didn't have that?

Charter
08-08-2004, 04:02 PM
>> Always wanting to play devil's advocate (and maybe this doesn't work exactly the same way), I changed the config.php file as you suggested, then...

Hi, you need to make sure 'Links' is set to one for that site on the 'Update sites' page of the admin panel.

>> When I click the right arrow to get the list of indexed pages, will clicking the red "x" beside one of those pages exclude it from future indexing?

No, it just deletes it.

>> Maybe the reason for the change is because we have a links-per selection now, where 1.8.0 didn't have that?

Yes, things got changed.