Old 03-09-2004, 02:18 PM   #2
Charter
Head Mole
Join Date: May 2003
Posts: 2,539
Hi airplay, and welcome to PhpDig.net!

With many pages, consider setting the following in the config.php file, where X is 1 or 2:
PHP Code:
define('LIMIT_DAYS',0);              // default days before reindexing a page
define('SPIDER_MAX_LIMIT',X);        // max recurse levels in the spider
define('SPIDER_DEFAULT_LIMIT',X);    // default recurse value
define('RESPIDER_LIMIT',X);          // recurse limit for updates
and then crawl your site in chunks.

One thing I've noticed is that users in general tend to set the search depth to the highest possible value and then let the robot run. This tends to fetch a lot of repeat documents, leading to a longer index time.

Also, when you want to start over, it might be better to delete the site from the admin panel, as this will empty the tables (except for keywords and logs) and delete the TXT files. The clean dictionary link will clean/empty the keywords table, but it is probably faster to do it from shell, and the logs tables would need to be emptied from shell or phpMyAdmin.
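For the shell route, something like the following should work from the mysql client. This is only a sketch: the exact table names depend on your PhpDig version and any table prefix set in config.php, so `keywords` and `logs` below are assumptions — run SHOW TABLES first and adjust:

```sql
-- Connect with: mysql -u youruser -p yourphpdigdb
-- Table names are assumptions; verify with SHOW TABLES before running.
TRUNCATE TABLE keywords;   -- empty the keywords table (faster than the clean dictionary link)
TRUNCATE TABLE logs;       -- empty the logs table
```

TRUNCATE is generally faster than DELETE FROM for emptying a whole table, though DELETE FROM works too if your MySQL user lacks the DROP privilege that TRUNCATE may require.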