catchme
04-14-2004, 12:07 PM
Greetings Dig Board Members.
I've just started working with dig. Overall, I am happy to find such a fine search engine tool available as open source for PHP.
I run a couple of larger sites, ranging from 8,000 to 80,000 pages of content, that I would like to index with a search engine. These sites will add about 20 new pages of content per day.
I've noticed that, while it is possible to index these pages with dig, it can be a slow process at times, and a load-intensive one as well.
What I want to accomplish is to make incremental builds of the dig database.
First, I will build the index for the existing sites. Afterwards, I would like to index only the new files that are added to the site, perhaps every few hours.
Can someone suggest a procedure for indexing only the files that were recently added to the site?
My thought is to write a script that collects the URIs of the new pages into a file, and then feed that file to spider.php when I run it via cron every few hours.
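To make the idea concrete, here is a rough sketch of the collection step only. It assumes the pages live as files under a document root and that file modification time is a good enough signal for "new" (it will also pick up edited pages). The function names, the state file, the file extensions and the example paths are all hypothetical; how the resulting list is actually handed to spider.php depends on the dig setup and is not shown.

```python
import os
import time


def collect_new_urls(doc_root, base_url, since, extensions=(".html", ".htm", ".php")):
    """Return sorted URLs for files under doc_root modified after `since` (a Unix timestamp)."""
    urls = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            # Skip images, stylesheets and anything else that is not a page.
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since:
                rel = os.path.relpath(path, doc_root).replace(os.sep, "/")
                urls.append(base_url.rstrip("/") + "/" + rel)
    return sorted(urls)


def read_last_run(state_file):
    """Return the timestamp saved by the previous run, or 0.0 if there is none."""
    try:
        with open(state_file) as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return 0.0


def run(doc_root, base_url, state_file, out_file):
    """Write one URL per line to out_file, then remember when this run started."""
    started = time.time()
    urls = collect_new_urls(doc_root, base_url, read_last_run(state_file))
    with open(out_file, "w") as f:
        f.write("\n".join(urls) + ("\n" if urls else ""))
    with open(state_file, "w") as f:
        f.write(repr(started))
    return urls


# Example call with made-up paths, e.g. from a cron job every few hours:
# run("/var/www/site", "http://www.example.com", "/tmp/last_run.txt", "/tmp/new_urls.txt")
```

On the very first run there is no saved timestamp, so every page is listed, which matches the initial full build. Later runs list only files touched since the previous run started.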
Is this a common procedure for using Dig?
Thanks!
Danny