
View Full Version : speed of search and filter out double results


marb
03-25-2004, 11:23 PM
Hi,
I'm running 3 copies of PHPDIG 180 in different dirs.
All of them use the same DB, which works well.
I can spider more than one URL at the same time, which also works well.
If I run a search while spidering, the search is sometimes very slow
and my server slows down, depending on the results the query returns. For example:
Results 1-10, 1475 total, on "vermeer" (25.41 seconds)

Sometimes I also get a lot of duplicate results for the same URL.

I run "clean index" regularly, but I'm a bit wary of running "clean directory".
I don't know if I would lose results that I want to keep.

How can I speed this up a bit?
And how can I filter out the duplicate results?


Marten :)

Charter
03-29-2004, 11:38 AM
Hi. Searching while spidering can be slow, because spidering is resource-intensive, especially with three instances running. Perhaps this (http://www.phpdig.net/showthread.php?threadid=369) thread might help. As for the double results, PhpDig doesn't specifically account for multithreading (http://www.google.com/search?q=what+is+multithreading) issues, so spiders running at the same time may pick up the same link. Perhaps try adding a column to the tempspider table indicating whether a link has already been grabbed, and modify the code accordingly.
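
A minimal sketch of that "grabbed" column idea, in case it helps. This is not PhpDig code: it is Python with an in-memory SQLite database standing in for the real one, and the table layout here (id, link, grabbed) plus the helper names make_demo_db and claim_next_link are assumptions made up for illustration. Only the table name tempspider comes from the suggestion above. The point is that each spider claims a link with a conditional UPDATE and only processes it if the UPDATE actually changed a row.

```python
import sqlite3


def make_demo_db(links):
    """Build a stand-in tempspider table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tempspider ("
        " id INTEGER PRIMARY KEY,"
        " link TEXT NOT NULL,"
        " grabbed INTEGER NOT NULL DEFAULT 0)"  # 0 = free, else the spider's id
    )
    conn.executemany(
        "INSERT INTO tempspider (link) VALUES (?)", [(link,) for link in links]
    )
    conn.commit()
    return conn


def claim_next_link(conn, spider_id):
    """Return the next unclaimed link for this spider, or None if none are left.

    spider_id must be a non-zero integer, since 0 marks a free row.
    """
    while True:
        row = conn.execute(
            "SELECT id, link FROM tempspider WHERE grabbed = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        link_id, link = row
        # The "AND grabbed = 0" guard makes the claim conditional: if another
        # spider marked the row between our SELECT and UPDATE, no row changes.
        cur = conn.execute(
            "UPDATE tempspider SET grabbed = ? WHERE id = ? AND grabbed = 0",
            (spider_id, link_id),
        )
        conn.commit()
        if cur.rowcount == 1:
            return link
        # Another spider won the race for this row; try the next free one.


if __name__ == "__main__":
    conn = make_demo_db(["http://example.com/a", "http://example.com/b"])
    print(claim_next_link(conn, 1))
    print(claim_next_link(conn, 2))
    print(claim_next_link(conn, 1))
```

Run as a script, two spiders get two different links and the third call returns None because nothing is left to claim. The same conditional UPDATE pattern should carry over to MySQL, but the actual column names and the places in the spider code that read tempspider would have to be checked against your PhpDig install.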