PhpDig.net

frak 01-11-2005 02:19 AM

hello / directories / phpdig & others
 
Hi,
I've been a fan of phpDig for a long time now. About a year ago I installed it to test on a small scale.

I've got to the point where I would like to build a large search engine. I'm a bit concerned by a few people talking about effective size limits of 35k-70k indexed pages (slow search performance).

I would be looking at an index larger than that. Is this something that phpDig can handle quickly enough? (i.e. results that are not perceived as instant would not be good enough)

Or am I better off going with something like mnoGoSearch? (Depending on the answer, I will be setting up an indexer here on my dev server this week to give it a good test thrashing.)

I am looking at doing something interesting with whatever I end up going with - I'll post details on exactly what later...

Also - can somebody recommend directory software (à la ODP)?

Cheers,
Mathew

Charter 01-11-2005 04:59 AM

As I have not tried mnoGoSearch, I cannot give you any comparison information. If you wish to make a large scale search engine, then you should consider that you'll probably need a cluster of servers to process requests. Also, you'll probably want to run precompiled code rather than parse code on each run, utilize a caching system, send compressed output, etcetera. Having a server and a script is not enough to go large scale. As for a script directory, there used to be something called "PHP Script Index" but I'm not sure if it's still available.
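
To make the caching and compressed-output points a bit more concrete, here is a minimal sketch in Python rather than PHP. It is not PhpDig code and not how any particular engine does it; the names QueryCache, normalize, search, and compress_page, the 300-second lifetime, and the backend callable are all made up for illustration.

```python
import gzip
import time


def normalize(query):
    """Lowercase and collapse whitespace so equivalent queries share a cache key."""
    return " ".join(query.lower().split())


class QueryCache:
    """Tiny in-memory result cache with a time-to-live (hypothetical helper)."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._store = {}  # normalized query -> (expiry time, results)

    def get(self, query, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(normalize(query))
        if entry is None or entry[0] <= now:
            return None  # never cached, or expired
        return entry[1]

    def put(self, query, results, now=None):
        now = time.monotonic() if now is None else now
        self._store[normalize(query)] = (now + self.ttl, results)


def search(query, cache, backend):
    """Return cached results if still fresh; otherwise ask the (slow) backend."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    results = backend(query)  # assumed: any callable that runs the real search
    cache.put(query, results)
    return results


def compress_page(html):
    """Gzip the rendered page; a real server would also send Content-Encoding: gzip."""
    return gzip.compress(html.encode("utf-8"))
```

With something like this in front of the search script, repeated queries skip the database until their entry expires, and the result pages go out smaller. Across a cluster you would likely want a shared cache rather than a per-process dictionary.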

frak 01-12-2005 10:38 PM

thanks - one more question
 
Charter,
You have confirmed what I suspected from my own research. Pity.

It would seem that the best performance - a few million pages indexed at <2 sec - is achieved with DataPark, followed by mnoGoSearch.

I do have a question though - it seems like a lot of the "grunt" work for search engines is done by scripts/binaries outside of the DB, rather than by the database server? I had thought that the DB would do the hard work.

Why is that?

Cheers,
Mathew

Charter 01-12-2005 11:18 PM

Maybe this thread can answer your DB question, at least WRT PhpDig.


All times are GMT -8.