Hi All,
I've been up and down these forums, and I am a bit confused (nothing new!)
I want to create a search engine for a niche market. There may be several thousand sites in this niche.
1) I want to start by listing a few big ones...
2) then let people come and request to be indexed
(the process would include a screen where they list their URL; after approval, they would get indexed on some sort of schedule)
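To make that workflow concrete, here is a minimal sketch of a submission queue with admin approval. This is just an illustration of the idea, not PhpDig's actual schema or code — the table name, column names, and functions are all hypothetical, and SQLite stands in for MySQL to keep the example self-contained:

```python
import sqlite3

# Hypothetical submission queue: a site owner submits a URL, an admin
# approves it, and approved URLs get picked up on the crawl schedule.
# Table and column names are illustrative, not PhpDig's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE submissions (
        id     INTEGER PRIMARY KEY,
        url    TEXT NOT NULL UNIQUE,
        status TEXT NOT NULL DEFAULT 'pending'  -- 'pending' or 'approved'
    )
""")

def submit(url):
    """Record a site owner's indexing request as pending."""
    conn.execute("INSERT INTO submissions (url) VALUES (?)", (url,))

def approve(url):
    """Admin approval: mark the URL ready for the next crawl run."""
    conn.execute("UPDATE submissions SET status = 'approved' WHERE url = ?",
                 (url,))

def crawl_queue():
    """URLs the scheduled crawler should pick up."""
    rows = conn.execute("SELECT url FROM submissions WHERE status = 'approved'")
    return [r[0] for r in rows]

submit("http://example.com")
submit("http://example.org")
approve("http://example.com")
print(crawl_queue())  # only the approved URL appears
```

The point of the 'pending' status is that nothing reaches the crawler until a human has looked at it, which matches the approval step described above.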
So some questions:
a) How is the data stored?
- Does PhpDig store the URL, title, description, and keywords of the crawled pages in the MySQL database?
- Where is the actual indexed content of the pages stored?
b) How much storage is needed?
- i.e. if we have 1000 sites with 15 pages each... a total of 15,000 pages, how much storage would be needed?
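For what it's worth, here is a back-of-envelope estimate for that 1,000-site / 15-page example. The per-page figures (metadata size, average indexed words per page, bytes per index row) are pure assumptions for illustration, not PhpDig measurements — real numbers would depend on the sites and on PhpDig's table layout:

```python
# Rough storage estimate for the 1,000 sites x 15 pages example.
# Every per-page figure below is an assumption, not a PhpDig measurement.
sites = 1000
pages_per_site = 15
pages = sites * pages_per_site            # 15,000 pages total

meta_bytes = 1024          # assumed: URL + title + description per page
words_per_page = 500       # assumed: average distinct indexed words per page
bytes_per_index_row = 50   # assumed: keyword + page id + weight per row

per_page_bytes = meta_bytes + words_per_page * bytes_per_index_row
total_mb = pages * per_page_bytes / (1024 ** 2)
print(f"{pages} pages -> roughly {total_mb:.0f} MB")
```

Under these assumptions the index lands in the hundreds-of-megabytes range, so the takeaway is mainly that the keyword rows, not the page metadata, dominate the storage.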
c) How quick is the code?
- Using the above example (15,000 pages), how long would a 2 word search take?
d) And most importantly, has someone put together a modded version for this kind of application?
thanks,
Sam