Quote:
1°) I am thinking of indexing about 2500 sites. Do you think that is realistic, or is it not possible with PhpDig?
|
I have not had 2500 sites indexed at one time, but check out this thread for some numbers.
Quote:
2°) If it is possible, how many days would you wait before reindexing a site? I would use 30 days; and you?
|
For the online demo, I leave LIMIT_DAYS at zero, but for a 'real' site I think 30 days is fine. As the number of sites grows, you'll of course want to consider what and when to index.
Quote:
3°) To get something realistic, what do you think of these settings:
Number of depth levels: 20
Links per depth level: 50 (too many or too few?)
I tried unlimited links, but indexing took too long.
|
The maximum pages found per site is ((depth * links) + 1) when links is greater than zero, so just think about how many pages per site you would like to find, and then set depth and links accordingly.
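As a rough illustration of that formula, here is a small Python sketch (not part of PhpDig itself). The function name is made up for this example, and treating a links value of zero as "no fixed upper bound" is an assumption based on the "when links is greater than zero" condition above.

```python
def max_pages_per_site(depth, links):
    """Upper bound on pages found per site, using (depth * links) + 1.

    Assumption: links <= 0 is treated as "no fixed bound", so None is returned.
    """
    if links <= 0:
        return None  # the formula only applies when links is greater than zero
    return depth * links + 1


# The settings asked about in the question: depth 20, 50 links per level.
print(max_pages_per_site(20, 50))  # 1001
```

So with depth 20 and 50 links, each site could contribute up to 1001 pages, which adds up quickly across a couple of thousand sites.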
Quote:
4°) A problem I can't find the answer to: while it is spidering, can I add a new link, or do I have to wait until it stops crawling?
|
It would be better to wait until the crawling is complete, as PhpDig locks when indexing to let you know it is busy.
Quote:
5°) Is it possible to run more than one spider with the shell command? What do I have to do?
|
Having more than one spider at a time would still use the same tables and slow the process down, but there is a thread here about multiple spiders.
Quote:
6°) I have a big problem when it is spidering forums: it keeps finding 100 links already indexed, then one new link, then another 100 already indexed, then one new link, and so on. What can I do about that? It wastes a lot of time.
|
Does 'duplicate of an existing document' appear onscreen? If so, use PHPDIG_SESSID_VAR in the config file, especially for links that contain session IDs.
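To show why session IDs cause this, here is a small Python sketch of the general idea, not PhpDig's actual code: the same forum page reached with two different session IDs looks like two different links until the session parameter is stripped. The function name and the parameter names in SESSION_PARAMS are hypothetical examples; use whatever your forum really puts in its URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names, chosen only for illustration.
SESSION_PARAMS = {"PHPSESSID", "sid", "s"}


def strip_session_id(url, session_params=SESSION_PARAMS):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in session_params
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Two links that differ only in session ID collapse to the same URL.
a = strip_session_id("http://example.com/viewtopic.php?t=5&PHPSESSID=abc123")
b = strip_session_id("http://example.com/viewtopic.php?t=5&PHPSESSID=zzz999")
print(a == b)  # True
```

Without that normalization, every visit hands the spider a fresh set of "new" links that all turn out to be pages it has already indexed.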
Quote:
7°) When using the shell command, does the shell spider use all of the settings in the config file?
|
All of the index-related settings in the config file are used when indexing from shell, except for RESPIDER_LIMIT and RELINKS_LIMIT and maybe a couple of others.
BTW, your English is fine.