PhpDig.net

10-30-2005, 04:48 PM   #1
noel
Information on customizing the spider

Hello Charter,

I would like to know your opinion on these questions:


1°) I plan to index roughly 2500 sites. Do you think that is realistic, or is it not possible with PhpDig?

2°) If it is possible, how many days would you set before a site is reindexed? I would use 30 days. And you?

3°) To keep things realistic, what do you think of these settings?

Search depth (number of levels): 20
Links per depth level: 50. Is that too many or too few?
I tried unlimited links, but indexing took too long.

4°) One problem I can't find an answer to: while the spider is crawling, can I add a new link, or do I have to wait until it stops crawling?


5°) Is it possible to run more than one spider with the shell command? What do I have to do?

6°) I have a big problem when it spiders forums: it keeps finding 100 links that are already indexed, then one new link, then another 100 already-indexed links, then one new link, and so on. What can I do about that? It wastes a lot of time.

7°) When using the shell command, are all the settings in the config file used by the shell spider? Sorry for my English.



Thank you !

Noël
11-01-2005, 03:35 PM   #2
Charter
Quote:
1°) I plan to index roughly 2500 sites. Do you think that is realistic, or is it not possible with PhpDig?
I have not had 2500 sites indexed at one time, but check out this thread for some numbers.

Quote:
2°) If it is possible, how many days would you set before a site is reindexed? I would use 30 days. And you?
For the online demo, I leave LIMIT_DAYS at zero, but for a 'real' site I think 30 days is fine. As the number of sites grows, you'll of course want to consider what and when to index.
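
To make the effect of the setting concrete, here is a minimal Python sketch. It assumes LIMIT_DAYS means "minimum number of days before a page is reindexed"; the function name and the dates are invented for illustration, and this is not PhpDig's own code.

```python
from datetime import datetime, timedelta

def is_due_for_reindex(last_indexed: datetime, now: datetime, limit_days: int) -> bool:
    """Assumed rule: a page is due once limit_days have passed since the last index."""
    return now - last_indexed >= timedelta(days=limit_days)

now = datetime(2005, 11, 1)
recent = datetime(2005, 10, 30)   # indexed 2 days ago
stale = datetime(2005, 9, 30)     # indexed 32 days ago

print(is_due_for_reindex(recent, now, 30))  # False
print(is_due_for_reindex(stale, now, 30))   # True
print(is_due_for_reindex(recent, now, 0))   # True: with zero, every page is always due
```

Under that assumption, a value of zero suits a demo where everything is re-crawled on each run, while 30 skips pages that were indexed recently.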

Quote:
3°) To keep things realistic, what do you think of these settings?

Search depth (number of levels): 20
Links per depth level: 50. Is that too many or too few?
I tried unlimited links, but indexing took too long.
The maximum number of pages found per site is ((depth * links) + 1) when links is greater than zero, so decide how many pages per site you would like to find, and then set depth and links accordingly.
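
As a quick worked example of that formula, the short Python sketch below plugs in the values from the question (depth 20, 50 links per level). The function name is made up for illustration; only the formula itself comes from the reply.

```python
def max_pages_per_site(depth: int, links: int) -> int:
    """Upper bound from the formula above: (depth * links) + 1, for links > 0."""
    if links <= 0:
        raise ValueError("the formula only applies when links is greater than zero")
    return depth * links + 1

# Settings proposed in the question: depth 20, 50 links per level.
per_site = max_pages_per_site(20, 50)
print(per_site)          # 1001

# Rough ceiling across the roughly 2500 sites mentioned in question 1.
print(per_site * 2500)   # 2502500
```

So the proposed settings cap each site at about a thousand pages, and about 2.5 million pages across 2500 sites if every site reaches the cap.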

Quote:
4°) One problem I can't find an answer to: while the spider is crawling, can I add a new link, or do I have to wait until it stops crawling?
It would be better to wait until the crawling is complete, as PhpDig locks when indexing to let you know it is busy.

Quote:
5°) Is it possible to run more than one spider with the shell command? What do I have to do?
Having more than one spider at a time would still use the same tables and slow the process down, but there is a thread here about multiple spiders.

Quote:
6°) I have a big problem when it spiders forums: it keeps finding 100 links that are already indexed, then one new link, then another 100 already-indexed links, then one new link, and so on. What can I do about that? It wastes a lot of time.
Does 'duplicate of an existing document' appear onscreen? If so, use PHPDIG_SESSID_VAR in the config file, especially for links that contain session IDs.
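
To show why session IDs cause this, here is a small Python sketch of the general idea: the same forum page reached with two different session IDs looks like two different URLs until the session variable is stripped. This is not PhpDig's implementation; the variable name PHPSESSID and the example URLs are assumptions, so use whatever name your forum actually puts in its links.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url: str, sessid_var: str = "PHPSESSID") -> str:
    """Return the URL with the session-ID query parameter removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the session variable.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != sessid_var]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Hypothetical forum links: same thread, different session IDs.
a = strip_session_id("http://forum.example.com/showthread.php?t=12&PHPSESSID=abc123")
b = strip_session_id("http://forum.example.com/showthread.php?t=12&PHPSESSID=def456")

print(a)        # http://forum.example.com/showthread.php?t=12
print(a == b)   # True
```

Once both links reduce to the same URL, the page only needs to be fetched and compared once.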

Quote:
7°) When using the shell command, are all the settings in the config file used by the shell spider?
All of the index-related settings in the config file are used when indexing from shell, except for RESPIDER_LIMIT and RELINKS_LIMIT and maybe a couple of others.

BTW, your English is fine.
All times are GMT -8.

