PhpDig.net

Old 01-02-2005, 12:46 PM   #1
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
Is there support in Russian?

Hello everyone. The question is in the subject line.
Old 01-02-2005, 01:53 PM   #2
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
Hello everyone. Apologies in advance for my bad English (translated with PromtXT).
Here is my problem.
I installed the search and corrected the database access settings in the config files. When I tried to log in, entering the login and password did not let me in; it just threw me back to the form. I disabled authorization and got in. But I cannot properly index the whole site (www.Elgorsk.ru).
This is what it prints when I try to index the main site (again):

SITE : http://elgorsk.ru/
Exclude paths :
- @NONE@
Duplicate of an existing document
1:http://elgorsk.ru/
(time : 00:00:06)

No link in temporary table

--------------------------------------------------------------------------------

links found : 1
http://elgorsk.ru/
Optimizing tables...
Indexing complete !


And that is where it stops.
I also tried indexing the forum (forum.elgorsk.ru).
The most it found was about 60-70 pages, but there are many more than that!

Please help a beginner.
Old 01-02-2005, 09:16 PM   #3
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Pay particular attention to post #6 in this thread:
http://www.phpdig.net/forum/showthread.php?t=1692

Old 01-02-2005, 11:14 PM   #4
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
I tried every variant and nothing helps.
Old 01-02-2005, 11:32 PM   #5
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
When I tried indexing your site just now, I was able to index 3 pages before I stopped the spider, but I indexed it as:

http://www.elgorsk.ru/

not as:

http://elgorsk.ru/

That's one problem.

Another thing I noticed is that you have a lot of subdirectories on your site. Phpdig interprets those as separate domains and will not index them as part of the process of indexing your main domain. You'll have to list those on separate lines on the admin page for phpdig to spider them.
Old 01-03-2005, 06:04 AM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Quote:
Originally Posted by vinyl-junkie
Another thing I noticed is that you have a lot of subdirectories on your site. Phpdig interprets those as separate domains and will not index them as part of the process of indexing your main domain. You'll have to list those on separate lines on the admin page for phpdig to spider them.
Not quite...

If LIMIT_TO_DIRECTORY is true, then the index is limited to the given (sub)directory. The dropdown on the search box, assuming it's enabled via the config file, is for searching within a (sub)domain/(sub)directory. If interested, see this for some distinction between (sub)directories and (sub)domains.

PHP Code:
//for limit to directory, URL format must either have file at end or ending slash at end
//e.g., http://www.domain.com/dirs/ (WITH ending slash) or http://www.domain.com/dirs/dirs/index.php
define('LIMIT_TO_DIRECTORY',true);      //limit index to given (sub)directory, no sub dirs of dirs are indexed 
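
To make the URL format rule in the comment above concrete, here is a rough sketch of a check you could run on a URL before entering it in the admin page. The function name is made up for illustration and is not part of PhpDig, and treating "last path segment contains a dot" as "is a file" is only a heuristic assumption.

```php
<?php
// Hypothetical helper, NOT part of PhpDig: checks whether a URL is in the
// form the config comment asks for (ending slash, or a file at the end).
function sketch_has_directory_format(string $url): bool
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    if ($path === '') {
        return false;                 // no path at all, e.g. http://www.domain.com
    }
    if (substr($path, -1) === '/') {
        return true;                  // ends with a slash
    }
    // Assumption: a last segment containing a dot looks like a file name.
    return strpos(basename($path), '.') !== false;
}

var_dump(sketch_has_directory_format('http://www.domain.com/dirs/'));               // bool(true)
var_dump(sketch_has_directory_format('http://www.domain.com/dirs/dirs/index.php')); // bool(true)
var_dump(sketch_has_directory_format('http://www.domain.com/dirs'));                // bool(false)
```

The third URL is the case to watch for: without the ending slash it does not match either of the two forms named in the comment.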
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 01-03-2005, 08:41 AM   #7
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Welcome back, Charter! Hope you had at least a semi-restful time away from the forums (and that you got your furnace fixed. It's cold here in the US!).

Thanks for the correction on the (sub)directory issue. I guess I had been giving out a bit of misinformation. I'll file this one away for future reference.
Old 01-03-2005, 02:45 PM   #8
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Thanks, my time away was filled with sadness and reflection, but I did get my furnace fixed. It had dropped to about 50 °F (10 °C) before the heat came back.

Anyway, @Cramac should try setting LIMIT_TO_DIRECTORY to false and PHPDIG_IN_DOMAIN to true, both in the config file. The PHPDIG_IN_DOMAIN line below is shown with the value false; change it to true.

PHP Code:
define('PHPDIG_IN_DOMAIN',false);            //allows phpdig jump hosts in the same
                                             //domain. If the host is "www.mydomain.tld",
                                             //domain is "mydomain.tld" 
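
As a rough illustration of the host versus domain distinction in that comment, the sketch below strips a leading "www." to get the domain and then tests whether another host belongs to it. Both function names are hypothetical and this is not PhpDig's actual code, only one way to read the comment.

```php
<?php
// Hypothetical helper, NOT part of PhpDig: derive the "domain" from a URL's
// host the way the config comment describes it (drop a leading "www.").
function sketch_domain_of(string $url): string
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));
    return preg_replace('/^www\./', '', $host);
}

// Hypothetical check: the second URL is "in domain" when its host equals
// the first URL's domain or is a subdomain of it.
function sketch_same_domain(string $start, string $other): bool
{
    $domain = sketch_domain_of($start);
    $host   = strtolower((string) parse_url($other, PHP_URL_HOST));
    $suffix = '.' . $domain;
    return $host === $domain || substr($host, -strlen($suffix)) === $suffix;
}

var_dump(sketch_same_domain('http://www.elgorsk.ru/', 'http://forum.elgorsk.ru/')); // bool(true)
var_dump(sketch_same_domain('http://www.elgorsk.ru/', 'http://elgorsk.ru/'));       // bool(true)
var_dump(sketch_same_domain('http://www.elgorsk.ru/', 'http://www.phpdig.net/'));   // bool(false)
```

Read this way, www.elgorsk.ru, elgorsk.ru and forum.elgorsk.ru are three different hosts in one domain, which is why the setting matters for a site that links between them.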
Old 01-03-2005, 10:14 PM   #9
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
Thanks. I will try again.

P.S. Yesterday I tried running the spider through cron, and then my hosting provider called me and said I was putting a heavy load on the server.
Old 01-04-2005, 01:24 AM   #10
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
One more question: is it possible to (temporarily) exclude already-indexed pages from indexing?
Old 01-04-2005, 03:19 AM   #11
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
Quote:
Originally Posted by @Cramac
Thanks. I will try again.

P.S. Yesterday I tried running the spider through cron, and then my hosting provider called me and said I was putting a heavy load on the server.
How did you set it up to run? Unless you set off multiple spiders at once, running phpdig through cron shouldn't create a large server load.
Old 01-04-2005, 11:47 AM   #12
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
It seems to have indexed everything it could, which came to about 1000 pages. Where are the rest?
According to the search engine yandex.ru, my site has over 22000 pages.
Old 01-05-2005, 01:23 AM   #13
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
In the config file set the following:
PHP Code:
define('SPIDER_MAX_LIMIT',100);
define('RESPIDER_LIMIT',100);
define('LINKS_MAX_LIMIT',100);
define('RELINKS_LIMIT',100);
define('LIMIT_TO_DIRECTORY',false);
define('PHPDIG_IN_DOMAIN',true); 
From the PhpDig admin panel use the following:
  • set "search depth" to 100
  • set "links per" to zero
  • use "no" option this time
  • click the dig button
Old 01-06-2005, 02:08 AM   #14
@Cramac
Green Mole
 
Join Date: Jan 2005
Posts: 7
Is it also possible to resume indexing the whole site from where it left off rather than from the beginning? That is, if I accidentally stop the robot in the middle of its work, how do I restart it from where it stopped instead of always from the beginning?
Old 01-06-2005, 03:23 AM   #15
vinyl-junkie
Purple Mole
 
Join Date: Jan 2004
Posts: 694
If you stop spidering in the middle, just don't get rid of the contents of your tempspider table. That's what phpdig uses to figure out where it's been.

Charter, please correct me if I didn't quite explain that correctly, but that's my understanding of how it works.
Copyright © 2001 - 2005, ThinkDing LLC. All Rights Reserved.