Problem Spidering
I cannot index any sites with my install of phpDig. I have v1.8.8 RC1 on a Windows box running Apache. Directory permissions are already set correctly, and I verified that allow_url_fopen is enabled.
I am trying to index: http://www.noland.com/noland/index.php
When the spider starts, it seems to pull the parent directory www.noland.com (which is unavailable to the web as it redirects to www.noland.com/noland). When I try to spider an external site such as www.mtslink.com it will not work either. Here is the output that I get:
Spidering in progress... [Stop spider]
--------------------------------------------------------------------------------
SITE : http://www.noland.com/
Exclude paths :
- Admin/
- auctiondata/
- calendar/
- cgi-local/
- enoland/
- itemmaint/
- mail/
- msds/
- nol****nline/
- nolandtest/
- obis/
- Orders/
- phpinc/
- squidalizer/
- Stylesheets/
- test/
- webmail/
- webalizer/
- squidalizer-detail/
Wait...
1:http://www.noland.com/noland/
(time : 00:00:05)
No link in temporary table
--------------------------------------------------------------------------------
links found : 1
http://www.noland.com/noland/
Optimizing tables...
Indexing complete !
|
Are your site and the PhpDig install on a server that uses load balancing?
|
No... I was able to get it to spider individual pages just fine by playing with the config, but it doesn't seem to want to follow any links no matter what I try.
|
Try setting PHPDIG_IN_DOMAIN to true, LIMIT_TO_DIRECTORY to false, both in the config file, and then from the admin panel, use a large search depth, set links per to zero, and choose the no option. You can increase search depth beyond twenty by editing SPIDER_MAX_LIMIT in the config file.
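As a rough sketch of the config part of that advice, assuming these settings are plain define() calls in the PhpDig config file, the edited lines might look something like the following. The value 30 is only an arbitrary example of a limit above twenty, and the comments paraphrase the advice above rather than the official documentation. Edit the existing lines in place rather than adding new ones, since PHP constants cannot be redefined once set.

```php
<?php
// Hypothetical excerpt of the PhpDig config file after the suggested changes.
// Only the three constants named in the post above are shown.

// Set to true, as suggested above.
define('PHPDIG_IN_DOMAIN', true);

// Set to false, as suggested above, so the spider is not confined
// to the starting directory.
define('LIMIT_TO_DIRECTORY', false);

// Upper bound for the search depth selectable in the admin panel.
// 30 is an example value, not a recommendation.
define('SPIDER_MAX_LIMIT', 30);
```

The search depth, links per, and yes/no choices are made in the admin panel when starting the spider, not in this file.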
|
Done
Ok, I verified those 2 settings and I'm still able to get a single page indexed, but it will not follow any of the links. I'd be happy to provide you with the login information (via e-mail) if you think that would help to diagnose the problem.
Thanks for your help. John |
Your install gives:
Code:
Spidering in progress... [Stop spider]
|
Set your PHP display_errors to on and keep error_reporting(E_ALL); in the config file. With display_errors set to off, error_reporting does not show anything onscreen. If you don't want to do this in PHP directly, try setting the following in an .htaccess file in the main PhpDig directory and then run an index:
Code:
PHP_VALUE display_errors 1
Code:
SELECT VERSION();
|