10-30-2005, 06:01 AM   #1
Trallis
Green Mole
 
Join Date: Oct 2005
Posts: 3
Problem Spidering

I cannot index any sites with my install of PhpDig. I am running v1.8.8 RC1 on a Windows box with Apache. Directory permissions are already set correctly, and I verified that allow_url_fopen is enabled.

I am trying to index: http://www.noland.com/noland/index.php

When the spider starts, it seems to pull the parent directory www.noland.com (which is unavailable to the web, as it redirects to www.noland.com/noland).

When I try to spider an external site such as www.mtslink.com, it does not work either.

Here is the output that I get:

Spidering in progress... [Stop spider]

--------------------------------------------------------------------------------
SITE : http://www.noland.com/
Exclude paths :
- Admin/
- auctiondata/
- calendar/
- cgi-local/
- enoland/
- itemmaint/
- mail/
- msds/
- nol****nline/
- nolandtest/
- obis/
- Orders/
- phpinc/
- squidalizer/
- Stylesheets/
- test/
- webmail/
- webalizer/
- squidalizer-detail/

Wait...
1:http://www.noland.com/noland/
(time : 00:00:05)
No link in temporary table

--------------------------------------------------------------------------------

links found : 1
http://www.noland.com/noland/
Optimizing tables...
Indexing complete !
11-01-2005, 02:52 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Are your site and the PhpDig install on a server that uses load balancing?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
11-01-2005, 05:25 PM   #3
Trallis
Green Mole
 
Join Date: Oct 2005
Posts: 3
No... I was able to get it to spider individual pages just fine by playing with the config, but it doesn't seem to want to follow any links no matter what I try.
11-01-2005, 05:40 PM   #4
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Try setting PHPDIG_IN_DOMAIN to true and LIMIT_TO_DIRECTORY to false, both in the config file. Then, from the admin panel, use a large search depth, set links per to zero, and choose the no option. To increase the search depth beyond twenty, edit SPIDER_MAX_LIMIT in the config file.
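For reference, the config-file changes described above might look something like the sketch below. The constant names come from the post; the define() form and the value 50 are assumptions for illustration, so check them against your own copy of the config file and edit the existing lines rather than adding duplicates.

```php
<?php
// Sketch only: edit the matching lines that already exist in your PhpDig config file.
// Assumption: the settings are declared as PHP constants with define().

define('PHPDIG_IN_DOMAIN', true);     // value suggested in the post above
define('LIMIT_TO_DIRECTORY', false);  // value suggested in the post above

// Raises the ceiling for the search depth offered in the admin panel.
// 50 is an arbitrary illustrative value, not a recommendation.
define('SPIDER_MAX_LIMIT', 50);
```

After saving the file, start a new index from the admin panel with the larger search depth so the new limits actually take effect.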
11-01-2005, 05:46 PM   #5
Trallis
Green Mole
 
Join Date: Oct 2005
Posts: 3
Done

OK, I verified those two settings. I can still get a single page indexed, but the spider will not follow any of the links. I'd be happy to provide you with the login information (via e-mail) if you think that would help to diagnose the problem.

Thanks for your help.

John
11-01-2005, 05:49 PM   #6
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Your install gives:
Code:
Spidering in progress... [Stop spider]
SITE : http://www.mtslink.com/
Exclude paths :
- @NONE@

Wait...
1:http://www.mtslink.com/
(time : 00:00:36)
+ + + + + + + + + +
No link in temporary table
links found : 1
http://www.mtslink.com/
Optimizing tables...
Indexing complete ! [Back] to admin interface.
My install gives:
Code:
Spidering in progress... [Stop spider]
SITE : http://www.mtslink.com/
Exclude paths :
- @NONE@

Wait...
1:http://www.mtslink.com/
(time : 00:00:12)
+ + + + + + + + + +
level 1...

Wait...
2:http://www.mtslink.com/pricing.php
(time : 00:00:29)
+ + + + + +

Wait...
3:http://www.mtslink.com/medicalintranet.php
(time : 00:00:39)
+

Wait...
4:http://www.mtslink.com/contact.php
(time : 00:00:47)

Wait...
5:http://www.mtslink.com/ann.php
(time : 00:00:56)
+ 

And so forth...
Hmm, what version of PHP are you using?
11-02-2005, 07:58 AM   #7
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Set your PHP display_errors to on and keep error_reporting(E_ALL); in the config file. With display_errors set to off, error_reporting does not show anything onscreen. If you don't want to change this in PHP directly, try setting the following in an .htaccess file in the main PhpDig directory and then run an index:
Code:
PHP_VALUE display_errors 1
Also, your PHP info page says that your MySQL Client API version is 4.0.25, but PhpDig 1.8.8 RC1 requires MySQL 4.1.7 or later. The PhpDig 1.8.8 RC1 requirements are listed here. Sometimes the API version that PHP reports is not the 'real' server version (see here as to probable reason), so run the following MySQL query:
Code:
SELECT VERSION();
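To make the version requirement concrete, here is a small sketch of the comparison in PHP. The helper name meets_minimum and the sample string 4.1.14-log are hypothetical; only the 4.1.7 minimum and the 4.0.25 client API value come from the post. It relies on PHP's built-in version_compare() and does not connect to MySQL, so substitute whatever SELECT VERSION() returns on your server.

```php
<?php
// Minimum MySQL version quoted in the post for PhpDig 1.8.8 RC1.
const REQUIRED_MYSQL = '4.1.7';

// Hypothetical helper: true if $reported is at least $required.
function meets_minimum(string $reported, string $required = REQUIRED_MYSQL): bool
{
    // Drop build suffixes such as "-log" so only the numeric part is compared.
    $numeric = preg_replace('/[^0-9.].*$/', '', $reported);
    return version_compare($numeric, $required, '>=');
}

var_dump(meets_minimum('4.0.25'));      // bool(false): the client API version from PHP info
var_dump(meets_minimum('4.1.7'));       // bool(true)
var_dump(meets_minimum('4.1.14-log'));  // bool(true)

// Show the current display_errors setting as PHP sees it.
echo 'display_errors = ', var_export(ini_get('display_errors'), true), PHP_EOL;
```

If the server version passes but the client API version does not, the mismatch described in the post is the more likely explanation than an outdated server.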