PhpDig.net

07-12-2004, 04:24 AM   #1
Niall Fernie
Green Mole
 
Join Date: Jul 2004
Location: Caithness, Scotland
Posts: 3
Spidering **VERY** Slow

Hi, first post here, as this is the first problem I've come across. I've been using 1.4x until now and decided to go for the new version (the shiny new features were too much to resist).

Now, though, I find that my site is impossible to spider. It takes forever, or more precisely about 4.5 seconds per page. I have no shell access to see whether that would be any faster. 1.4x used to spider the site (1,300-1,400 pages) in about 10 minutes, but at 4.5 seconds per page I cannot get it to finish.
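For a rough sense of scale, the figures above can be turned into a total run time. This is only a back-of-the-envelope sketch: the function name `estimate_spider_time` is made up for illustration and is not part of PhpDig, and it assumes a constant 4.5 seconds per page.

```php
<?php
// Hypothetical helper (not part of PhpDig): estimate total spidering time
// as pages * seconds-per-page, formatted as HH:MM:SS.
function estimate_spider_time(int $pages, float $secondsPerPage): string
{
    $total = (int) round($pages * $secondsPerPage);
    return sprintf(
        '%02d:%02d:%02d',
        intdiv($total, 3600),       // whole hours
        intdiv($total % 3600, 60),  // remaining whole minutes
        $total % 60                 // remaining seconds
    );
}

echo estimate_spider_time(1300, 4.5), "\n"; // 01:37:30
echo estimate_spider_time(1400, 4.5), "\n"; // 01:45:00
```

At that rate even an uninterrupted run of 1,300-1,400 pages would take over an hour and a half, against roughly 10 minutes under 1.4x.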

Now, I have the time to waste, and on several occasions I've left the spider running for around 2 hours, but for one reason or another my browser ends up saying "done" long before the spider process is finished, and the spider page doesn't show the list of pages it found.

Is there something new that's causing things to take so long? I will have to check with my hosts for any server info you might need to help, and I would like to offer my thanks in advance.

(if this has already been covered in another thread, please delete this and pm me the link)
__________________
If voting could really change anything, it would be illegal!!!
07-12-2004, 04:27 AM   #2
allergie
Green Mole
 
Join Date: Mar 2004
Posts: 14
Ten minutes for 1400 pages? Ouch... I've never had that!

Indexing a 1900-page website for the first time took 4 hours on my own webserver.

I'm surprised it was ever possible to do it that quickly.
07-12-2004, 07:55 AM   #3
Niall Fernie
Green Mole
 
Join Date: Jul 2004
Location: Caithness, Scotland
Posts: 3
update... (should I just shut up and stop whining?)

Finally finished!!! w00t!!!

Grabbed the end of the spidering output to illustrate the delay between pages.

Code:
Meta Robots = NoIndex, or already indexed : No content indexed
2495:http://www.caithness-business.co.uk/...nt.php?id=1012
(time : 04:01:52)

Meta Robots = NoIndex, or already indexed : No content indexed
2496:http://www.caithness-business.co.uk/...nt.php?id=1013
(time : 04:01:57)

Meta Robots = NoIndex, or already indexed : No content indexed
2497:http://www.caithness-business.co.uk/...nt.php?id=1064
(time : 04:02:03)

No link in temporary table

links found : 1247
From the last post it would seem this is normal? Has some kind of delay been added since the old version? If so, is there some way to adjust it so that those of us who have to wait for a "4 hour webpage" can maybe trim a little of that time off?
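The per-page delay can be read straight off the `(time : ...)` stamps in the output above. The following is a minimal sketch, assuming the stamps always have the `HH:MM:SS` form shown; `page_gaps` is a hypothetical helper name, not a PhpDig function. Whether the stamps are elapsed time or clock time makes no difference to the gaps.

```php
<?php
// Hypothetical helper (not part of PhpDig): extract the "(time : HH:MM:SS)"
// stamps from spider output and return the gap in seconds between each
// consecutive pair of stamps.
function page_gaps(string $log): array
{
    preg_match_all('/\(time : (\d+):(\d{2}):(\d{2})\)/', $log, $matches, PREG_SET_ORDER);

    // Convert every stamp to a number of seconds.
    $seconds = [];
    foreach ($matches as $m) {
        $seconds[] = (int) $m[1] * 3600 + (int) $m[2] * 60 + (int) $m[3];
    }

    // Difference between each stamp and the one before it.
    $gaps = [];
    for ($i = 1; $i < count($seconds); $i++) {
        $gaps[] = $seconds[$i] - $seconds[$i - 1];
    }
    return $gaps;
}

// The three stamps quoted in the output above.
$log = <<<LOG
(time : 04:01:52)
(time : 04:01:57)
(time : 04:02:03)
LOG;

echo implode(', ', page_gaps($log)), "\n"; // prints "5, 6"
```

Gaps of 5 to 6 seconds on pages where no content was indexed suggest a fixed pause rather than slow page processing.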
07-12-2004, 09:02 AM   #4
bloodjelly
Purple Mole
 
Join Date: Dec 2003
Posts: 106
Hi Niall -

There is a bit of "sleep" code in spider.php that prevents phpDig from requesting pages too quickly from web hosts. You can find this line and change the sleep time of 5 seconds between links to a value of your choosing:
PHP Code:
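// Spidering ...
while($level <= $limit) {

    sleep(5); // pause 5 seconds between links; lower this value to spider faster
    // ... (rest of the loop is not shown in this excerpt)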
This will help make spidering faster, as long as the site you're spidering doesn't mind the increased load.
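If you would rather not hard-code the number, the pause can be pulled out into a single adjustable setting. This is only a sketch of the idea, not PhpDig's own code: `politeness_delay` and `$delaySeconds` are names invented here, and the loop is a stand-in for the real one in spider.php.

```php
<?php
// Hypothetical helper (not part of PhpDig): pause between page requests.
// A value of 0 (or less) disables the pause entirely.
function politeness_delay(float $delaySeconds): void
{
    if ($delaySeconds <= 0) {
        return;
    }
    // usleep() takes microseconds, which allows sub-second delays
    // that sleep() (whole seconds only) cannot express.
    usleep((int) round($delaySeconds * 1000000));
}

// Stand-in for the spider loop; the page names are placeholders.
$delaySeconds = 0.1; // assumed value for illustration; PhpDig uses sleep(5)
$pages = ['page1', 'page2', 'page3'];
foreach ($pages as $page) {
    politeness_delay($delaySeconds);
    // ... fetch and index $page here ...
}
```

Keeping the delay in one variable makes it easy to use a short pause on your own server and a longer one when spidering somebody else's site.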
07-13-2004, 12:45 AM   #5
Niall Fernie
Green Mole
 
Join Date: Jul 2004
Location: Caithness, Scotland
Posts: 3
Brilliant!!!

Thanks for that.

I will check the stats to see when the server is quiet before I make it busy.


All times are GMT -8.

