PhpDig.net

Old 07-08-2004, 06:52 AM   #1
davenewt
Green Mole
Join Date: Jul 2004
Posts: 20
spidering does *nothing* ?

Hi folks,

I've installed phpdig, solved the logging-in problem with the help of your forums here, but have come across another.

Namely, spidering my site doesn't appear to be working, and it isn't throwing any errors or showing anything else.

All I get is:

Code:
Spidering in progress...

--------------------------------------------------------------------------------
SITE : http://localhost
Exclude paths :
- @NONE@

(I'm running PhpDig on the same server as the site it is spidering.)

...and then, nothing. It just sits there. I've waited a while, wondering whether the 20-page site I'm testing with might be taking 5 minutes per page, but no. Still nothing.


What gives? Any help would be greatly appreciated.

Thanks,
Dave.
Old 07-08-2004, 06:54 AM   #2
davenewt
PS I've tried adding /index.php and just / to the path, but still no joy. Help!
Old 07-08-2004, 07:21 AM   #3
Charter
Head Mole
Join Date: May 2003
Posts: 2,539
Hi. In this thread, although a bit dated, there is talk of various issues. Does anything there help?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
Old 07-12-2004, 12:05 AM   #4
davenewt
Thanks Charter, I've looked into both those threads, but still no joy. When I uncomment the //print $answer line in robot_functions.php, I get the following output. Does this shed any light for anyone here?

Code:
Spidering in progress...
HTTP/1.1 404 Not Found 
HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 07:55:07 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:06 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=39f7eac47575e95830dfc26253f49aa6; path=/ 

HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 07:55:07 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:07 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=0254fab8d05bec6fe0a3b103bd8207eb; path=/ 

HTTP/1.1 404 Not Found 

--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 07:55:12 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:12 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=22564c9ca04014e3f4dc5dfb837cd46d; path=/
Any ideas?

Thanks,
Dave.
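
For anyone reading a header dump like the one above, a small script can tally the status lines so the 404s stand out from the 200s. This is only a sketch in Python, not part of PhpDig; the function name `tally_status_codes` and the shortened `SAMPLE` text are made up for illustration, and the dump itself does not say which URL each response belongs to.

```python
import re

# Matches an HTTP status line such as "HTTP/1.1 404 Not Found".
STATUS_LINE = re.compile(r"^HTTP/\d\.\d\s+(\d{3})\b")

# Shortened copy of the dump posted above: status lines plus one header each.
SAMPLE = """Spidering in progress...
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Server: Microsoft-IIS/6.0
HTTP/1.1 200 OK
Server: Microsoft-IIS/6.0
HTTP/1.1 404 Not Found
SITE : http://knet/
HTTP/1.1 200 OK
Server: Microsoft-IIS/6.0
"""


def tally_status_codes(dump):
    """Count how often each HTTP status code appears in a header dump."""
    counts = {}
    for line in dump.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            code = match.group(1)
            counts[code] = counts.get(code, 0) + 1
    return counts


print(tally_status_codes(SAMPLE))
```

A count like this only shows that some requests are failing; working out which request returned which status still means matching the output against the spider's request order.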
Old 07-12-2004, 12:15 AM   #5
Charter
Hi. Is magic_quotes_runtime On or Off?
Old 07-12-2004, 12:20 AM   #6
davenewt
Off. Why?

Thanks for the quick response :-)

Dave.
Old 07-12-2004, 12:30 AM   #7
Charter
Just a quoting bug when magic_quotes_runtime is on...

Anything showing up in your error logs?
Old 07-12-2004, 12:45 AM   #8
Charter
PS: Are you using a version other than 1.8.1, on Windows? If so, set USE_IS_EXECUTABLE_COMMAND to 0 in the config.php file.
Old 07-12-2004, 02:03 AM   #9
davenewt
Thanks, have got a little further now. Changed USE_IS_EXECUTABLE_COMMAND to 0 as you suggested, and now get the following output:

Code:
Spidering in progress...
HTTP/1.1 404 Not Found 
HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 09:54:14 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=a4c25478069a56787cb47195438c3285; path=/ 

HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 09:54:14 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=9066b3688621a9f1c2e32fd7be8d953f; path=/ 

HTTP/1.1 404 Not Found 

--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK 
Connection: close 
Date: Mon, 12 Jul 2004 09:54:19 GMT 
Server: Microsoft-IIS/6.0 
Content-type: text/html 
X-Powered-By: PHP/4.3.6 
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:19 GMT; path=/ 
Set-Cookie: phpbb2mysql_sid=dd7951c2b01cc6f4830f7eb1a0dc2589; path=/ 

[tick symbol]1:http://knet/
(time : 00:00:06)
No link in temporary table

--------------------------------------------------------------------------------

links found : 1
http://knet/
Optimizing tables...
Indexing complete !
It does not seem to go any further than the index page, despite there being plenty of links to other pages.

Glad we're getting somewhere though!

What do I need to do next, to get it to spider the entire site?

Thanks,
Dave.
Old 07-12-2004, 02:11 AM   #10
Charter
Hi. Look in the text_content directory, at the file with the highest number. What is in that file: the contents of the main page, or something else? If something else, is it showing a 404 message page? If so, add a robots.txt file to the web root and see if it will go.

simple robots.txt file:

User-agent: *
Disallow:
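
To confirm that a robots.txt like the one above really permits everything, its rules can be checked offline. The sketch below uses Python's standard `urllib.robotparser` rather than anything from PhpDig; the user-agent string "PhpDig" and the helper name `is_allowed` are assumptions made for illustration, and the URLs are taken from earlier in this thread.

```python
from urllib.robotparser import RobotFileParser

# The permissive robots.txt suggested above, one rule per list item.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow:",
]


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# "PhpDig" is a placeholder user-agent name, not necessarily what the spider sends.
for url in ("http://knet/", "http://knet/newsboard/index.php"):
    print(url, is_allowed(ROBOTS_LINES, "PhpDig", url))
```

An empty Disallow value allows every path, whereas "Disallow: /" would block the whole site.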
Old 07-12-2004, 02:20 AM   #11
davenewt
Hi. There's only a 1.txt file, and it contains the content of the index page. This includes text which is a link in the HTML, so it's obviously missing something...

Thanks,
Dave.
Old 07-12-2004, 02:24 AM   #12
Charter
What's in the file, or at least the suspicious looking piece, and how does it compare to the HTML of the index page?
Old 07-12-2004, 02:35 AM   #13
davenewt
In the HTML:
Code:
<span class="xhead">Latest News <a href="/newsboard/index.php">[View Archive]</a></span>
In 1.txt:
Code:
Latest News [View Archive]
Old 07-12-2004, 03:36 AM   #14
Charter
That's how it should work: strip away the tags and leave the text. PhpDig looks for links before that step. Is there anything showing in your error logs?
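
The two steps described here, collecting links first and then keeping only the text, can be illustrated with a rough sketch. It is written in Python with the standard `html.parser` module and is not PhpDig's own code; the class and function names are made up for the example, and the snippet is the one posted earlier in this thread.

```python
from html.parser import HTMLParser

SNIPPET = ('<span class="xhead">Latest News '
           '<a href="/newsboard/index.php">[View Archive]</a></span>')


class LinkAndTextParser(HTMLParser):
    """Collects href values from <a> tags and the plain text between tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the link target before the tag itself is discarded.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Everything that is not markup ends up in the stored text.
        self.text_parts.append(data)


def links_and_text(html):
    """Return (list of hrefs, tag-free text with whitespace collapsed)."""
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.text_parts).split())
    return parser.links, text


links, text = links_and_text(SNIPPET)
print(links)
print(text)
```

So a 1.txt file that shows only "Latest News [View Archive]" is consistent with normal tag stripping; whether the link was found is a separate question that the text file cannot answer.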
Old 07-12-2004, 06:19 AM   #15
davenewt
I don't see any error log files within the phpdig directories...?