|
07-08-2004, 06:52 AM | #1 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
spidering does *nothing* ?
Hi folks,
I've installed PhpDig and solved the logging-in problem with the help of your forums here, but I've come across another: spidering my site doesn't appear to be doing anything, and it isn't throwing any errors either. All I get is:
Code:
Spidering in progress...
--------------------------------------------------------------------------------
SITE : http://localhost
Exclude paths :
- @NONE@
...and then, nothing. It just sits there. I've waited a while, wondering whether my 20-page test site is maybe taking 5 minutes per page, but no. Still nothing. What gives? Any help gratefully appreciated.
Thanks, Dave. |
07-08-2004, 06:54 AM | #2 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
PS I've tried adding /index.php and just / to the path, but still no joy. Help!
|
07-08-2004, 07:21 AM | #3 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. In this thread, although a bit dated, there is talk of various issues. Does anything there help?
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
07-12-2004, 12:05 AM | #4 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
Thanks Charter, I've looked into both those threads, but still no joy. When I uncomment the //print $answer line in robot_functions.php, I get the following output. Does this shed any light for anyone here?
Code:
Spidering in progress...
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:07 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:06 GMT; path=/
Set-Cookie: phpbb2mysql_sid=39f7eac47575e95830dfc26253f49aa6; path=/
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:07 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:07 GMT; path=/
Set-Cookie: phpbb2mysql_sid=0254fab8d05bec6fe0a3b103bd8207eb; path=/
HTTP/1.1 404 Not Found
--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:12 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:12 GMT; path=/
Set-Cookie: phpbb2mysql_sid=22564c9ca04014e3f4dc5dfb837cd46d; path=/
Thanks, Dave. |
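A dump like the one above is easier to read once the status lines are separated from the headers. The following is only a rough Python sketch for tallying status codes in a pasted dump; it is not part of PhpDig, and the helper name `count_statuses` and the shortened `sample` text are made up for illustration.

```python
import re
from collections import Counter

# Matches the status line of an HTTP/1.0 or HTTP/1.1 response and captures the code.
STATUS_LINE = re.compile(r"HTTP/1\.[01] (\d{3})")


def count_statuses(dump: str) -> Counter:
    """Count how often each HTTP status code appears in a raw header dump."""
    return Counter(STATUS_LINE.findall(dump))


# Hypothetical, shortened excerpt in the same shape as the spider output.
sample = (
    "HTTP/1.1 404 Not Found\n"
    "HTTP/1.1 200 OK\n"
    "Connection: close\n"
    "HTTP/1.1 200 OK\n"
    "Connection: close\n"
    "HTTP/1.1 404 Not Found\n"
    "HTTP/1.1 200 OK\n"
)

statuses = count_statuses(sample)
print(statuses)
```

Knowing how many requests came back 404 helps narrow down which URL the spider failed to fetch, although the dump alone does not say which request each status belongs to.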
07-12-2004, 12:15 AM | #5 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Is magic_quotes_runtime On or Off?
|
07-12-2004, 12:20 AM | #6 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
Off. Why?
Thanks for the quick response :-) Dave. |
07-12-2004, 12:30 AM | #7 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Just a quoting bug when magic_quotes_runtime is on...
Anything showing up in your error logs?
|
07-12-2004, 12:45 AM | #8 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
PS: Not using version 1.8.1? On Win? Set USE_IS_EXECUTABLE_COMMAND to 0 in the config.php file.
|
07-12-2004, 02:03 AM | #9 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
Thanks, have got a little further now. Changed USE_IS_EXECUTABLE_COMMAND to 0 as you suggested, and now get the following output:
Code:
Spidering in progress...
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:14 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/
Set-Cookie: phpbb2mysql_sid=a4c25478069a56787cb47195438c3285; path=/
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:14 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/
Set-Cookie: phpbb2mysql_sid=9066b3688621a9f1c2e32fd7be8d953f; path=/
HTTP/1.1 404 Not Found
--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:19 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:19 GMT; path=/
Set-Cookie: phpbb2mysql_sid=dd7951c2b01cc6f4830f7eb1a0dc2589; path=/
[tick symbol]1:http://knet/
(time : 00:00:06)
No link in temporary table
--------------------------------------------------------------------------------
links found : 1
http://knet/
Optimizing tables...
Indexing complete !
Glad we're getting somewhere though! What do I need to do next, to get it to spider the entire site?
Thanks, Dave. |
07-12-2004, 02:11 AM | #10 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Look in the text_content directory, at the file with the highest number. What is in that file: the contents of the main page, or something else? If something else, is it showing a 404 message page? If so, then add a robots.txt file to the web root and see if it will go.
A simple robots.txt file:
User-agent: *
Disallow:
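For anyone unsure what that robots.txt does: an empty Disallow line means nothing is blocked. Below is a small Python sketch that checks this with the standard library's `urllib.robotparser`. It only illustrates the robots.txt rules and says nothing about how PhpDig itself reads the file; the helper name `allows` and the `BLOCK_ALL` contrast case are assumptions added for the example.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The file suggested in the post: an empty Disallow blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# Contrast case (not from the thread): "Disallow: /" blocks everything.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

page = "http://knet/newsboard/index.php"
print(allows(ALLOW_ALL, page))
print(allows(BLOCK_ALL, page))
```

The first call reports that the page may be fetched and the second that it may not, so the suggested file should never stop a spider by itself; it mainly ensures the request for /robots.txt no longer returns a 404 page.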
|
07-12-2004, 02:20 AM | #11 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
Hi. There's only a 1.txt file, and it's the content of the index page. This includes text which is a link in the HTML, so it's obviously missing something...
Thanks, Dave. |
07-12-2004, 02:24 AM | #12 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
What's in the file, or at least the suspicious looking piece, and how does it compare to the HTML of the index page?
|
07-12-2004, 02:35 AM | #13 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
In the HTML:
Code:
<span class="xhead">Latest News <a href="/newsboard/index.php">[View Archive]</a></span>
In the 1.txt file:
Code:
Latest News [View Archive] |
07-12-2004, 03:36 AM | #14 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
That's as it should work, strip away the tags and leave the text. PhpDig looks for links prior to that. Is there anything showing in your error logs?
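To illustrate the two steps described here (collect the links first, then strip the tags and keep the text), here is a minimal Python sketch using the standard `html.parser` module. It is not PhpDig's own code, which is PHP and may work quite differently; the class name `LinkAndTextExtractor` is made up for the example, and the snippet is the one posted above.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect href values from <a> tags and the visible text separately."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Links are taken from the markup itself, before any tag stripping.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Everything between tags is kept as plain text.
        self.text_parts.append(data)

    @property
    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self.text_parts).split())


snippet = (
    '<span class="xhead">Latest News '
    '<a href="/newsboard/index.php">[View Archive]</a></span>'
)

extractor = LinkAndTextExtractor()
extractor.feed(snippet)
extractor.close()
print(extractor.links)  # ['/newsboard/index.php']
print(extractor.text)   # Latest News [View Archive]
```

So the stripped text in 1.txt is expected to lose the href; the link would have been picked up in the earlier pass, which is why "No link in temporary table" is the more telling symptom.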
|
07-12-2004, 06:19 AM | #15 |
Green Mole
Join Date: Jul 2004
Posts: 20
|
I don't see any error log files within the phpdig directories...?
|