PhpDig.net

davenewt 07-08-2004 06:52 AM

spidering does *nothing* ?
 
Hi folks,

I've installed phpdig, solved the logging-in problem with the help of your forums here, but have come across another.

Namely, spidering my site doesn't appear to be working, and it isn't throwing any errors or printing anything else.

All I get is:

Code:

Spidering in progress...

--------------------------------------------------------------------------------
SITE : http://localhost
Exclude paths :
- @NONE@

(I'm running phpdig on the server to spider the site)

...and then, nothing. It just sits there. I've waited a while, wondering whether the 20-page site I'm testing with is maybe taking 5 minutes per page, but no. Still nothing.


What gives? Any help gratefully appreciated.

Thanks,
Dave.

davenewt 07-08-2004 06:54 AM

PS I've tried adding /index.php and just / to the path, but still no joy. Help! :)

Charter 07-08-2004 07:21 AM

Hi. In this thread, although a bit dated, there is talk of various issues. Does anything there help?

davenewt 07-12-2004 12:05 AM

Thanks Charter, I've looked into both those threads, but still no joy. When I uncomment the //print $answer line in robot_functions.php, I get the following output. Does this shed any light for anyone here?

Code:

Spidering in progress...
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:07 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:06 GMT; path=/
Set-Cookie: phpbb2mysql_sid=39f7eac47575e95830dfc26253f49aa6; path=/

HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:07 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:07 GMT; path=/
Set-Cookie: phpbb2mysql_sid=0254fab8d05bec6fe0a3b103bd8207eb; path=/

HTTP/1.1 404 Not Found

--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 07:55:12 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 07:55:12 GMT; path=/
Set-Cookie: phpbb2mysql_sid=22564c9ca04014e3f4dc5dfb837cd46d; path=/

Any ideas?

Thanks,
Dave.
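
To make the 404 responses easier to spot in a header dump like the one above, the status lines can be pulled out on their own. This is only a minimal sketch in Python, not part of PhpDig: the function name `status_codes` is hypothetical, and it assumes the dump is plain text with one header per line, as pasted here.

```python
import re

# Matches an HTTP status line such as "HTTP/1.1 404 Not Found".
STATUS_LINE = re.compile(r"^HTTP/\d\.\d (\d{3})\b")


def status_codes(dump):
    """Return the status codes found in a raw header dump, in order."""
    codes = []
    for line in dump.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            codes.append(int(match.group(1)))
    return codes


if __name__ == "__main__":
    # A short excerpt of the output pasted in this post.
    sample = (
        "Spidering in progress...\n"
        "HTTP/1.1 404 Not Found\n"
        "HTTP/1.1 200 OK\n"
        "Connection: close\n"
        "Server: Microsoft-IIS/6.0\n"
    )
    print(status_codes(sample))
```

Which request each status line belongs to is not visible in the dump itself, so the list only shows the order of the responses.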

Charter 07-12-2004 12:15 AM

Hi. Is magic_quotes_runtime On or Off?

davenewt 07-12-2004 12:20 AM

Off. Why?

Thanks for the quick response :-)

Dave.

Charter 07-12-2004 12:30 AM

Just a quoting bug when magic_quotes_runtime is on... :(

Anything showing up in your error logs?

Charter 07-12-2004 12:45 AM

PS: Are you using a version other than 1.8.1, on Windows? If so, set USE_IS_EXECUTABLE_COMMAND to 0 in the config.php file.

davenewt 07-12-2004 02:03 AM

Thanks, have got a little further now. Changed USE_IS_EXECUTABLE_COMMAND to 0 as you suggested, and now get the following output:

Code:

Spidering in progress...
HTTP/1.1 404 Not Found
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:14 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/
Set-Cookie: phpbb2mysql_sid=a4c25478069a56787cb47195438c3285; path=/

HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:14 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:14 GMT; path=/
Set-Cookie: phpbb2mysql_sid=9066b3688621a9f1c2e32fd7be8d953f; path=/

HTTP/1.1 404 Not Found

--------------------------------------------------------------------------------
SITE : http://knet/
Exclude paths :
- @NONE@
HTTP/1.1 200 OK
Connection: close
Date: Mon, 12 Jul 2004 09:54:19 GMT
Server: Microsoft-IIS/6.0
Content-type: text/html
X-Powered-By: PHP/4.3.6
Set-Cookie: phpbb2mysql_data=a%3A0%3A%7B%7D; expires=Tue, 12-Jul-2005 09:54:19 GMT; path=/
Set-Cookie: phpbb2mysql_sid=dd7951c2b01cc6f4830f7eb1a0dc2589; path=/

[tick symbol]1:http://knet/
(time : 00:00:06)
No link in temporary table

--------------------------------------------------------------------------------

links found : 1
http://knet/
Optimizing tables...
Indexing complete !

Does not seem to go any further than the index page, despite there being plenty of links to other pages.

Glad we're getting somewhere though!

What do I need to do next, to get it to spider the entire site?

Thanks,
Dave.

Charter 07-12-2004 02:11 AM

Hi. Look in the text_content directory, at the file with the highest number. What is in that file, the contents from the main page or something else? If something else, is it showing a 404 message page? If so, then add a robots.txt page to web root and see if it will go.

simple robots.txt file:

User-agent: *
Disallow:
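
As a quick sanity check that the simple robots.txt above really permits everything, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not how PhpDig itself reads robots.txt; the helper name `allows` and the agent name `ExampleSpider` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The simple robots.txt from this post: an empty Disallow blocks nothing.
ROBOTS_TXT = """\
User-agent: *
Disallow:
"""


def allows(robots_txt, url, agent="ExampleSpider"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allows(ROBOTS_TXT, "http://knet/"))
    print(allows(ROBOTS_TXT, "http://knet/newsboard/index.php"))
    # For contrast, "Disallow: /" blocks the whole site.
    print(allows("User-agent: *\nDisallow: /\n", "http://knet/"))
```

The first two checks should report that fetching is allowed and the last that it is not, which is the difference between an empty `Disallow:` and `Disallow: /`.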

davenewt 07-12-2004 02:20 AM

Hi. There's only a 1.txt file, and it's the content of the index page. This includes text which is a link in the HTML, so it's obviously missing something...

Thanks,
Dave.

Charter 07-12-2004 02:24 AM

What's in the file, or at least the suspicious looking piece, and how does it compare to the HTML of the index page?

davenewt 07-12-2004 02:35 AM

In the HTML:
Code:

<span class="xhead">Latest News <a href="/newsboard/index.php">[View Archive]</a></span>
In 1.txt:
Code:

Latest News [View Archive]

Charter 07-12-2004 03:36 AM

That's how it should work: strip away the tags and leave the text. PhpDig looks for links prior to that. Is there anything showing in your error logs?
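
The two steps described here (collect the links first, then keep only the text) can be sketched roughly as follows. This is an illustration in Python with the standard `html.parser` module, not PhpDig's actual code; the names `LinkAndTextParser` and `extract` are hypothetical.

```python
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects href values from <a> tags and the text between tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every anchor that has a non-empty href.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Everything that is not markup ends up in the text.
        self.text_parts.append(data)


def extract(html):
    """Return (links, text) for an HTML fragment."""
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    # Join the pieces and collapse runs of whitespace.
    text = " ".join("".join(parser.text_parts).split())
    return parser.links, text


if __name__ == "__main__":
    snippet = (
        '<span class="xhead">Latest News '
        '<a href="/newsboard/index.php">[View Archive]</a></span>'
    )
    links, text = extract(snippet)
    print(links)
    print(text)
```

For the snippet quoted earlier in the thread, the text comes out as it appears in 1.txt, while the link target `/newsboard/index.php` is captured separately. So the text file alone does not show whether the link was found.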

davenewt 07-12-2004 06:19 AM

I don't see any error log files within the phpdig directories...?

