
it just doesn't want to spider...


manute
04-26-2004, 05:18 AM
hi!

i just installed phpdig for a new website. unfortunately it doesn't want to spider anything at all. the url is http://www.fussball24.de
it always says: "no links found" and "has just been indexed". any ideas?

manute
04-26-2004, 05:48 AM
there's something that i should probably add:
i took the line set_time_limit(86400); // 1 full day out of spider.php because it produced an error (safe mode is on).
could that be the reason?
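for reference, instead of deleting the line completely, i think something like this would just skip the call when safe mode is on (only a sketch i put together, not the original phpdig code):

if (!ini_get('safe_mode')) {
    set_time_limit(86400); // 1 full day, as in the original spider.php line
}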

vinyl-junkie
04-26-2004, 06:11 AM
A few possibilities come to mind. Do you have anything in your robots.txt that would prevent your site from being spidered?
What did you choose for a default search depth? If it was zero, phpDig may have just spidered the root and nothing else.
What is LIMIT_DAYS set to in your config.php file, and when is the last time you spidered your site? It's possible that phpDig won't let you re-spider because the requisite number of days have not passed since you last spidered.
If none of the above applies, consider posting your spider log, as there might be something there that would give us a clue.
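For example, if memory serves, the relevant line in config.php looks something like the following; a value of 0 would let you re-spider right away (the 0 is just for illustration, not a recommendation):

define('LIMIT_DAYS',0); // days that must pass before a site can be re-spidered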

manute
04-26-2004, 06:24 AM
no, it's none of that. there must be another reason. where can i find that log?
concerning the safe mode: i just installed phpdig on another server with safemode=off, so that's not the reason either.

vinyl-junkie
04-26-2004, 06:33 AM
Your spider log is the output that gets displayed when you try to spider your site. Sometimes it helps to take a look at that. Then again, the problem could be something completely different.

manute
04-26-2004, 07:01 AM
ah, that's what you mean:

SITE : http://www.fussball24.de/
Exclude paths :
-
- @NONE@
No link in the temporary table

--------------------------------------------------------------------------------

Links found : 0
...Has just been indexed
Optimizing tables...
Indexing complete!

i tried about 20 times, something's really wrong with it... :(

Charter
04-26-2004, 07:34 AM
Hi. Try applying the code change in this (http://www.phpdig.net/showthread.php?postid=3022#post3022) post and fix your robots.txt file so that it reads:

User-agent: *
Disallow: /go.php

instead of just:

Disallow: /go.php

Also, PhpDig is set to fully function when safe_mode is off.

manute
04-26-2004, 07:46 AM
yeah! thanx charter, it works now. i only added the line in the robots.txt. great!

acidelic
04-26-2004, 10:29 AM
I'm encountering the same problem, but creating a robots.txt file does not seem to help.

Here is my robots.txt:

User-agent: *
Disallow: /include/

Telling it to spider localhost or the hostname of the local server (depth=5) always results in this:

Spidering in progress...

--------------------------------------------------------------------------------
SITE : http://localhost/
Exclude paths :
- include/

There are no CPU cycles being used and the page has fully loaded. What else can I check? How will I know that the spider process is actually doing something?
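One thing I thought I might try in the meantime is watching the phpDig tables while the spider runs, to see whether any links get queued. Something like this rough sketch (the database name, login, and the phpdig_tempspider table name are only my guesses at the defaults; check config.php for the real values):

$db = mysql_connect('localhost', 'user', 'password');          // placeholder credentials
mysql_select_db('phpdig', $db);                                 // placeholder database name
$res = mysql_query('SELECT COUNT(*) FROM phpdig_tempspider');   // temp links table, name assumed
echo mysql_result($res, 0) . " links currently queued\n";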

Thanks