PhpDig.net

06-18-2004, 12:44 PM   #1
Destroyer X
Green Mole
 
Join Date: Jun 2004
Location: Oklahoma, U.S.A.
Posts: 19
PhpDig Ignoring Something in robots.txt

Hi everyone! As I'm trying to configure PhpDig for my own needs on my Web site, I created a robots.txt so PhpDig, and other search engines for that matter, will ignore certain folders. Here's what my robots.txt file looks like:

# robots.txt for http://www.destroyerx.net/

User-agent: *
Disallow: /cgi-bin
Disallow: /chris
Disallow: /errors
Disallow: /forum
Disallow: /images
Disallow: /poll
Disallow: /search
Disallow: /screenshots
Disallow: /stats
Disallow: /Templates
Disallow: /thumbs
Disallow: /formerror.php
Disallow: /formmail.php

Anyway, I managed to run the spider, and while it ignored almost all the folders and files I specified, it indexed my Error 404 page (a file in a folder I told it not to index). Here's what the spider output says:

level 1...
2:http://www.destroyerx.net/errors/404.php
(time : 00:00:16)

level 3...
Duplicate of an existing document
15:http://www.destroyerx.net/errors/404.php
(time : 00:01:52)

Duplicate of an existing document
26:http://www.destroyerx.net/errors/404.php
(time : 00:03:09)

etc., etc., etc.

While it didn't index the 400.php, 401.php, 403.php, and 500.php pages in my errors folder, it did index my 404.php error page. Is there something wrong with the syntax of my robots.txt that would make it index that error page but not the others? Thanks, everyone, for your time. Ciao for now!
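
One way to sanity-check the robots.txt syntax independently of PhpDig is to feed the same rules to a standard parser and ask whether the 404 page is allowed. The following is a minimal sketch using Python's standard urllib.robotparser module, not PhpDig's own phpdigReadRobotsTxt logic. The helper name is_allowed, the user agent "AnyBot", and the /index.php URL are assumptions made up for illustration; the rules are copied from the file above.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in this post.
ROBOTS_TXT = """\
# robots.txt for http://www.destroyerx.net/

User-agent: *
Disallow: /cgi-bin
Disallow: /chris
Disallow: /errors
Disallow: /forum
Disallow: /images
Disallow: /poll
Disallow: /search
Disallow: /screenshots
Disallow: /stats
Disallow: /Templates
Disallow: /thumbs
Disallow: /formerror.php
Disallow: /formmail.php
"""


def is_allowed(url, user_agent="AnyBot"):
    """Return True if the rules above permit user_agent to fetch url.

    "AnyBot" is a placeholder name; it falls under the "User-agent: *" record.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The page the spider indexed anyway.
    print(is_allowed("http://www.destroyerx.net/errors/404.php"))  # False
    # A hypothetical page outside every disallowed prefix.
    print(is_allowed("http://www.destroyerx.net/index.php"))       # True
```

If a standard parser reports the 404 page as disallowed, the syntax of the file itself is probably fine, and the cause is more likely in how the spider reaches or matches that URL.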
__________________
Visit the Destroyer X Network at http://www.destroyerx.net/
06-18-2004, 01:06 PM   #2
Charter
Head Mole
 
Join Date: May 2003
Posts: 2,539
Hi. Maybe there is a bad link? If not, you can delete the 404.php page from the search in the admin panel. Otherwise, maybe try replacing the phpdigReadRobotsTxt function in robot_functions.php with the function contained in the zip found here.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.
06-18-2004, 01:57 PM   #3
Destroyer X
Green Mole
 
Join Date: Jun 2004
Location: Oklahoma, U.S.A.
Posts: 19
Well, I don't know why it keeps indexing my 404.php page, but I removed my "errors" folder from being indexed. Anyway, thanks for everyone's help.

Ciao for now!