Old 06-18-2004, 12:44 PM   #1
Destroyer X
Green Mole
 
Join Date: Jun 2004
Location: Oklahoma, U.S.A.
Posts: 19
PhpDig Ignoring Something in robots.txt

Hi everyone! While configuring PhpDig for my own needs on my Web site, I created a robots.txt file so that PhpDig (and other search engines, for that matter) will ignore certain folders. Here's what my robots.txt file looks like:

# robots.txt for http://www.destroyerx.net/

User-agent: *
Disallow: /cgi-bin
Disallow: /chris
Disallow: /errors
Disallow: /forum
Disallow: /images
Disallow: /poll
Disallow: /search
Disallow: /screenshots
Disallow: /stats
Disallow: /Templates
Disallow: /thumbs
Disallow: /formerror.php
Disallow: /formmail.php
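As a sanity check on the rules themselves, here's a quick sketch using Python's standard-library robots.txt parser (any user-agent name works here; "PhpDig" is just an example, and this only tests standard prefix matching, not how PhpDig itself parses the file):

```python
from urllib.robotparser import RobotFileParser

# A minimal subset of the rules above
rules = """\
User-agent: *
Disallow: /errors
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# "Disallow: /errors" is a prefix rule, so it should cover the whole folder
print(rp.can_fetch("PhpDig", "http://www.destroyerx.net/errors/404.php"))  # False
```

By standard prefix matching, `/errors/404.php` falls under `Disallow: /errors`, so the rule itself looks well-formed.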

Anyway, I ran the spider, and while it ignored almost all the folders and files I specified, it indexed my Error 404 page (a file in a folder I told it not to index). Here's the relevant spider output:

level 1...
2:http://www.destroyerx.net/errors/404.php
(time : 00:00:16)

level 3...
Duplicate of an existing document
15:http://www.destroyerx.net/errors/404.php
(time : 00:01:52)

Duplicate of an existing document
26:http://www.destroyerx.net/errors/404.php
(time : 00:03:09)

etc., etc.

While it didn't index the 400.php, 401.php, 403.php, and 500.php pages in my errors folder, it did index my 404.php error page. Is there something wrong with the syntax of my robots.txt file that would cause it to index that one error page yet skip the others? Thanks everyone for your time. Ciao for now!
__________________
Visit the Destroyer X Network at http://www.destroyerx.net/