Thread: robots.txt
12-05-2003, 05:01 AM   #3
renehaentjens
 
I've taken this example from the quoted source, fr. Anonymus. In my opinion it shows that it should be possible (sorry for the lost alignment):

# /robots.txt for http://www.fict.org/
# comments to webmaster@fict.org

User-agent: unhipbot
Disallow: /

User-agent: webcrawler
User-agent: excite
Disallow:

User-agent: *
Disallow: /org/plans.html
Allow: /org/
Allow: /serv
Allow: /~mak
Disallow: /

The following matrix shows which robots are allowed to access URLs:

URL                                     unhipbot  webcrawler-excite  other

http://www.fict.org/                    No        Yes                No
http://www.fict.org/index.html          No        Yes                No
http://www.fict.org/robots.txt          Yes       Yes                Yes
http://www.fict.org/server.html         No        Yes                Yes
http://www.fict.org/services/fast.html  No        Yes                Yes
http://www.fict.org/services/slow.html  No        Yes                Yes
http://www.fict.org/orgo.gif            No        Yes                No
http://www.fict.org/org/about.html      No        Yes                Yes
http://www.fict.org/org/plans.html      No        Yes                No
http://www.fict.org/%7Ejim/jim.html     No        Yes                No
http://www.fict.org/%7Emak/mak.html     No        Yes                Yes
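
For anyone who wants to check the matrix by hand, here is a minimal Python sketch of how I read the rules: a robot uses the first record that names it (falling back to the "*" record), the Allow/Disallow lines of that record are tried in order, and the first prefix that matches the path decides. The names RECORDS, rules_for and allowed are my own, and "somebot" in the usage note is a made-up robot name; this is an illustration under those assumptions, not a complete robots.txt parser.

```python
from urllib.parse import unquote, urlsplit

# Records transcribed by hand from the example: (user-agents, ordered rules).
RECORDS = [
    (["unhipbot"], [("disallow", "/")]),
    (["webcrawler", "excite"], [("disallow", "")]),
    (["*"], [
        ("disallow", "/org/plans.html"),
        ("allow", "/org/"),
        ("allow", "/serv"),
        ("allow", "/~mak"),
        ("disallow", "/"),
    ]),
]


def rules_for(robot):
    """Return the rules of the first record naming this robot, else the '*' record."""
    default = []
    for agents, rules in RECORDS:
        # Case-insensitive substring match on the robot name (an assumption).
        if any(a != "*" and a in robot.lower() for a in agents):
            return rules
        if "*" in agents:
            default = rules
    return default


def allowed(robot, url):
    """Apply the first matching Allow/Disallow prefix; no match means allowed."""
    # Simplification: decode every %xx escape so that %7E compares equal to "~".
    path = unquote(urlsplit(url).path) or "/"
    if path == "/robots.txt":
        return True  # the matrix lists /robots.txt as accessible to every robot
    for kind, prefix in rules_for(robot):
        # An empty prefix (plain "Disallow:") matches nothing and is skipped.
        if prefix and path.startswith(unquote(prefix)):
            return kind == "allow"
    return True
```

Calling allowed("somebot", "http://www.fict.org/org/about.html") gives True, while the same robot on /org/plans.html gives False, because the Disallow line for that page comes before "Allow: /org/". So the order of the lines within a record matters.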
__________________
René Haentjens, Ghent University