index only .shtml


Nosmada
12-23-2003, 10:53 AM
How do you index only .shtml files and skip all other file types, such as .txt, .html, or .htm?

Charter
12-23-2003, 11:08 AM
Hi. Try adding the unwanted file extensions to the FORBIDDEN_EXTENSIONS setting in the config file.
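
For anyone who wants to see the idea in isolation, here is a rough sketch in Python (not PhpDig's own code). It assumes FORBIDDEN_EXTENSIONS holds a regular expression that is matched against the end of each path; check your own config file for the actual value and format. The pattern and the is_forbidden helper below are made up for illustration.

```python
import re

# Hypothetical pattern: assumes the setting is a regex anchored at the
# end of the path. The extensions listed are the ones to skip.
FORBIDDEN = re.compile(r"\.(txt|html|htm)$", re.IGNORECASE)


def is_forbidden(path: str) -> bool:
    """Return True if the path ends in one of the unwanted extensions."""
    return FORBIDDEN.search(path) is not None


if __name__ == "__main__":
    for path in ["/readme.txt", "/index.html", "/index.htm", "/index.shtml"]:
        print(path, "skipped" if is_forbidden(path) else "indexed")
```

Note that under this assumption the list works as an exclusion list: .shtml gets through because "html" only matches when it comes directly after the dot, so each extension to skip has to be named explicitly.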

Nosmada
12-23-2003, 11:27 AM
Thanks Charter. Almost there! I have FrontPage extensions installed and have many folders in many directories that start with an underscore (_folder). What syntax do I use in the robots.txt to exclude all folders that start with an underscore?

Would I use something like this?

Disallow: /_

or maybe

Disallow: /_*

or something else???

Charter
12-23-2003, 04:36 PM
Hi. If you've already crawled directories that start with an underscore, you can delete them from the admin panel by clicking the site, the update button, and then the appropriate red X symbol.

To exclude all folders that start with an underscore, the robots.txt file can look like this (wildcards are not part of the robots.txt standard, so leave the * off):

User-agent: PhpDig
Disallow: /_
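
To check what a rule like this covers, here is a small sketch using Python's standard urllib.robotparser, which applies plain prefix matching to Disallow lines. This is only an illustration of the standard behaviour; PhpDig's own robots.txt handling may differ. The host www.example.com and the allowed helper are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above.
ROBOTS_TXT = """\
User-agent: PhpDig
Disallow: /_
"""


def allowed(path: str, agent: str = "PhpDig") -> bool:
    """Return True if the agent may fetch the path under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # www.example.com is a placeholder host, not a real site setting.
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ["/_private/page.shtml", "/docs/index.shtml",
                 "/docs/_notes/page.shtml"]:
        print(path, "allowed" if allowed(path) else "blocked")
```

With prefix matching, Disallow: /_ only blocks paths that begin with /_ at the site root. An underscore folder inside another directory, such as /docs/_notes/, is not covered and would need its own Disallow line.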