Thread: Rogue Bot Rant
09-07-2004, 02:28 PM   #1
Charter
Head Mole
Join Date: May 2003
Posts: 2,539

So here is the first request:

80.181.61.51 - - [07/Sep/2004:04:09:26 -0700] "GET / HTTP/1.1" 200 31891 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

Then, 4 hours, 11 minutes, 13 seconds, 15,447 pages, 16,717 hits, and 374.30 MB later...

Here is the last request:

80.181.61.51 - - [07/Sep/2004:08:20:39 -0700] "GET /forum/showthread.php?t=942&page=3&pp=15 HTTP/1.1" 200 7583 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
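If you want to check numbers like these against your own logs, here is a rough Python sketch. It assumes the Apache combined log format shown in the two lines above; the regular expression and the parse_line helper are my own illustration, not part of PhpDig or any log tool.

```python
import re
from datetime import datetime

# Assumed layout: Apache combined log format, as in the two lines quoted above.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')


def parse_line(line):
    """Return (ip, timestamp, request, status, size) for one log line."""
    m = LOG_RE.match(line)
    if m is None:
        raise ValueError("not a combined-format log line")
    ip, stamp, request, status, size = m.groups()
    # Example stamp: 07/Sep/2004:04:09:26 -0700
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return ip, when, request, int(status), size


first = ('80.181.61.51 - - [07/Sep/2004:04:09:26 -0700] "GET / HTTP/1.1" '
         '200 31891 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"')
last = ('80.181.61.51 - - [07/Sep/2004:08:20:39 -0700] '
        '"GET /forum/showthread.php?t=942&page=3&pp=15 HTTP/1.1" '
        '200 7583 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"')

elapsed = parse_line(last)[1] - parse_line(first)[1]
hits = 16717  # total hits from this IP, taken from the post above

print(elapsed)                                    # 4:11:13
print(round(hits / elapsed.total_seconds(), 2))   # 1.11 hits per second
```

That works out to better than one hit every second, sustained for over four hours.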

There is absolutely no reason the user of this rogue bot had to suck down that much content and bandwidth, not to mention that it never requested a robots.txt file and never identified itself with a bot name in the user-agent string. This mofo, yes mofo, and others that do the same, really tick me off!
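For spotting this kind of visitor in your own logs, something along these lines might help. It is only a sketch: flag_heavy_clients is a made-up helper, the threshold of 1000 hits is an arbitrary assumption, and it assumes the same Apache-style log lines as above.

```python
from collections import Counter


def flag_heavy_clients(lines, threshold=1000):
    """Return IPs with at least `threshold` hits that never asked for /robots.txt.

    `threshold` is an arbitrary cut-off chosen for illustration only.
    """
    hits = Counter()
    asked_robots = set()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        ip = parts[0].split(" ", 1)[0]
        request = parts[1].split()  # e.g. ['GET', '/robots.txt', 'HTTP/1.1']
        hits[ip] += 1
        if len(request) >= 2 and request[1].split("?", 1)[0] == "/robots.txt":
            asked_robots.add(ip)
    return sorted(ip for ip, n in hits.items()
                  if n >= threshold and ip not in asked_robots)
```

Feed it the lines of an access log, for example flag_heavy_clients(open("access.log")), where the file name is just a placeholder. A well-behaved crawler that fetched robots.txt first will not be flagged, no matter how many pages it took.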

With that said, PhpDig is scripted to follow a robots.txt file and allow for a five-second delay when indexing. If you are indexing a site that is not your own, please do not bypass these features, as you are using another person's resources and extra bandwidth can cost people money!

One more thing while I'm at it: some people do not know about PhpDig, so their robots.txt file may not contain a block for it. If you do index a site that is not your own, please be kind and do not index every single page on it. That is simply not necessary to create a decent index.
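To make the courtesies above concrete, here is a minimal Python sketch of a polite fetch loop. This is not PhpDig's code (PhpDig is a PHP script); it only illustrates the same three ideas: honor robots.txt, wait five seconds between requests, and stop after a modest number of pages. The bot name, the 50-page cap, and the polite_crawl and load_rules helpers are all hypothetical.

```python
import time
import urllib.robotparser

USER_AGENT = "ExampleDigBot"  # hypothetical bot name; pick one that identifies you
CRAWL_DELAY = 5.0             # seconds between requests, matching the five-second delay


def load_rules(robots_txt_text):
    """Build a robots.txt rule set from the text of the file."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt_text.splitlines())
    return rules


def polite_crawl(urls, rules, fetch, max_pages=50, sleep=time.sleep):
    """Fetch allowed URLs one at a time, pausing between requests.

    `fetch` is any callable that takes a URL and returns the page;
    `max_pages` (an assumed cap) keeps the crawl from taking the whole site.
    """
    fetched = []
    for url in urls:
        if len(fetched) >= max_pages:
            break
        if not rules.can_fetch(USER_AGENT, url):
            continue  # the site owner asked bots to stay out of this path
        if fetched:
            sleep(CRAWL_DELAY)  # no delay needed before the very first request
        fetched.append(fetch(url))
    return fetched
```

In real use you would download the site's robots.txt first and pass its text to load_rules, and fetch would send USER_AGENT in the request headers so the site owner can see who is visiting.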

Don't run a rogue bot. Take the high road, and use PhpDig kindly and wisely.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension.