Rogue Bot Rant


Charter
09-07-2004, 03:28 PM
So here is the first request:

80.181.61.51 - - [07/Sep/2004:04:09:26 -0700] "GET / HTTP/1.1" 200 31891 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

And 4 hours, 11 minutes, 13 seconds, 15447 pages, 16717 hits, and 374.30 MB later...

Here is the last request:

80.181.61.51 - - [07/Sep/2004:08:20:39 -0700] "GET /forum/showthread.php?t=942&page=3&pp=15 HTTP/1.1" 200 7583 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"

There is absolutely no reason that the user of this rogue bot had to suck that much content and bandwidth, not to mention that there was no request for a robots.txt file and no bot name was identified. This mofo, yes mofo, and others that do the same, really tick me off!
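
For anyone who wants to check numbers like these against their own logs, here is a rough Python sketch (not part of PhpDig, which is written in PHP). It pulls the timestamps out of the two log lines quoted above and works out the elapsed time and the average hit rate. The hit count is copied from the stats above, and the helper name log_time is made up for illustration.

```python
from datetime import datetime

# The first and last requests, copied from the access log above.
FIRST = ('80.181.61.51 - - [07/Sep/2004:04:09:26 -0700] "GET / HTTP/1.1" '
         '200 31891 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"')
LAST = ('80.181.61.51 - - [07/Sep/2004:08:20:39 -0700] '
        '"GET /forum/showthread.php?t=942&page=3&pp=15 HTTP/1.1" '
        '200 7583 "-" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"')


def log_time(line):
    """Return the timestamp found between [ and ] in a combined-format log line."""
    stamp = line.split("[", 1)[1].split("]", 1)[0]
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


elapsed = log_time(LAST) - log_time(FIRST)
seconds = elapsed.total_seconds()
hits = 16717  # total hits from this IP, as reported above

print(elapsed)                    # 4:11:13
print(round(hits / seconds, 2))   # 1.11 hits per second on average
```

That works out to roughly one request every 0.9 seconds, sustained for more than four hours.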

With that said, PhpDig is scripted to follow a robots.txt file and allow for a five-second delay when indexing. If you are indexing a site that is not your own, please do not bypass these features, as you are using another person's resources and extra bandwidth can cost people money!
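
To show what "follow robots.txt and wait between requests" looks like in practice, here is a minimal Python sketch using the standard library's urllib.robotparser. This is not PhpDig's code, just an illustration of the same two courtesies. The bot name ExampleBot, the example.com URLs, the sample robots.txt, and the function names are all assumptions for the example. The fetch and sleep functions are passed in so the loop can be tried without touching a real site.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleBot"   # hypothetical bot name; a polite bot identifies itself
DEFAULT_DELAY = 5           # seconds between requests, matching the delay described above


def load_rules(robots_txt):
    """Parse the text of a robots.txt file into a rule set."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_crawl(rules, urls, fetch, delay=DEFAULT_DELAY, sleep=time.sleep):
    """Fetch only the URLs robots.txt allows, pausing after each request."""
    fetched = []
    for url in urls:
        if not rules.can_fetch(USER_AGENT, url):
            continue            # the site owner said no, so skip it
        fetch(url)
        fetched.append(url)
        sleep(delay)            # give the server a rest before the next request
    return fetched


# Dry run with a made-up robots.txt and a fetch function that does nothing.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /forum/\n"
SAMPLE_URLS = [
    "http://example.com/",
    "http://example.com/forum/showthread.php?t=942",
]

pauses = []
result = polite_crawl(
    load_rules(SAMPLE_ROBOTS),
    SAMPLE_URLS,
    fetch=lambda url: None,   # stand-in for a real HTTP request
    sleep=pauses.append,      # record the pause instead of actually waiting
)
print(result)   # ['http://example.com/']
print(pauses)   # [5]
```

In a real crawler the robots.txt text would come from the target site itself, and fetch would send the bot name in the User-Agent header so the site owner can see who is visiting.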

Another thing while I'm at it, some people do not know about PhpDig so there may not be a block in their robots.txt file. If you do index a site that is not your own, please be kind and do not index every single page on their site. Such action is simply not necessary to create a decent index.

Don't run a rogue bot. Take the high road, and use PhpDig kindly and wisely.

WebDiva 2.0
09-13-2004, 08:47 PM
That IP address comes back to this, Charter:
host51-61.pool80181.interbusiness.it

Don't know if that helps you though, as this would be a proxy IP.