First Bot Ban
One nice side effect of the recent spam attack I suffered is that it got me poking around in my logs and stats. My hosting provider iPowerWeb offers stats via awstats, which isn’t the greatest but it works. While looking at my stats, I noticed My Most Frequent Visitor had over 3800 page requests this month, while the #2 visitor had fewer than 500. My Most Frequent Visitor had also sucked down over 90 Meg, while #2 had only around 15 Meg. I became quite interested in My Most Frequent Visitor.
MMFV was identified only by an IP address – 38.144.36.16. Wonder who that is? :
% host 38.144.36.16
16.36.144.38.in-addr.arpa domain name pointer news.allresearch.com
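For anyone who would rather script that lookup, here is a minimal Python sketch. The `ipaddress` module builds the in-addr.arpa name that `host` queried, and `socket.gethostbyaddr` performs the actual reverse lookup. The function names are made up for illustration, and the lookup depends on a working resolver, so what it returns today may differ from the output above.

```python
import ipaddress
import socket


def reverse_name(ip):
    """Return the in-addr.arpa name that a PTR lookup for `ip` queries."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the hostname for `ip`, or None if no PTR record resolves."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return None
```

Calling `reverse_name("38.144.36.16")` gives the same reversed name that appears in the `host` output, and `reverse_lookup("38.144.36.16")` asks the resolver for the pointer record.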
Plugging news.allresearch.com into the browser yielded a refused connection, so I tried www.allresearch.com. Bingo. From the home page:
AllResearch, Inc. was founded in 1998 to provide research, media analysis, and strategic intelligence services for a variety of different markets.
We offer a broad range of products and services to assist various entities with gathering relevant intelligence from the online world. Utilizing cutting-edge proprietary technology, we are able to view and understand the online world in ways never before possible.
Huh. It seems that slogging through my bandwidth at 7 times the rate of any other visitor is a proprietary and cutting-edge technology. Who knew? While the marketroid-speak above isn’t perfectly clear, the menu of services certainly brings things into focus, with such items as Webclipping, TrademarkTracker, Online Peer Group Analysis, and Law Enforcement. I’m being stalked by The Man! (and I’m not the only one.)
But why is The Man (aka My Most Frequent Visitor) visiting so much more frequently than everyone else? A grep or two through my access logs reveals all. It seems that once an hour, The Man pulls my RSS feed. Okay, no problem. But then, The Man pulls every one of the posts in my feed. On the one hand, this is stupid because my feed is full content. On the other hand, this is really stupid, wasteful, and hateful because The Man requests the full content of all 10 posts in the feed every hour! Even when the feed hasn’t changed, The Man is re-reading all 10 posts. The Man must have The Bot, even though The Man’s user agent string is "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)". While I bet The Man probably does use IE, I doubt he’s using it once an hour to pull all my posts by hand. Bad, Sneaky The Man!
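The grep work above could also be done with a short script. This is only a sketch: it assumes the access log is in Apache's Combined Log Format, and the regular expression and function name are invented for illustration rather than taken from any tool mentioned here. It groups one visitor's requests by hour, which makes a feed-then-every-post pattern easy to spot.

```python
import re
from collections import Counter, defaultdict

# Assumed Combined Log Format:
# ip ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def requests_by_hour(lines, ip):
    """Map each hour ('dd/Mon/yyyy:HH') to a Counter of the paths `ip` requested."""
    hours = defaultdict(Counter)
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None or m.group("ip") != ip:
            continue  # skip unparseable lines and other visitors
        hour = m.group("time")[:14]  # keep only 'dd/Mon/yyyy:HH'
        hours[hour][m.group("path")] += 1
    return hours
```

Passing an open log file and "38.144.36.16" to `requests_by_hour` returns one entry per hour. If every hour shows the feed URL plus the same set of post URLs, that is the behavior described above.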
Far be it from me to criticize The Man, so here ends my tale. On a completely unrelated note, check out the newest addition to my .htaccess file:
RewriteCond %{REMOTE_ADDR} "^38\.144\.36\.16$"
RewriteRule .* - [F,L]
Interestingly, I seem to be seeing a “403 Forbidden” in my logs now, once an hour, every hour, like clockwork.
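A footnote on that RewriteCond: its pattern is a regular expression, so a bare `.` matches any character while `\.` matches only a literal dot. Here is a small Python sketch of the difference, assuming Python's `re` module behaves like Apache's regex engine for a pattern this simple. The function and variable names are hypothetical.

```python
import re


def ip_pattern(ip):
    """Build an anchored regex that matches exactly `ip`, with the dots escaped."""
    return re.compile("^" + re.escape(ip) + "$")


blocked = ip_pattern("38.144.36.16")
loose = re.compile("^38.144.36.16$")  # unescaped dots, for comparison only

print(bool(blocked.match("38.144.36.16")))  # the exact address
print(bool(blocked.match("38x144x36x16")))  # not an address at all
print(bool(loose.match("38x144x36x16")))    # same string against the loose pattern
```

Since REMOTE_ADDR only ever holds a real address, the loose form is unlikely to block the wrong visitor in practice, but the escaped form says exactly what is meant.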