Archive for October, 2006

The Other Canned Meat

Back in June (when this site was still running on Blosxom), I disabled comments because of the amount of comment spam I was receiving. Over the course of several years I built a number of anti-spam measures into my Blosxom install, but without a moderation feature, I could only respond to spam that got past my tests by cleaning it up after the fact. Eventually, it was just too much.

When I switched to WordPress in late August, I reenabled comments, because WordPress offers comment moderation. I even added the following note to my comment form:

Hey Spammers! Comments are Moderated, so please save us both some time and move along. No matter how many you send, they’ll never show up. Really.

To the best of my knowledge, not a single spam has made it through, but not for lack of trying. On an average day, I probably moderate 50 or so spam attempts. On an average day, I probably get fewer than one real comment.

Nearly three years ago, Mark Pilgrim posted a warning to those trying to develop anti-spam measures. Some folks criticized him for being too pessimistic, but he was dead on. From that post:

Spammers are smart and determined, and people are numerous and stupid, and spam pays. You can’t make it not pay. Going after their ISPs won’t help; they’ll auto-register somewhere else. (Already happening.) Going after their upstream provider won’t help; they’ll cut deals with the backbone providers and keep going. (Already happening.) Going after them in court won’t help; they’re already living under friendly governments. (Already happening.) You can’t stop them with Turing tests; they’ll hire child workers to read your images and manually register/post/ping/trackback/whatever. (Already happening.) Then they’ll attack you with the power of 100 million owned Windows boxes and knock you off the Internet. (Already happening.) They will keep coming and coming and coming until you give up, go home, cry uncle, take Prozac, get a regular day job to replace the one you quit when being an anti-spammer became your full-time job.

And they have kept coming and coming. I often get several copies of the same spam commented on the same post, each using a different method of encoding links: one with HTML <a> tags, one with wiki-like [brackets], and one with raw link text. Why bother figuring out what kind of comment system you’re targeting when your automated spam-bot can cover every contingency, shotgun-style?

A couple of days ago, I got one that almost slipped past me. The following comment was left on this post about mounting an OS X partition from Ubuntu:

heyas all. my 40 gig drive is going to good use now. I have installed UBUNTU and have ordered KUBUNTU. I dont know how to install the driver for my ati radeon 9600xt. Actually i dont know if i am meant to be downloading and installing XFREE86 or the XORG version of the driver. I am downloading them both but i dont know how to do anything in Linux really. I dont know where I am meant to set up my modem or set up a net account. (no INETWIZ.EXE) So yeah, can someone help me out with getting my ATI driver installed? and does anyone know of a good long PDF file i can read and wrap my brain around. I’m still a Windows user, but I want to use Linux as much as possible. Thanks. :)

At first glance, it looks like a typical cry for free tech support, only tangentially related to the topic at hand, as seen on many a help forum and blog comment feed. If it were legitimate, I’d allow it, even though I expect no one to help. But I’ve learned that not every spam contains a link in the body text, so I always check the link to the commenter’s web site. The site address was http://camera-digital.us, which looks like the kind of hokey “every variation on a theme” URL that spammers and affiliate sites use. Even more curious, the author’s email address was at the digital-camera.com domain: similar, but different.

So I tried a Google search for a line of text from the comment. The first three matches were blog posts with the exact same comment posted. In a couple of cases, the actual comment was showing up in a Recent Comments list on the side of the page, but the original post that drew the comment was actually about Ubuntu.

So what’s the point of this post? None, exactly. Just like Mark’s post above, I don’t have a solution. I only know that the problem is getting worse.

Counting Posts

Daniel at The Web Design Journal emailed me to ask about the post count block that appears in my sidebar. For example:

There are 2 posts in the last 30 days, and 323 total posts.

There are a number of ways to get the total post count. The last 30 days post count is a feature I wanted to add when I switched to WordPress and designed this theme. After searching for an existing solution, I ended up rolling my own. For reasons I don’t recall, I also wrote my own function for getting the total post count. I had meant to post it here, but had forgotten about it.

Instead of designing this as a WordPress plugin, I added it directly to my theme’s templates. In my sidebar template, I added the following:

<p>There are <?php echo get_post_count_inlast(30) ?> posts in the last 30 days, 
    and <?php echo get_post_count() ?> total posts.  
    <a class="sidelink" href="<?php echo get_bloginfo('wpurl')?>/archive/">Archives &raquo;</a></p>

The special bits here are the calls to get_post_count_inlast() and get_post_count(), which are simple PHP functions I wrote to query the WordPress MySQL database. I added these functions to functions.php in the theme directory for my theme. I’m not sure if all WordPress themes include this file, but the default theme, on which I based my theme, already provided it. In this file, I added the code for the two new functions:

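// Total number of published posts.
function get_post_count() {
    global $wpdb;
    // $wpdb->posts holds the posts table name, including any custom table prefix.
    return $wpdb->get_var("select count(*) from $wpdb->posts where post_status = 'publish'");
}

// Number of published posts dated within the last $days days.
function get_post_count_inlast($days = 30) {
    global $wpdb;
    $days = (int) $days; // force an integer before it goes into the SQL string
    return $wpdb->get_var("select count(*) from $wpdb->posts where post_status = 'publish' and DATEDIFF(CURRENT_DATE, post_date) <= $days");
}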

I’m no PHP expert; I’ve picked up just enough to make the tweaks I wanted to my templates. But I believe you could even include the function definitions directly in the sidebar.php file, as long as they’re inside a <?php ... ?> block.
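One detail of get_post_count_inlast() worth spelling out: MySQL's DATEDIFF() compares calendar dates and ignores the time of day, so a post dated exactly 30 days ago still counts, no matter what time it was published. Here is a rough plain-PHP sketch of that same window, which can be tried without a database. The helper name count_posts_in_last(), its $today parameter, and the sample dates are all made up for illustration and are not part of WordPress; it assumes PHP 5.5 or later for DateTimeImmutable.

```php
<?php
// Hypothetical helper (not part of WordPress): counts the date strings that
// satisfy DATEDIFF($today, $post_date) <= $days, comparing dates only.
function count_posts_in_last(array $post_dates, $days = 30, $today = null) {
    $utc = new DateTimeZone('UTC'); // one fixed zone, so day math is unambiguous
    $now = new DateTimeImmutable($today === null ? 'today' : $today, $utc);
    $now = $now->setTime(0, 0, 0);

    $count = 0;
    foreach ($post_dates as $post_date) {
        // Drop the time of day, as DATEDIFF() does.
        $posted = (new DateTimeImmutable($post_date, $utc))->setTime(0, 0, 0);
        // Signed whole-day difference; positive when the post is in the past.
        $diff = (int) $posted->diff($now)->format('%r%a');
        if ($diff <= $days) {
            $count++;
        }
    }
    return $count;
}

// Made-up sample data: day differences from 2006-10-20 are 19, 30, 31 and 80.
$sample_dates = array(
    '2006-10-01 09:15:00',
    '2006-09-20 23:59:00',
    '2006-09-19 00:00:00',
    '2006-08-01 12:00:00',
);

echo count_posts_in_last($sample_dates, 30, '2006-10-20'); // prints 2
```

Like the SQL condition, this also counts anything dated in the future, since a negative difference is still less than or equal to $days.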

Hope someone finds this useful. If you decide to add this to your site, please drop me a comment, and be sure to fill in your site URL.

(The source code in this post is released under the GNU GPL (just like WordPress), and comes with no warranty of any kind, express or implied.)