Archive for the 'Blosxom' Category

Note: I've reorganized this site to use tags; the category archive remains to support old links. Only posts prior to April 2006 are categorized.

Fighting Referer Spam with deferer

I’ve been getting swamped by referer spam lately. Most of it is for domains that appear to have had hosting suspended. I had already seen a correlation between this referer spam and comment spam; what I didn’t know (but should have guessed) is that lots of folks are seeing this. Tim Bray wrote about the problem today, pointing to more info from John Sinteur and Ann Elisabeth. Apparently all of these referer URIs resolve to a single webhost, with an IP of 161.58.59.8.

Ann’s post is one of many on her blog about the subject; she is actively pursuing this and trying to get Verio to pull the miscreant’s hosting. I suggest reading everything on her homepage for lots of good info.

John’s post on the WordPress support blog includes PHP code that sends a 301 Moved Permanently redirection header to any request with a referer URI that resolves to the IP above. Where does he redirect them? Back to the referer URI, of course.

Now, that’s an idea I like. I liked it so much I wrote a Perl version as a Blosxom plugin. It’s called deferer, and is available here. Right now, it’s a one-trick pony, but (when time permits) I intend to expand it to a more comprehensive referer spam solution. I don’t know how effective redirecting these requests is; the spider scripts sending the requests may not follow redirects. Even so, deferer reduces server load (it ends the blosxom invocation early) and saves bandwidth.

Update: Deferer has been updated to version 0+2i, to fix a bug that caused a 500 Server Error if the referer hostname could not be resolved.
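The check described above can be sketched roughly as follows. This is a Python illustration of the idea, not the plugin's actual Perl code: the names `referer_redirect`, `resolve`, and `BLOCKED_IPS` are hypothetical, and only the IP address and the "301 back to the referer" behaviour come from the post. The sketch assumes that a referer whose hostname cannot be resolved should simply be let through, which is the failure case the 0+2i update addresses.

```python
import socket
from urllib.parse import urlsplit

# IP address named in the post; the set itself is an illustrative assumption.
BLOCKED_IPS = {"161.58.59.8"}


def referer_redirect(referer, resolve=socket.gethostbyname, blocked=BLOCKED_IPS):
    """Return (status, headers) for a spam referer, or None to serve the page normally.

    `resolve` is injectable so the lookup can be stubbed out in tests.
    """
    if not referer:
        return None
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        # Malformed referer URI: nothing to resolve, so serve the page.
        return None
    if not host:
        return None
    try:
        ip = resolve(host)
    except OSError:
        # Hostname could not be resolved: let the request through
        # instead of failing with a server error.
        return None
    if ip in blocked:
        # Send the requester back where it claims to have come from.
        return ("301 Moved Permanently", [("Location", referer)])
    return None
```

A caller would run this before generating any page content and, when it returns a status and headers, emit those and stop early, which is where the savings in server load and bandwidth would come from.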

Moreentries Plugin Update

Steve Schwartz has created an updated version of my moreentries plugin that adds a series of links for each additional page of posts, like at the bottom of the page on Google. It supports both text and image links. I think this is just fantastic… this is a feature I’ve had requested, and just never got around to implementing. Go check it out!

Storystate 0+2i

Version 0+2i of my Storystate plugin is available here. For full details on the uses for this plugin, see the original writeup. This version adds a single new variable, $storystate::rqst_category. This is similar in use to the standard $path variable, except that it’s built from the request URL and not the path of a given post, so it’s available in the header flavour file. This allows you to display the current category in directory style (e.g., WebDev/Blosxom/Plugins) in the header. Since it’s really only useful on category index pages, you can combine it with Rael’s interpolate_fancy plugin to do something like:

<?$storystate::category>
    <h2>Viewing Post Category: $storystate::rqst_category</h2>
</?>

For display purposes, the breadcrumbs plugin is probably a better choice. I added this primarily for use with per-category RSS feed autodiscovery (link to come).
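To illustrate what "built from the request URL" might look like, here is a small Python sketch. It is not the plugin's Perl code: the function name `request_category` and the rule of dropping a trailing file component such as `index.html` are assumptions made for illustration, and date-based archive paths are not handled.

```python
def request_category(path_info):
    """Derive a directory-style category string from a request path.

    Assumes the path is a category index request; a final component
    containing a dot (index.html, index.rss, ...) is treated as a file name.
    """
    parts = [p for p in path_info.split("/") if p]
    if parts and "." in parts[-1]:
        parts.pop()  # drop the trailing file component
    return "/".join(parts)


print(request_category("/WebDev/Blosxom/Plugins/index.html"))
```

Because the value depends only on the request, it can be computed before any post is read, which is why a variable like this is usable in the header.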

The Future is Now

In a move calculated to occupy all my free time, and the free time of Blosxom aficionados everywhere, Rael Dornfest today announced the availability of Blosxom 3.0+1i (aka 3.0 alpha). As Rael put it:

It’s been massively refactored, all but rewritten, object-oriented, and usable as a CGI script, module, or indeed subclassed. Oh, and I’m afraid it’s grown a bit, now weighing in at a massive 15K (slightly less, actually) ;-)

Once I’ve had a chance to play with it, I’ll post some thoughts.

Tweaking the Robot Tweak

Recently, I updated my Blosxom template to provide <meta name="robots" content="..." /> tags on each of my pages. The idea is to set the content to "index,follow" for permalinks and to "noindex,follow" for all other pages (i.e., index pages) to prevent category and date archives from being returned by search engines.

It has been bothering me for several days that the side effect of this change is that my blog homepage, http://jclark.org/weblog/, is now marked noindex. This is particularly an issue for me since I normally use the blog homepage URL when posting on other sites, etc. Tonight a comment to that post from Lou Quillio got me to do some checking. It appears that Google searches for my name that used to return the blog homepage in the top 10 hits no longer do so.

To undo this damage, I decided to serve "index,follow" for the blog homepage as well as for permalinks. To make this change, I’ve updated my head.html template:

<head>
  <!-- other head stuff like title omitted for brevity -->
  <?$storystate::blogroot>
    <meta name="robots" content="index,follow" />
  </?>
  <?$storystate::permalink>
    <meta name="robots" content="index,follow" />
  </?>
  <?$storystate::archive>
    <meta name="robots" content="noindex,follow" />
  </?>
  <?$storystate::category>
    <meta name="robots" content="noindex,follow" />
  </?>
</head>

As before, this requires Rael Dornfest’s interpolate_fancy plugin and my own storystate plugin. It’s a bit cumbersome, but it works. Now I just have to wait and see if it fixes my Google juice.
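The policy encoded by the four template blocks can be restated compactly. The following Python sketch is only an illustration of that rule, not part of the Blosxom setup; the function names and the use of plain strings for page types are assumptions, with the type names borrowed from the storystate variables in the template.

```python
# Page types that search engines should index; everything else is noindex.
INDEXABLE = {"blogroot", "permalink"}


def robots_content(page_type):
    """Return the robots directive for a page type such as 'permalink' or 'archive'."""
    return "index,follow" if page_type in INDEXABLE else "noindex,follow"


def robots_meta(page_type):
    """Render the meta tag the template would emit for this page type."""
    return f'<meta name="robots" content="{robots_content(page_type)}" />'


for kind in ("blogroot", "permalink", "archive", "category"):
    print(kind, robots_meta(kind))
```

Every page type keeps "follow", so links on archive and category pages are still crawled even though the pages themselves stay out of search results.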

Update: Re-enabling the indexing of jclark.org/weblog seems to have done the trick. It’s once again the number two result for “Jason Clark” and the number three hit for “jclark”.