I’ve been getting swamped by referer spam lately. Most of it is for domains that appear to have had hosting suspended. I had already seen a correlation between this referer spam and comment spam; what I didn’t know (but should have guessed) is that lots of folks are seeing this. Tim Bray wrote about the problem today, pointing to more info from John Sinteur and Ann Elisabeth. Apparently all of these referer URIs resolve to a single webhost, with an IP of 161.58.59.8.
Ann’s post is one of many on her blog about the subject; she is actively pursuing this and trying to get Verio to pull the miscreant’s hosting. I suggest reading everything on her homepage for lots of good info.
John’s post on the WordPress support blog includes PHP code that sends a 301 Moved Permanently redirection header to any request with a referer URI that resolves to the IP above. Where does he redirect them? Back to the referer URI, of course.
Now, that’s an idea I like. I liked it so much I wrote a Perl version as a Blosxom plugin. It’s called deferer, and is available here. Right now, it’s a one-trick pony, but (when time permits) I intend to expand it to a more comprehensive referer spam solution. I don’t know how effective redirecting these requests is: the spider scripts sending the requests may not follow redirects. Even so, deferer reduces server load (it ends the blosxom invocation early) and saves bandwidth.
Update: Deferer has been updated to version 0+2i, to fix a bug that caused a 500 Server Error if the referer hostname could not be resolved.
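As a rough illustration of the redirect idea (this is a sketch, not the actual deferer source or John's PHP), the check could look something like the following. The function names and the injectable resolver are hypothetical; the only details taken from the post are the IP address, the 301 status, and the fact that a hostname that fails to resolve must not cause an error.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# The single webhost IP mentioned in the post.
my $SPAM_IP = '161.58.59.8';

# Default resolver: hostname -> dotted-quad string, or undef if lookup fails.
sub resolve_host {
    my ($host) = @_;
    my $packed = inet_aton($host);
    return defined $packed ? inet_ntoa($packed) : undef;
}

# Pull the hostname out of a referer URI; undef if it doesn't look like http(s).
sub referer_host {
    my ($referer) = @_;
    return undef unless defined $referer;
    return $referer =~ m{^https?://([^/:?#]+)}i ? lc $1 : undef;
}

# Returns CGI headers for a 301 back to the referer, or undef to carry on
# serving the page normally. $resolver is optional (hypothetical hook that
# makes the function easy to exercise without DNS).
sub deferer_response {
    my ($referer, $resolver) = @_;
    $resolver ||= \&resolve_host;
    my $host = referer_host($referer);
    return undef unless defined $host;
    my $ip = $resolver->($host);
    # An unresolvable hostname is simply treated as "not spam".
    return undef unless defined $ip && $ip eq $SPAM_IP;
    return "Status: 301 Moved Permanently\r\nLocation: $referer\r\n\r\n";
}

# Typical use at the top of a CGI request:
if (defined(my $headers = deferer_response($ENV{HTTP_REFERER}))) {
    print $headers;
    exit;
}
```

Returning early like this is what saves the work: nothing else in the request needs to run once the headers are sent.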
Steve Schwartz has created an updated version of my moreentries plugin that adds a series of links for each additional page of posts, like at the bottom of the page on Google. It supports both text and image links. I think this is just fantastic… this is a feature I’ve had requested, and just never got around to implementing. Go check it out!
Version 0+2i of my Storystate plugin is available here. For full details on the uses for this plugin, see the original writeup. This version adds a single new variable, $storystate::rqst_category. This is similar in use to the standard $path variable, except that it’s built from the request URL and not the path of a given post, so it’s available in the header flavour file. This allows you to display the current category in directory style (i.e., WebDev/Blosxom/Plugins) in the header. Since it’s really only useful on category index pages, you can mix it with Rael’s interpolate_fancy plugin to do something like:
<?$storystate::category>
<h2>Viewing Post Category: $storystate::rqst_category</h2>
</?>
For display purposes, the breadcrumbs plugin is probably a better choice. I added this primarily for use with per-category RSS feed autodiscovery (link to come).
In a move calculated to occupy all my free time, and the free time of Blosxom aficionados everywhere, Rael Dornfest today announced the availability of Blosxom 3.0+1i (aka 3.0 alpha). As Rael put it:
It’s been massively refactored, all but rewritten, object-oriented, and usable as a CGI script, module, or indeed subclassed. Oh, and I’m afraid it’s grown a bit, now weighing in at a massive 15K (slightly less, actually) ;-)
Once I’ve had a chance to play with it, I’ll post some thoughts.
Recently, I updated my Blosxom template to provide <meta name="robots" content="..." /> tags on each of my pages. The idea is to set the content to "index,follow" for permalinks and to "noindex,follow" for all other pages (i.e., index pages) to prevent category and date archives from being returned by search engines.
It has been bothering me for several days that the side effect of this change is that my blog homepage, http://jclark.org/weblog/ is now marked noindex. This is particularly an issue for me since I normally use the blog homepage URL when posting on other sites, etc. Tonight a comment to that post from Lou Quillio got me to do some checking. It appears that Google searches for my name that used to return the blog homepage in the top 10 hits no longer do so.
To undo this damage, I decided to serve "index,follow" for the blog homepage as well as for permalinks. That meant updating my head.html template:
<head>
<!-- other head stuff like title omitted for brevity -->
<?$storystate::blogroot>
<meta name="robots" content="index,follow" />
</?>
<?$storystate::permalink>
<meta name="robots" content="index,follow" />
</?>
<?$storystate::archive>
<meta name="robots" content="noindex,follow" />
</?>
<?$storystate::category>
<meta name="robots" content="noindex,follow" />
</?>
</head>
As before, this requires Rael Dornfest’s interpolate_fancy plugin and my own storystate plugin. It’s a bit cumbersome, but it works. Now I just have to wait and see if it fixes my Google juice.
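To make the decision table in that template explicit, here is a small Perl sketch of the same logic outside of any template. It does not use storystate; the path rules in page_state() are assumptions made for illustration (a permalink names a story file with a flavour extension, a date archive starts with a four-digit year, and an empty path or a bare index.* is the blog root), and both function names are hypothetical.

```perl
use strict;
use warnings;

# Classify a Blosxom-style request path into one of four states.
# These rules are illustrative assumptions, not storystate's own logic.
sub page_state {
    my ($path_info) = @_;
    $path_info = '' unless defined $path_info;
    $path_info =~ s{^/+}{};
    return 'blogroot'  if $path_info eq '' or $path_info =~ m{^index\.\w+\z};
    return 'archive'   if $path_info =~ m{^\d{4}(?:/|\z)};
    return 'permalink' if $path_info =~ m{(?:^|/)(?!index\.)[^/]+\.\w+\z};
    return 'category';
}

# Homepage and permalinks are indexed; archives and categories are not.
sub robots_content {
    my ($path_info) = @_;
    my $state = page_state($path_info);
    return ($state eq 'blogroot' || $state eq 'permalink')
        ? 'index,follow'
        : 'noindex,follow';
}

# Print the tag each kind of page would get.
printf qq{%-32s <meta name="robots" content="%s" />\n}, $_, robots_content($_)
    for '/', '/WebDev/Blosxom/', '/2004/03/', '/WebDev/Blosxom/robots.html';
```

Collapsing the four template branches into two outcomes also makes it obvious that the homepage case is the only difference from the earlier, permalink-only version.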
Update: Re-enabling the indexing of jclark.org/weblog seems to have done the trick. It’s once again the number 2 result for “Jason Clark” and the number three hit for “jclark”.
Dugh mentioned in the comments of my post on using the “robots” meta tag with Blosxom that he was having some trouble getting everything working with interpolate_fancy, and asked for my template. Since my comments system is still nearly 100% feature-free, I decided to just create a new post.
The goal is to help search engine robots index only permalinks (individual posts), not pages containing multiple posts such as date or category archives (or your main page, since the content changes). To do this, we need to add a <meta /> tag to the <head> of each page. For index pages, we need the following:
<meta name="robots" content="noindex,follow" />
and for individual posts, we need this:
<meta name="robots" content="index,follow" />
My solution requires two plugins for Blosxom: Rael’s interpolate_fancy plugin and my own storystate plugin. A word of warning – if you aren’t already using interpolate_fancy, this isn’t a simple drop-in – you’ll have to change all of your templates. For more info, see the interpolate_fancy docs. My storystate plugin simply provides a number of additional variables denoting the state of the current story, for use by interpolate_fancy’s conditional tags. For this application, we need $storystate::permalink, which is true if the current page represents a single post, and undef otherwise. Here’s the relevant section of my head.html flavour template:
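<?$storystate::permalink>
<meta name="robots" content="index,follow" />
</?>
<?!$storystate::permalink>
<meta name="robots" content="noindex,follow" />
</?>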
That’s all I did. I haven’t seen much change on Google yet, but it’s only been in place for 3 days. Hopefully, as Google reindexes more of my site, all of my index pages will drop off, leaving only permalinks.
Update: The change shown here has a nasty side effect of no longer indexing the blog’s homepage; see Tweaking the Robot Tweak for an improved version that fixes this bug.
As I’ve mentioned previously, I’ve recently started using Markdown to format my blog entries. After using it for a few posts, I honestly think it’s lowered the “friction” it takes to compose an entry. When I first saw it, my immediate reaction was, “Why? We already have Textile.” However, Textile is a tool to “mark up” plain text to be formatted as (x)html, where Markdown is a tool to render plain text as (x)html. In other words, Markdown strives to remove the need to insert markup; instead it uses existing plain text idioms, especially from email. It doesn’t have as many features as Textile, but it has all the features I use. It also handles certain things much more easily; I’m not sure I ever figured out how to mark up multi-paragraph blockquotes in Textile.
The Blosxom interface in the current beta releases (1-3) applies Markdown formatting to every post. In my previous post on Markdown, I offered an updated version that used the Blosxom meta plugin and a “meta-markup” header to enable Markdown on a per-post basis. John Gruber, author of Markdown, stated on the Markdown mailing list that he’d rather make this behavior optional. I agreed, and have produced version 2 of my Markdown-Blosxom patch:
Update 3-25-04: John Gruber has released Beta 4 of Markdown, which incorporates (and improves) these changes. The code patch below is now deprecated; go grab a new (post-beta-3) copy of Markdown.
#### Blosxom plug-in interface ##########################################
# Change $blosxom_always to 0 to use "meta-markup: markdown" story
# headers to enable Markdown on a per-story basis
my $blosxom_always = 1;
# Don't change; auto-detected in filter()
my $blosxom_hasmeta;
sub start { 1; }
sub filter {
    # The meta plugin counts as present if its start() sub has been loaded
    $blosxom_hasmeta = defined(&meta::start);
    1;
}
sub story {
    my($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
    if ($blosxom_always or
        ($blosxom_hasmeta and
         defined($meta::markup) and
         $meta::markup =~ /^\s*markdown\s*$/i)) {
        $$body_ref = Markdown($$body_ref);
    }
    1;
}
Notes:
* Install is as before. Replace the existing Blosxom plug-in interface with the code above; remember to rename Markdown.pl to Markdown when you put it in your plugins directory.
* By default, it will always apply Markdown (as it was originally). Even if the meta plugin is present, we can’t assume the user wants to use it to control Markdown. If you want to use meta-markup, set $blosxom_always to 0.
* I check for the presence of meta instead of assuming it’s available. If $blosxom_always is 0 but the meta plugin is not available, Markdown will never be invoked.
* The check for meta occurs in the Blosxom interface’s filter() method. start() is a poor choice because Markdown could be loaded before meta (and will be by default, since load order is alphabetical). filter() runs for every plugin after all plugins are loaded.
Updates:
* The meta-markup check is now a case-insensitive regex.
* Markdown processing is no longer applied to the story title. This was wrapping the title in <p> tags, which is not appropriate in places like RSS feeds.
Second Update:
* Astute reader ArC points out that if you are using meta but a post has no meta-markup in the header, the check of $meta::markup would cause perl to complain. I’ve added a defined($meta::markup) test.
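For anyone unsure what a “meta-markup” header looks like in practice, here is a standalone Perl sketch that reads one from a story and applies the same case-insensitive test as the patch. It is only an illustration: parse_story() and wants_markdown() are hypothetical names, and the parsing rule (meta-name: value lines directly after the title) is an assumption about how the meta plugin treats a story file, which may differ in detail.

```perl
use strict;
use warnings;

# Split a story into title, meta headers, and body. Meta headers are
# assumed to be "meta-name: value" lines immediately after the title.
sub parse_story {
    my ($text) = @_;
    my ($title, @lines) = split /\n/, $text, -1;
    my %meta;
    while (@lines and $lines[0] =~ /^meta-(\w+):\s*(.*?)\s*$/) {
        $meta{$1} = $2;
        shift @lines;
    }
    return ($title, \%meta, join("\n", @lines));
}

# Same decision as the patch: always on, or switched on per story.
sub wants_markdown {
    my ($always, $meta) = @_;
    return 1 if $always;
    return (defined $meta->{markup}
            and $meta->{markup} =~ /^\s*markdown\s*$/i) ? 1 : 0;
}

my $story = "A post title\nmeta-markup: Markdown\n\nSome *emphasised* text.\n";
my ($title, $meta, $body) = parse_story($story);
print wants_markdown(0, $meta) ? "render with Markdown\n" : "leave as-is\n";
```

A story with no meta-markup line simply leaves $meta->{markup} undefined, which is the case the defined() test guards against.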
John Gruber has released Markdown, a plain text to (X)HTML language and tool. It is similar in function to Textile, which I’ve been using since I started this blog. Markdown’s formatting is inspired by plaintext e-mail formatting, and has the added advantage that Markdown-encoded text is, basically, legible; even more so than Textile-encoded text. To really see this in action, look at the Markdown-encoded version of the Markdown home page. This really sold me on the idea of using Markdown here on the blog. At some point I intend to do a rigorous, feature-by-feature comparison of the two; for now I just want to play with Markdown.
Markdown (the tool) is implemented in Perl, and is both a command-line tool and a Movable Type plugin in a single file. I had intended to write a Blosxom plugin for Markdown; however, Markdown.pl is also a Blosxom plugin! Very nice. One caveat – you must rename the file to Markdown in order for Blosxom to recognize it as a plugin.
As written, Markdown-as-Blosxom-plugin processes every entry as Markdown-encoded text. This would require me to convert all of my existing entries, which I’m not looking to do. Instead, I modified Markdown.pl slightly to work like the Blosxom Textile plugin: if the Blosxom story header includes meta-markup: markdown, then Markdown is invoked to process the story text (requires the meta plugin). The modified story() looks like this:
sub story {
    my($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
    if ($meta::markup eq 'markdown') {
        $$title_ref = Markdown($$title_ref);
        $$body_ref = Markdown($$body_ref);
    }
    1;
}
So how does it work? You’re soaking in it. This entry is written in Markdown. So far, so good. I’ll post again later on my Markdown-vs-Textile impressions.
Update: The code above is obsolete. A much more robust version is here.
Blosxom has a catchphrase, “The Zen of Blogging.” I’m not feeling very Zen-like at the moment.
Tonight, I helped a friend set up a new Blosxom install. He has been using Greymatter, but he’s had a few problems with it. I’ve been trying to convince him to use Blosxom, so he decided to give it a whirl. Installed the basic blosxom.cgi.
Now in his case, a bare-bones install is pretty close to exactly what he needs. He displays his blog inside a scrolling IFrame as part of a larger site design, so he doesn’t want any fancy flavours/themes. The default .html flavour would work well for him, except he needed multiple author attribution. He also wanted an easy way to post over the web, and he wanted to avoid writing direct HTML.
I assured him this was simple. Just add the author plugin for multiple authors; that requires the meta plugin, but so does wikieditish, which will let you post over the web. And grab Textile2 so you don’t have to write HTML in your posts.
Unfortunately, it didn’t go as smoothly as one would hope. I’ve been using Blosxom for some time, and knew the steps. I’ve also been working on a new project using Blosxom lately, so I figured I was up to speed. Yikes. After an hour and a half, it’s sort of working. Textile is still busted, and I don’t know why. Got authors working after a half hour of problems; in the end it seems I’d forgotten that meta has to load before authors, so meta has to be renamed. This isn’t really an onerous requirement, and it’s one that I knew of, but it wasn’t actually documented in any of the stuff we downloaded. What a pain.
Also ran into the wikieditish date-preservation bug, which is still present (Note to self: keep blogging bug fixes… never know when you’ll need them again). Also, the default .html flavour (that no one uses) is kinda buggy. I ended up making an external version of it, using the 1993 flavour from the flavour sampler. Still need to send him some improved RSS flavour files; the defaults aren’t that great. I’m sure there were a couple of other gotchas, but I’m tired now and not remembering everything.
All in all, Blosxom is a great, highly extensible tool with a great community around it. But the overall ‘new user experience’ is still too… Zenless. I hope we can improve that.
I’ve created a new plugin for Blosxom, called moreentries. It creates ‘Next’ and ‘Previous’ links when there are more entries than allowed on a page (as determined by the Blosxom config variable $num_entries). This doesn’t affect date-style urls, since Blosxom ignores $num_entries for date urls.
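The paging arithmetic behind ‘Next’ and ‘Previous’ links can be sketched in a few lines of Perl. This is an illustration of the general idea only, not the moreentries source; paginate() and its hash keys are hypothetical, and $num_entries simply stands in for the Blosxom config variable of the same name.

```perl
use strict;
use warnings;

# Return the entries for one page plus the neighbouring page numbers
# (undef when there is no such page). Pages are numbered from 1.
sub paginate {
    my ($entries, $num_entries, $page) = @_;
    my $total = @$entries;
    my $pages = $total ? int(($total + $num_entries - 1) / $num_entries) : 1;
    $page = 1      if $page < 1;
    $page = $pages if $page > $pages;
    my $first = ($page - 1) * $num_entries;
    my $last  = $first + $num_entries - 1;
    $last = $total - 1 if $last > $total - 1;
    return {
        entries  => [ @{$entries}[ $first .. $last ] ],
        previous => $page > 1      ? $page - 1 : undef,
        next     => $page < $pages ? $page + 1 : undef,
    };
}

my @posts = map { "post$_" } 1 .. 7;
my $p = paginate(\@posts, 3, 2);
print "showing: @{ $p->{entries} }\n";
print "previous page: $p->{previous}\n" if defined $p->{previous};
print "next page: $p->{next}\n"         if defined $p->{next};
```

With seven posts and three per page, page 2 holds post4 through post6 and has both a previous and a next page; the first and last pages each get only one link.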
Please note that this is the first version. It’s been tested (with thanks to Fletcher T. Penney), but it may have bugs. In particular, it probably won’t work right with static rendering, since that would require additional pages to be created. I’ll look into this if there is interest.
You can see it in action here on this site.
Download link: moreentries
Update: Fixed download link.
Update #2: Fixed the other download link (in the first paragraph). D’oh!
Update #3: As I stated in the perldocs, this could be done much more efficiently if blosxom.cgi were updated. Lars has suggested an elegant way of doing just that. If you’d like to patch your blosxom.cgi, check it out. If Rael releases an official update to blosxom.cgi to include such a change, I’ll release a new version of the plugin to take advantage of it. This version will remain as well, for those using the current Blosxom version.