Hyphenated Forms of Procrastination

  1. Wool-gathering
  2. Yak-shaving
  3. Wikipedia-surfing
  4. Hand-wringing
  5. Blog-posting

Note: The pedantic reader may note that some of these phrases do not traditionally contain a hyphen. The pedantic reader is asked to note this quietly to him- or herself.

The Permalink Problem

The conversion from Blosxom to WordPress began (in my head, at least) over a year ago, when I began to consider switching from categories to tags as a way to reduce the “friction” of writing. Blosxom is good at many things, but its filesystem-based storage isn’t well suited for tagging. After exploring several possibilities, I decided I’d move away from Blosxom; eventually I settled on WordPress.

The biggest hurdle I faced was dealing with permalinks. Blosxom supports two styles of permalinks: date-based and category-based. I decided to go with category-style permalinks (e.g., /weblog/Apple/macbook.html) when I first started using Blosxom because the URIs are “hackable”; you can whack the end (filename) off the URI and you’ve got the URI for the category. This worked fine when the site was category-based, but fails with tags. For this reason, I’ve configured WordPress to use date-based URIs (e.g., /weblog/2006/01/01/macbook/); I also decided to drop the “.html” bit since it’s not necessary.
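As a small illustration of what “hackable” means here, the sketch below trims the filename from a category-style permalink to get the category URI. The helper name category_uri is made up for this example and is not part of Blosxom or WordPress; it assumes the permalink contains at least one slash.

```php
<?php
// Hypothetical helper (not part of Blosxom or WordPress): drop the
// filename from a category-style permalink to get the category URI.
// Assumes the permalink contains at least one slash.
function category_uri(string $permalink): string {
    // Keep everything up to and including the last slash.
    return substr($permalink, 0, strrpos($permalink, '/') + 1);
}

echo category_uri('/weblog/Apple/macbook.html'), "\n"; // /weblog/Apple/
```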

However, Cool URIs Don’t Change. There are plenty of old links to my site out around the Internets that use category-based permalinks, and I don’t want to break them. Even the internal links within existing posts on this site will continue to use the old URIs, at least until I can get around to cleaning them up. So I needed to make sure the old permalinks keep working alongside the new ones.

WordPress supports category-based permalinks, but with some caveats. If I configure the whole site to use category permalinks, then date-based URIs don’t work. If I configure the site to use date-based permalinks, category-based URIs don’t work. After searching without success for a plugin to help, I tried the WordPress forums. When that failed to turn up anything, I poked around the code looking for a solution.

(A brief aside: if you want to explore the WordPress codebase, I recommend this excellent online cross-reference.)

I ended up creating a simple plugin that uses the “generate_rewrite_rules” action hook to add extra URI rewriting rules to the internal set used by WordPress to resolve each URI. I hope to make it available as a plugin someday when I have time to make it more generic; currently it’s hardcoded to solve my problem. Here’s the heart of the code, if you want to build your own version:

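// Teach WordPress the old Blosxom-style permalinks in addition to the configured ones.
function add_permalink_style($rewriteobj) {
    // Build rewrite rules for category-style permalinks ending in .html.
    $extra_rewrite = $rewriteobj->generate_rewrite_rules('/%category%/%postname%.html', EP_PERMALINK);
    $extra_rewrite = apply_filters('post_rewrite_rules', $extra_rewrite);

    // Merge the extra rules into WordPress's internal rule set.
    $rewriteobj->rules = array_merge($rewriteobj->rules, $extra_rewrite);
}

add_action('generate_rewrite_rules', 'add_permalink_style');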

To help me test everything, I tossed together a quick-and-dirty test suite: a simple list of links to test from my browser. With the code above, they all pass.

Because my old permalinks all included .html at the end, I can use the rewrite rule above to strip it out. I was originally doing that with a mod_rewrite rule in my .htaccess file, but that caused problems for old permalinks to my category archives. Since the mod_rewrite rule was stripping the .html suffix before WordPress could see the URI, category URIs (e.g., /weblog/Apple/OSX/) and category-style post permalinks (e.g., /weblog/Apple/OSX/howto-install-carbon-emacs/) matched the same pattern, and WordPress thought the first example was for a post named “OSX” in the Apple category. I believe this is the same thing that’s causing my tag URIs (e.g., /weblog/tag/wordpress/) to fail at the moment; the only workaround I’ve found requires a mod_rewrite RewriteRule, and it only works with a browser redirect. I’m still working on this problem, as I really don’t want to use redirects. Also, for reasons I haven’t quite figured out, the code above makes category permalinks work (and without a category URI prefix; see below), even though the rule expects a post name (slug) with .html appended. But I’m not going to argue with success.
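The collision between category URIs and suffix-stripped post permalinks can be sketched outside WordPress. The two regular expressions below are stand-ins I made up for a /%category%/%postname%.html rule with and without the suffix; they are not the patterns WordPress actually generates, and the classify helper is hypothetical. The point is only that once .html is gone, nothing distinguishes the last path segment of a category from a post slug.

```php
<?php
// Hypothetical stand-in for a category-style post rule that still
// requires the ".html" suffix (not WordPress's real generated regex).
$withSuffix = '#^weblog/(.+?)/([^/]+)\.html$#';
// The same idea after an earlier rewrite step has stripped ".html".
$withoutSuffix = '#^weblog/(.+?)/([^/]+)/?$#';

// Hypothetical helper: report how a pattern splits a URI, or null on no match.
function classify(string $pattern, string $uri): ?array {
    if (preg_match($pattern, $uri, $m)) {
        return ['category' => $m[1], 'postname' => $m[2]];
    }
    return null;
}

$uris = [
    'weblog/Apple/OSX/',                                // category archive
    'weblog/Apple/OSX/howto-install-carbon-emacs/',     // post, suffix stripped
    'weblog/Apple/OSX/howto-install-carbon-emacs.html', // post, original form
];

foreach ($uris as $uri) {
    echo $uri, "\n";
    echo '  with suffix rule:    ', json_encode(classify($withSuffix, $uri)), "\n";
    echo '  without suffix rule: ', json_encode(classify($withoutSuffix, $uri)), "\n";
}
```

With these stand-in patterns, the suffix-free rule reads the category archive URI as a post named “OSX” in the Apple category, while the rule that insists on .html leaves the category URI alone and only matches the original post permalink.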

Another issue I ran into was WordPress always wanting a prefix in front of category archive links. For example, my Apple category’s archive link (by default) was /weblog/category/Apple/. Even though my plugin’s extra rewrite rule seems to make URIs like /weblog/Apple/ work, I wanted the category links on my Archive page to omit the extra prefix, for consistency. WordPress allows you to change, but not omit, this prefix. I came up with a cheap hack for that problem: since I could already handle the URIs without the prefix, I only needed to change the URIs generated on the archive page via WordPress’ wp_list_cats() function. I was able to do this in my plugin by hooking into the list_cats filter:

// Strip the category prefix from the links in the generated category list.
function fix_category_links($content) {
    $content = str_replace('/category/', '/', $content);
    return $content;
}

add_filter('list_cats', 'fix_category_links');

It’s a hack, but it’s a hack that works.
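To see what the filter does, here is a minimal sketch that runs the same replacement outside WordPress. The list markup is a made-up fragment, not the exact output of wp_list_cats(), and the function is redefined here only so the example runs on its own.

```php
<?php
// Standalone copy of the filter callback so this runs without WordPress.
function fix_category_links(string $content): string {
    // Replace every "/category/" path segment with a single slash.
    return str_replace('/category/', '/', $content);
}

// Made-up sample markup; the real wp_list_cats() output may differ.
$html = '<li><a href="/weblog/category/Apple/">Apple</a></li>';

echo fix_category_links($html), "\n";
// <li><a href="/weblog/Apple/">Apple</a></li>
```

Because this is a plain string replacement, it rewrites every occurrence of /category/ in the list, which is fine for my links but is part of why I call it a hack.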

At this time, I believe all the old permalinks from my Blosxom blog (both posts and categories) should work, as well as the new date-based permalinks. A key part of all of this was to import my existing Blosxom content while preserving the filename (slug in WordPress parlance); I’ll cover this and the rest of the import process in a subsequent post. If you should find any links that don’t work, especially from external sources, please let me know. Now if I can just fix those tag URIs….