Notes on Converting from Blosxom to WordPress

When I switched from Blosxom to WordPress, I had intended to write a HOWTO explaining exactly how to accomplish the task. Unfortunately, this proved to be a real challenge, since no two Blosxom installations are exactly the same: there are hundreds of plugins that can change the core behavior in ways both subtle and profound. Instead, I decided to write up some notes on what I did (and why), in the hope that they can help others who want to make the switch. Note: I’m using WordPress 2.0.4; if you are on an older (especially pre-2.0) WordPress, your mileage may vary.

After getting a basic WordPress install set up on my webserver, I began by looking for a Blosxom import tool for WordPress. I found Eric Davis’ import-blosxom.php, available from Eric’s software page. (Note: Houran Bosci has a modified version of Eric’s script, which I only discovered after the fact. I’ve not tested it, but you may want to have a look at the changelog.) Eric’s script didn’t come with any docs, so I tried dropping it into my wp-admin/import folder. I then went to the Import section of my WordPress admin page, expecting to find a new option. Instead, the Blosxom import script’s output (a series of instructions) was oddly interspersed with the normal Import screen. I eventually found that the script had to be put directly in my wp-admin folder, and I had to go directly to that page with my browser.

Once you load import-blosxom.php in your browser, you get a page full of instructions. These instructions include the text for a Blosxom theme file, which creates a special RSS feed for your Blosxom site. It uses the extension .rss20 instead of .rss, so it shouldn’t conflict with your existing feed. You then fetch a copy of this feed, and the importer reads the feed to load your new WordPress blog with your old content. Sounds simple, right? Yeah, I thought so too.

First, a few notes about setting up the special RSS feed. By default, it needs the theme plugin, which I didn’t use. You can break the theme up into multiple flavour files, but I found it easier to just install the theme plugin. Drop and go, as I recall. The theme plugin requires the interpolate_fancy plugin, which I did use, and the filesystem plugin, which I didn’t, but that was also a quick install.

Now, by default, Blosxom doesn’t show all posts, only the most recent ones (10, unless you change it). This limit is applied for any theme/flavour, including feeds. The number can be configured via the $num_entries variable in the Blosxom script. I played around briefly with trying to use the config plugin to change the number of entries for only this feed; I didn’t want anyone visiting the site or fetching the feed to get all my entries accidentally. Unfortunately, config could not do what I wanted (plugins are just loaded too late, I believe), so I cheated and made a copy of my blosxom.cgi, changed $num_entries, and used that one to fetch the special feed. I also had to disable my moreentries plugin, although I can’t remember exactly what it was breaking.

Now, the importer doesn’t fetch your new feed directly. You have to fetch and save a copy (curl or wget are good for this), copy the file to your server, and edit the import script to point to this file. This is for safety, to make sure you only import what you want to import. My first attempt revealed a few deficiencies in the RSS feed and the importer.
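
For anyone who would rather script the fetch than run curl or wget by hand, here is a minimal Python sketch of the "fetch and save a copy" step. It is only an illustration: the function name `save_feed` and the URL in the usage note are hypothetical, not part of the importer or of Blosxom.

```python
import shutil
import urllib.request
from pathlib import Path


def save_feed(feed_url: str, dest: Path) -> int:
    """Download feed_url to dest and return the number of bytes saved."""
    # Stream the response straight to disk so a large feed is not held in memory.
    with urllib.request.urlopen(feed_url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest.stat().st_size
```

A call might look like `save_feed("http://example.com/cgi-bin/blosxom-all.cgi/index.rss20", Path("blosxom-export.xml"))`, where the address stands in for wherever your modified copy of blosxom.cgi lives. The saved file is then what you upload and point the import script at.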

First of all, I used Markdown to author my posts in Blosxom, and intended to keep doing so with WordPress. However, the RSS feed contains the rendered HTML, which is what gets imported into WordPress. This imports your posts just fine, but I wanted to maintain my Markdown formatting in case of future edits. I ended up disabling Markdown long enough to fetch the feed with the original Markdown formatting intact.

The next hurdle was URIs. In an earlier post I discussed some of the steps I went through to ensure all of my old URIs would work in WordPress, but the very first hurdle was preserving the post slug. In WordPress parlance (and, I believe, in publishing in general), the slug is a short name for an article. Specifically for WordPress, the slug is the ‘file name’ part of the post URI. The original import-blosxom.php created a slug for each post based on the title, similar to WordPress’s default mechanism for creating slugs. However, in order to keep my old URIs working with a little mod_rewrite magic, I needed the slugs to match the original filenames used to store the Blosxom posts. I hacked this in as follows:

  1. I modified the rss20 theme file to include the original file name, by adding a line to the <item> section:

    <slug>$fn</slug>
    
  2. I re-fetched the feed to pick up the change.

  3. I edited import-blosxom.php to use the slug. I replaced this line:

    $post_name = sanitize_title($title);
    

    with this:

    // preg_match() fills $slug; $slug[1] holds the captured file name.
    $slug = '';
    preg_match('|<slug>(.*?)</slug>|is', $post, $slug);
    $post_name = $slug[1];
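
The same idea, sketched in Python for readers who want to experiment outside the importer. This is not the importer's code: `post_slug` and `title_to_slug` are hypothetical helpers, and `title_to_slug` is only a rough stand-in for a title-based slug, not WordPress's own function. Unlike the PHP above, the sketch falls back to the title when an item has no `<slug>` element.

```python
import re

# Same pattern as in the PHP snippet: case-insensitive, and "." matches newlines.
SLUG_RE = re.compile(r"<slug>(.*?)</slug>", re.IGNORECASE | re.DOTALL)


def title_to_slug(title: str) -> str:
    """Rough title-based slug: lowercase, runs of other characters become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def post_slug(item_xml: str, title: str) -> str:
    """Prefer the <slug> element of a feed item; fall back to the title."""
    match = SLUG_RE.search(item_xml)
    if match and match.group(1).strip():
        return match.group(1).strip()
    return title_to_slug(title)
```

The fallback is a design choice for the sketch: it keeps an import from producing empty slugs if the feed was fetched before the theme file was updated.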
    

Now, after import, my WordPress slugs matched my Blosxom post names, making support of old URIs much simpler (see The Permalink Problem for more). But I wasn’t done yet.

Now, this is going to seem petty to some of you, and that’s fine. But it bothered me that the WordPress post numbers of my imported posts were backwards. The most recent imported post was post 1, and the oldest imported post was post 300-and-something. Even though you should never see the post number since I use fancy URIs, it bugged me. So, I fixed it. It may also be worth noting that through these iterations, I ended up futzing with my WordPress DB by hand to remove prior imports and reset the post numbering. Hopefully, if you’re following along at home on your own import, you’ll get this all right the first time.

The problem is, Blosxom renders posts in reverse chronological order, like every other blog. The import script reads the RSS file and imports the posts in the order in which they appear in the file (which is to say, reverse chronological). But I wanted my post numbers to be chronological. Yes, I’m a geek. Came to terms with it years ago. Anyway, remember earlier that I couldn’t get config to change the number of entries for the feed? I was, however, able to use it to install a custom Blosxom sort method, like so:

  1. Config has to run first, so I renamed the installed config plugin to 000config, the standard Blosxom hack for plugin load ordering.

  2. In my Blosxom content directory, I created config.rss20, the theme-specific config file for the rss20 theme:

    package config;
    
    sub sort {
      return sub {
        my($files_ref) = @_;
        return sort { $files_ref->{$a} <=> $files_ref->{$b} } keys %$files_ref;
      }
    };
    
    1;
    
  3. Another fetch-import cycle, and my posts were numbered chronologically.
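
To make the ordering concrete, here is a small Python sketch of what the custom sort changes. It assumes, as in the Perl above, a mapping from post file to a numeric timestamp; the function names and sample paths are made up for illustration.

```python
from typing import Dict, List


def newest_first(files: Dict[str, int]) -> List[str]:
    """Reverse chronological order, as Blosxom normally renders posts."""
    return sorted(files, key=lambda path: files[path], reverse=True)


def oldest_first(files: Dict[str, int]) -> List[str]:
    """Chronological order, matching the custom sort in config.rss20."""
    return sorted(files, key=lambda path: files[path])


# Hypothetical posts mapped to modification times (seconds since the epoch).
posts = {"tech/third.txt": 300, "tech/first.txt": 100, "misc/second.txt": 200}
```

With the posts emitted oldest first, an importer that assigns IDs in file order gives the oldest post the lowest number.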

Almost there. I had my content (including comments), but the comment and posts-per-category counts were all 0. It seems WordPress stores these numbers in the database instead of calculating them on the fly, and the importer didn’t update them. A couple of quick SQL statements set everything right:

    UPDATE wp_posts p SET comment_count = ( SELECT count( * )
    FROM wp_comments c
    WHERE c.comment_post_ID = p.ID );

    UPDATE wp_categories c SET category_count = ( SELECT count( * )
    FROM wp_post2cat p
    WHERE p.category_id = c.cat_id );
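
If you want to see the effect of the first statement before touching a real database, the following Python sketch replays it against a throwaway in-memory SQLite database. Everything here is an assumption for demonstration purposes: the two tables are cut down to the columns the statement touches, SQLite stands in for MySQL, and the helper names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in for wp_posts and wp_comments with stale counts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, comment_count INTEGER DEFAULT 0);
        CREATE TABLE wp_comments (comment_ID INTEGER PRIMARY KEY, comment_post_ID INTEGER);
        INSERT INTO wp_posts (ID) VALUES (1), (2), (3);
        INSERT INTO wp_comments (comment_post_ID) VALUES (1), (1), (3);
    """)
    return conn


def recount_comments(conn: sqlite3.Connection) -> dict:
    """Recompute comment_count with a correlated subquery; return {post ID: count}."""
    conn.execute("""
        UPDATE wp_posts
        SET comment_count = (
            SELECT COUNT(*) FROM wp_comments
            WHERE wp_comments.comment_post_ID = wp_posts.ID
        )
    """)
    conn.commit()
    return dict(conn.execute("SELECT ID, comment_count FROM wp_posts ORDER BY ID"))
```

The subquery runs once per post row, so posts with no comments end up with a count of 0 rather than being skipped. The category statement follows the same pattern against wp_post2cat.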

That’s everything I have in my notes. Hopefully, there’s enough here to help others with a similar conversion. If you find other conversion issues or have questions about my process, please leave a comment below and I’ll try to help.


6 Responses to “Notes on Converting from Blosxom to WordPress”

  1. douglas.nerad » Old Content Now Imported! Says:

    [...] My friend Jason had a similar initial setup; an old Blosxom blog and a new WordPress blog. Being a much more tech-savvy netizen than I, he figured out all the tricks for migrating and at long last posted up how he did it. What follows is a personal note to him… Jason, you so absolutely rock! I had been trying this and trying this over and over again, with the same script, to no avail. At one point I actually did get it to import a single category, but I saw the comments still said zero and assumed the comments weren’t brought over. I gave up. [...]

  2. OOKEE.com » Blog Archive » Archived Posts Migrated Says:

    [...] Thanks to an excellent post from Jason Clark I’ve finally imported all the posts from the old site to this one. Cool! [...]

  3. azza-bazoo Says:

    I should disclaim that my modified version of the import script was written for WordPress 1.5.2, I have no idea if it works for version 2 (though I suspect it does). If you had no problems with wrong timestamps, my version is probably of little use.

    But these are nice instructions! And your tweak to make slugs work is a great idea, much better than giving in and hacking up a redirect script like I did :-)

  4. smr’s blog » Blog Archive » 24 hrs online..0 hrs available Says:

    [...] Fast Forward of last week * Shiben’s bday, nightout, and bfast at Lingampally. blog.skp coverage. * Rocky postponed his marriage. Cool. * Many nightouts this week. * Spent some time on improving look of KW. * Its raining man. I’m fking gay. * Haircut. Hairs were long again after long time. * Finally imported posts from blosxom. Many thanks to Eric and Jason for making it so easy. [...]

  5. Migrating old Blosxom content to WordPress Says:

    [...] 5th, 2007 · No Comments There seems to be no out-of-the-box solution for moving a site with several levels of subcategories from Blosxom to WordPress and keeping the subcategories intact. Neither Jason Clark’s, nor hohndel.org’s suggestions worked for me without a lot of work. I’ve had no luck with the WordPress wiki or the standard ways of importing from Blosxom to WordPress. [...]

  6. Stephen Laniel’s Unspecified Bunker » Blosxom to WordPress Says:

    [...] If you’re curious, I used Jason Clark’s Blosxom-to-Wordpress instructions, and those on the Graceful Exits blog. After all the various edits that all the pages suggested, I used import-blosxom.php. All the flavour files (Blosxom’s spelling of “flavour,” by the way, not mine) I used are in a separate directory. The import also requires that you install the Blosxom interpolate_fancy plugin, which I’ve set aside for you. [...]