Using xhtml:body in RSS feeds via Blosxom

As I mentioned previously, I’ve recently begun using the full version of NetNewsWire, which honors the use of xhtml:body within an RSS feed. When I upgraded my feed to RSS 2.0 a while ago, I decided to include a foreshortened version of the story in the RSS description tag, and the full xhtml content of the story within xhtml:body. Since I had been using NetNewsWire Lite, I could only see the short description.

When I loaded my feed into NetNewsWire, I found to my horror that all of the xhtml tags were being escaped within my xhtml:body. I looked at my RSS flavour’s story template and my html flavour’s story template. Both use $body to include the body of a post, yet my html pages don’t have their tags escaped.

It turns out that blosxom.cgi is the culprit… it includes logic to escape less-thans, greater-thans, ampersands and double quotes within stories if the output mime type ends in ‘xml’. This isn’t exactly a bug… arbitrary xml (or html) tags within the larger RSS xml document could leave the RSS file not well-formed, and so unparseable with standard XML tools. However, since my content is all valid xhtml, and is being enclosed within the feed inside an xhtml:body, there’s no need for this escaping.

I patched my blosxom.cgi in the manner I would expect the change to be made to the official blosxom.cgi… as a configurable option. The changes are as follows.

  1. Add a new config variable
  2. Add it to the vars declaration
  3. Change the code to check this variable.

In more detail:

Step 1: In the ‘Configuration Variables’ section at the top of the file, add a config var.

      # set to 1 if RSS template wraps $body in <xhtml:body>
      $rss_xhtml = 1;

Step 2: Find the ‘use vars’ section right below the configuration variables. Add $rss_xhtml to the list of variables between the !’s.
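As a rough sketch of what Steps 1 and 2 amount to together, the declaration might end up looking something like the snippet below. The variable list is abbreviated for illustration (the real one in blosxom.cgi is much longer, and your copy may differ); only $rss_xhtml is new.

```perl
use strict;

# Abbreviated for illustration: the real 'use vars' list in blosxom.cgi
# holds many more names. $rss_xhtml is the only addition.
use vars qw! $blog_title $blog_description $rss_xhtml !;

# Step 1 setting; without the declaration above, 'use strict' would
# reject this assignment at compile time.
$rss_xhtml = 1;

print "rss_xhtml is $rss_xhtml\n";
```

If the variable is left out of the list, Perl should refuse to compile the script with a "Global symbol requires explicit package name" error, which is a handy way to tell Step 2 was missed.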

Step 3: Update the code. Look for this block of code (try searching for ‘xml’):

      if ($content_type =~ m{\Wxml$}) {
        # Escape <, >, &, and " to produce valid RSS
        my %escape = ('<'=>'&lt;', '>'=>'&gt;', '&'=>'&amp;', '"'=>'&quot;');
        my $escape_re  = join '|' => keys %escape;
        $title =~ s/($escape_re)/$escape{$1}/g;
        $body =~ s/($escape_re)/$escape{$1}/g;
      }

And change the first line to look like this:

      if (!$rss_xhtml and $content_type =~ m{\Wxml$}) {
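To see the effect of the changed condition in isolation, here is a small standalone sketch. The render_body helper is hypothetical and not part of blosxom.cgi; it just wraps the same test and escaping logic in a function so the three cases can be compared side by side.

```perl
use strict;

# Hypothetical helper, not part of blosxom.cgi: returns the story body as
# it would be emitted for a given content type and $rss_xhtml setting.
sub render_body {
    my ($content_type, $rss_xhtml, $body) = @_;

    # Same condition as the patched line: escape only when the flag is
    # off and the content type ends in 'xml'.
    if (!$rss_xhtml and $content_type =~ m{\Wxml$}) {
        my %escape = ('<'=>'&lt;', '>'=>'&gt;', '&'=>'&amp;', '"'=>'&quot;');
        my $escape_re = join '|' => keys %escape;
        $body =~ s/($escape_re)/$escape{$1}/g;
    }
    return $body;
}

my $story = '<p>Hello</p>';

print render_body('text/xml',  0, $story), "\n";  # flag off: tags escaped
print render_body('text/xml',  1, $story), "\n";  # flag on: xhtml passes through
print render_body('text/html', 0, $story), "\n";  # html flavour: never escaped
```

Note that the flag also switches off escaping of $title in the real block, so with $rss_xhtml set to 1 a title containing a bare ampersand or less-than would go into the feed unescaped.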

I’m going to submit this to the list and see if Rael will incorporate it into blosxom.cgi.
