Using xhtml:body in RSS feeds via Blosxom

As I mentioned previously, I’ve recently begun using the full version of NetNewsWire, which honors the use of xhtml:body within an RSS feed. When I upgraded my feed to RSS 2.0 a while ago, I decided to include a foreshortened version of the story in the RSS <description> tag, and the full xhtml content of the story within <xhtml:body>. Since I had been using NetNewsWire Lite, I could only see the short description.

When I loaded my feed into NetNewsWire, I found to my horror that all of the xhtml tags were being escaped within my xhtml:body. I looked at my RSS flavour’s story template and my html flavour’s story template. Both use $body to include the body of a post, yet my html pages don’t have their tags escaped.

It turns out that blosxom.cgi is the culprit… it includes logic to escape less-thans, greater-thans, ampersands, and double quotes within story titles and bodies if the output MIME type ends in ‘xml’. This isn’t exactly a bug… arbitrary xml (or html) tags within the larger RSS xml document could keep the RSS file from being well-formed, so that it would fail to parse with standard XML tools. However, since my content is all valid xhtml, and is enclosed in the feed inside an xhtml:body, there’s no need for this escaping.
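
To make the symptom concrete, here is a small standalone Perl sketch of the kind of escaping described above. It is not Blosxom’s code verbatim: the sub name escape_for_xml and the sample story body are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Blosxom): escape markup characters
# the way a feed generator might when its output is XML.
sub escape_for_xml {
    my ($text) = @_;
    my %escape = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
    my $escape_re = join '|', map { quotemeta } keys %escape;
    $text =~ s/($escape_re)/$escape{$1}/g;
    return $text;
}

my $body = '<p>Fish &amp; chips</p>';   # sample xhtml story body
print escape_for_xml($body), "\n";
# prints: &lt;p&gt;Fish &amp;amp; chips&lt;/p&gt;
```

A feed reader that treats xhtml:body as real markup will display that output as literal angle brackets and entities, which is what I was seeing.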

I patched my blosxom.cgi in the manner I would expect the change to be made to the official blosxom.cgi… as a configurable option. The changes are as follows.

  1. Add a new config variable.
  2. Add it to the vars declaration.
  3. Change the code to check this variable.

In more detail:

Step 1: In the ‘Configuration Variables’ section at the top of the file, add a config var.

# Set to 1 if the RSS template wraps $body in <xhtml:body>
$rssxhtml = 1;

Step 2: Find the ‘use vars’ section right below the configuration variables. Add $rssxhtml to the list of variables between the !’s.
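
For orientation, the edited declaration might look roughly like the sketch below. The variable list here is shortened and only illustrative; the real list in your blosxom.cgi is longer and depends on the version, so add $rssxhtml to whatever is already there rather than copying this line. The sketch also assumes the script runs under ‘use strict’, which is why the declaration is needed at all.

```perl
use strict;

# Shortened, illustrative list; keep every name your copy already declares.
use vars qw! $blog_title $blog_description $datadir $rssxhtml !;

$rssxhtml = 1;   # declared above, so 'use strict' accepts it
print "rssxhtml is $rssxhtml\n";
```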

Step 3: Update the code. Look for this block of code (try searching for ‘xml’):

      if ($content_type =~ m{\Wxml$}) {
        # Escape <, >, and &, and to produce valid RSS
        my %escape = ('<'=>'&lt;', '>'=>'&gt;', '&'=>'&amp;', '"'=>'&quot;');
        my $escape_re  = join '|' => keys %escape;
        $title =~ s/($escape_re)/$escape{$1}/g;
        $body =~ s/($escape_re)/$escape{$1}/g;
      }

And change the first line to look like this:

      if (!$rssxhtml and $content_type =~ m{\Wxml$}) {
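
A quick way to convince yourself the changed condition behaves as intended is to pull it into a standalone script. The sketch below wraps the patched check in a hypothetical sub, maybe_escape, which does not exist in Blosxom; the content type strings and the sample body are illustrative values only.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the patched check; not part of blosxom.cgi.
sub maybe_escape {
    my ($rssxhtml, $content_type, $body) = @_;
    if (!$rssxhtml and $content_type =~ m{\Wxml$}) {
        my %escape = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
        my $escape_re = join '|' => keys %escape;
        $body =~ s/($escape_re)/$escape{$1}/g;
    }
    return $body;
}

my $body = '<em>hello</em>';
print maybe_escape(0, 'text/xml',  $body), "\n";  # flag off, XML output: escaped
print maybe_escape(1, 'text/xml',  $body), "\n";  # flag on, XML output: left alone
print maybe_escape(0, 'text/html', $body), "\n";  # html flavour: left alone
```

One thing to keep in mind: because the same condition guards both substitutions, turning the flag on also stops $title from being escaped, so titles containing a bare & or < would need to be written as valid xml too.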

I'm going to submit this to the list and see if Rael will incorporate it into blosxom.cgi.
