Using xhtml:body in RSS feeds via Blosxom
As I mentioned previously, I’ve recently begun using the full version of NetNewsWire, which honors the use of xhtml:body within an RSS feed. When I upgraded my feed to RSS 2.0 a while ago, I decided to include a foreshortened version of the story in the RSS description tag, and the full xhtml content of the story within xhtml:body. Since I had been using NetNewsWire Lite, I could only see the short description.
When I loaded the feed into NetNewsWire, I found to my horror that all of the xhtml tags were being escaped within my xhtml:body. I looked at my RSS flavour’s story template and my html flavour’s story template. Both use $body to include the body of a post, yet my html pages don’t have their tags escaped.
It turns out that blosxom.cgi is the culprit… it includes logic to escape less-thans and ampersands within stories if the output mime type includes ‘xml’. This isn’t exactly a bug… arbitrary xml (or html) tags within the larger RSS xml document could leave the RSS file not well-formed, so that it fails to parse with standard XML tools. However, since my content is all valid xhtml, and is enclosed within the feed inside an xhtml:body, there’s no need for this escaping.
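To make the effect concrete, here is a minimal sketch of what that escaping does to a post. It assumes the same four-character escape table as the block in blosxom.cgi; the helper name escape_for_xml and the sample body are invented for illustration and are not part of Blosxom.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML,
# in the same way the blosxom.cgi block does for xml content types.
sub escape_for_xml {
    my ($text) = @_;
    my %escape = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
    my $escape_re = join '|' => keys %escape;
    $text =~ s/($escape_re)/$escape{$1}/g;
    return $text;
}

# Sample post body (made up): valid XHTML that I would like to survive intact.
my $body = '<p>Fish &amp; <em>chips</em></p>';

# Every tag comes out as literal text, which is what NetNewsWire then displays.
print escape_for_xml($body), "\n";
```

The output is the markup with every angle bracket and ampersand turned into an entity, which is exactly what I was seeing inside xhtml:body.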
I patched my blosxom.cgi in the manner I would expect the change to be made to the official blosxom.cgi… as a configurable option. The changes are as follows.
- Add a new config variable
- Add it to the vars declaration
- Change the code to check this variable
In more detail:
Step 1: In the ‘Configuration Variables’ section at the top of the file, add a config var.
# set to 1 if the RSS template wraps $body in <xhtml:body>
$rss_xhtml = 1;
Step 2: Find the ‘use vars’ section right below the configuration variables. Add $rss_xhtml to the list of variables between the !’s.
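For illustration, the declaration ends up looking something like the sketch below. The variable list here is heavily abbreviated and is not the real list from blosxom.cgi; the only point is that $rss_xhtml sits inside the qw!...! delimiters alongside the others.

```perl
use strict;

# Abbreviated, illustrative list: the real blosxom.cgi declares many more
# variables here. $rss_xhtml is the newly added one.
use vars qw! $blog_title $blog_description $rss_xhtml !;

# With the declaration in place, the config variable can be assigned under
# 'use strict' without a "Global symbol requires explicit package name" error.
$rss_xhtml = 1;
print "rss_xhtml is $rss_xhtml\n";
```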
Step 3: Update the code. Look for this block of code (try searching for ‘xml’):
if ($content_type =~ m{\Wxml$}) {
  # Escape <, >, and &, and to produce valid RSS
  my %escape = ('<'=>'&lt;', '>'=>'&gt;', '&'=>'&amp;', '"'=>'&quot;');
  my $escape_re = join '|' => keys %escape;
  $title =~ s/($escape_re)/$escape{$1}/g;
  $body =~ s/($escape_re)/$escape{$1}/g;
}
And change the first line to look like this:
if (!$rss_xhtml and $content_type =~ m{\Wxml$}) {
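Putting the pieces together, a self-contained sketch of the patched logic might look like this. The render_body wrapper and the sample inputs are hypothetical, added only so the snippet runs on its own; in blosxom.cgi the check sits inline and works on $title and $body directly.

```perl
use strict;
use warnings;

our $rss_xhtml = 1;   # the new config variable from Step 1

# Hypothetical wrapper around the patched check, so it can be run standalone.
sub render_body {
    my ($content_type, $body) = @_;
    if (!$rss_xhtml and $content_type =~ m{\Wxml$}) {
        my %escape = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
        my $escape_re = join '|' => keys %escape;
        $body =~ s/($escape_re)/$escape{$1}/g;
    }
    return $body;
}

my $post = '<p>Hello</p>';

# Flag on: markup passes through untouched, ready for xhtml:body.
print render_body('text/xml', $post), "\n";

# Flag off: original behaviour, markup is escaped for xml content types.
$rss_xhtml = 0;
print render_body('text/xml', $post), "\n";

# Non-xml content types are never escaped, whatever the flag says.
print render_body('text/html', $post), "\n";
```

One thing to keep in mind: in the real block the same condition also guards $title, so with the flag on, titles are no longer escaped either. A title containing a bare ampersand or less-than would then need to be escaped by hand.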
I’m going to submit this to the list and see if Rael will incorporate it into blosxom.cgi.