Archive for January, 2004

Links, I Think

Welcome to the inaugural post of my newest category: CoolStuff/Potpourri. This category will host short lists of incongruous items. Mostly link-dumpage, but also perhaps random thoughts, curious facts, and whatever else comes to mind. Possibly a precursor to a side-blog, if this category gets used too much. Without further ado….

  • MattMcHugh.com has some wonderful, original short-story fiction. Reminds me of a resolution I made (Personal, #4… as yet unfulfilled).
  • un:parsed is a new Blosxom blog, and a stunning demonstration of the power of simplicity. A very elegant, minimalist design. Yet another nail in the coffin of my current site layout. :)

Hot and Cold

It’s been very cold here on the east coast for weeks. There are several inches of snow still on the ground, and I have had a nice coat of ice on my car’s windshield every morning for days. Having misplaced my window scraper, I am reduced to waiting for the car to warm up and melt the ice. This morning, the ice layer was thin, so I decided to wait in the car while it melted. Five minutes later, I was very cold, but ready to leave.

This is when events began to conspire against me. While my driveway was clear of snow, the road at the end of my driveway still had some snow from the plows, which I hadn’t moved. My plan was to roll through at speed, which generally works. As my front wheels were passing the gutter, I realized I’d turned my car the wrong way… towards a neighbor’s car which is always parked on the road, strategically in my blind spot when I back out. Instinctively, I slowed the car for an extended crane-the-neck look for the other car, forgetting that this was a bad time to slow down. I stopped with my front wheels (of my front-wheel-drive Eclipse) directly in the snow, in the gutter. Got my bearings, tried to back up… spun my wheels. Same going forward. Stuck. What a way to start the day!

The back of my car was blocking half the street, on a busy intersection. I was not pleased. I ran to get the snow shovel and rock salt, and asked Sherri to come help. I moved some snow, tossed some salt, and tried again. No change. Sherri tried to push. No change. I tried again… and slipped sideways. Slightly less road blocked, but I was still stuck. Watching the school bus thread its way between my car and the neighbor’s car (the one that caused all this) was especially invigorating.

I am ashamed to say I started getting mad. I was angry that this had happened to me. I was already imagining waiting for AAA to arrive, standing in the snow, waving people around my car. I was livid. I watched a big truck coming up the adjoining street, and was pleased when he didn’t turn. Then his brakes came on, and he started backing up. “Great! You would decide to turn after all, wouldn’t you?” I thought. I was hot.

And then something happened that immediately defused my tantrum. The truck stopped, and two guys hopped out. They walked over, moved a little more snow with the shovel, and pushed me out. I suddenly felt very small for being so worked up over nothing. I also realized it’s a shame that having someone stop to help you is a surprising thing these days.

To the anonymous gentlemen who helped me this morning, thank you and God bless you. You were the answer to a prayer I was too angry to pray. You not only saved me a lot of time and frustration, you improved my outlook for the entire day.

What a good way to start the day.

Better Amazon Links

Some time ago, I signed up for Amazon’s Associates Program. I doubt I’ll ever sell anything, but it seemed like A Thing To Do at the time. I do occasionally write about books that I’ve read, and I try to provide Amazon links as well. Might as well be prepared on the off chance someone buys something on my recommendation.

One thing that I disliked was the linking process. Most of the tools Amazon provides for creating “associate” links include pictures, HTML, or both. All I want is simple, text-only inline links to use in the occasional blog post. With some looking, I found a tool to create a text-only link, and the link structure is simple enough to adapt from a regular Amazon link once you know the layout. Here’s an example link to one of my favorite web books, Zeldman’s Designing with Web Standards. The link looks like a standard Amazon book link, with my associate ID appended. If you follow the link, however, you don’t get the product page for the book; you get a “You clicked this, here’s some similar items” page. This has always irked me. Just let me send people to the product I want to show them.

Today while reading something at Zeldman’s site, I clicked a link to a book and got the Amazon product page. A look at the link URL confirmed that his associate ID was present, but the URL also included something extra. Here’s another link to the same book, only this one goes straight to the product page. Just add ref=nosim to the ‘path’ after the ASIN and before the associate ID.
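To make the layout concrete, here is roughly what the two forms look like (a sketch with a made-up ASIN, and ‘myassociateid’ standing in for a real associate ID):

http://www.amazon.com/exec/obidos/ASIN/1234567890/myassociateid
The standard associate link: the product ASIN followed by the associate ID. This is the form that lands you on the “similar items” page.
http://www.amazon.com/exec/obidos/ASIN/1234567890/ref=nosim/myassociateid
The same link with ref=nosim inserted between the ASIN and the associate ID. This one goes straight to the product page.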

It’s possible that you can do this directly with Amazon’s current tools, but I’ve never figured out how. I always learn something when I visit Zeldman’s site.

Anatomy of a Bug

A couple of weeks ago, I realized that The Colonel Sanders Conspiracy, one of the blogs I subscribe to via RSS, had not had any new articles in several weeks. I decided to visit the site in my browser, and was surprised to find several weeks’ worth of articles I had never seen. I sent a quick note to Joshua Hoover, author of the blog. Over the course of several emails, we worked out the problem, and Josh was able to correct it. He kindly agreed to let me document the process. I’ve outlined the steps we followed below, as a case study in debugging. (Warning: a bit long.)

When I realized that the website had new articles that weren’t in the feed, my first thought was to check the RSS feed itself. I copied the feed URL I’m currently subscribed to from my aggregator (Bloglines.com) and retrieved it directly with my browser. Notice that I made sure to use the same feed URL that was producing the bad results; clicking a feed link on the site might have pulled a different feed, for example if his feed had moved. The document retrieved was blank, and viewing source revealed an empty HTML document, consisting of only html and body tags. Only later did I figure out that Firebird displays this when viewing source for any empty document.

Having confirmed that the RSS feed I was using was empty, I then checked the RSS link on the site. They were the same. At this point, I felt it was time to submit a bug report. Using the feedback link on the site, I sent Josh a quick note, explaining what I’d seen and including the URL of the feed I was using… just in case he had a new feed URL he had forgotten to publish.

A little while later, I got an email from Josh, and it was a great example of how to debug. He noted that he had checked the feed, which seemed OK, and provided links to two feed validators (Feedvalidator.org and Userland’s validator)… in fact, he sent direct links to validate his feed. Had I not seen an empty doc, I probably would have run the feed through a validator myself; I should have done it anyway. Josh also asked what reader I was using. Asking the right questions is half the battle when debugging.

At this point, my spider-senses were tingling. Why would the validators give it a pass? I took another look, and it was still empty. At first I thought an empty doc might somehow be a “valid” feed, but then I got curious and tried pulling it from the command line via curl. I got the feed, not an empty doc. Is that plot-thickening I smell?

So what did I know at this point? Both my browser (Firebird) and my aggregator (Bloglines.com) were failing to get the feed. Two web-based validators and a command-line retrieval tool were getting it just fine. This seemed to indicate that the webserver was sending back different responses based on some criteria. My first thought was browser sniffing – checking the HTTP ‘User-Agent’ header to determine the browser type. A great tool for investigating HTTP headers is Daniel Savard’s Live HTTP Headers plugin for Mozilla and Firebird. With the plugin running, I again requested the RSS feed and reviewed the request. Curl provides an option to fake the User-Agent, so I copied Firebird’s User-Agent string and tried again. I still got the full RSS document. I also tried faking the HTTP ‘Accept’ header, which allows a user agent to tell the server what document formats it can accept, by copying it from Firebird. Again, curl succeeded where Firebird failed.
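For the record, the faked-header requests looked roughly like this (the header values are abbreviated here; the real ones were copied verbatim out of Live HTTP Headers):

curl -A "Mozilla/5.0 (...) Gecko/... Firebird/..." http://jhoover.weblogdog.com/xml-rss2.php
curl’s -A option sets the User-Agent header, so the server sees the request as coming from Firebird.
curl -H "Accept: text/xml,application/xml,...,*/*;q=0.1" http://jhoover.weblogdog.com/xml-rss2.php
-H sets an arbitrary request header; here it substitutes Firebird’s Accept header for curl’s default.

Both variations still returned the full feed.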

I was running out of ideas, so I tried switching tools. I fired up Safari and retrieved the feed URL again. Safari didn’t give me an empty page… it gave me an error (and a request to submit a bug report):

Could not open the page “http://jhoover.weblogdog.com/xml-rss2.php”
The error was: “bad server response”
(NSURLErrorDomain:-1011)

This was interesting… if Mozilla had reported an error, I might have saved the time I spent assuming the document was merely empty. This is also a good debugging tip: don’t assume your tools are perfect. In fact, Firebird was one of the products showing the original buggy behavior; another reason to cross-check with a different tool. Grasping for one more straw, I fired up NetNewsWire Lite, the aggregator I used before I switched to Bloglines. NNWL retrieved the feed without errors.

Stumped, I decided to drop back 10 and punt. I emailed Josh with everything I had found, and mentioned that I still felt the problem had something to do with HTTP headers. I hoped that something I had learned would trigger an idea for him. And it did. Collaboration is another key tool in debugging. At work, when I ask another developer to help me figure out a bug, more than half the time I figure it out just by trying to describe the issue out loud.

My mention of HTTP headers caused Josh to suggest his use of gzip compression on his site as a possible culprit (gzip compression lets a webserver compress everything it sends to a browser, to save bandwidth). In his own words:

Could my feed be doing something weird because I use GZIP compression across my site? I know some RSS reader apps on Windows can’t handle GZIP, but I can’t see why a browser couldn’t.

Bingo. When I read this, I was pretty sure Josh had solved the problem. I suspected that the problem was not that some programs (like my browser) were unable to handle gzip, but precisely that they were able to handle it. Not all user agents can handle gzip, so webservers aren’t allowed to assume – they only use gzip if the user agent states up front (in the request) that it can take gzip. If Josh’s gzip were broken, only user agents that asked for gzip (like my browser) would see the problem.
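In HTTP terms, the negotiation looks something like this (trimmed to just the relevant headers):

GET /xml-rss2.php HTTP/1.1
Host: jhoover.weblogdog.com
Accept-Encoding: gzip

HTTP/1.1 200 OK
Content-Encoding: gzip

The user agent volunteers that it can take gzip; the server’s response declares that the body is gzip’d. A user agent that never sends Accept-Encoding gets plain text and never notices a problem. A user agent that does send it, and then receives a body that isn’t really compressed despite the Content-Encoding header, chokes.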

But theories don’t fix bugs. It was time for some testing. I used curl to test various scenarios; the tests I tried are outlined below.

curl http://jhoover.weblogdog.com/xml-rss2.php
Requests the rss, no compression. I saw the feed.
curl -i -H "Accept-Encoding: gzip" http://jhoover.weblogdog.com/xml-rss2.php
Requests rss, asks for gzip. -i makes it print the HTTP response headers. In the headers I saw “Content-Encoding: gzip”, but the rss feed was plain text. Aha! Just to be sure, I next tried a site I know supports gzip:
curl -i -H "Accept-Encoding: gzip" http://diveintomark.org
Requests homepage of diveintomark.org, with gzip (if avail), and prints response headers. In the output, I saw the headers (including “Content-Encoding:gzip”), but the body was binary. This is what I’d expect gzip’d content to look like.
curl -H "Accept-Encoding: gzip" http://diveintomark.org | gunzip -
Requests same page, still in gzip, omits headers from output (I only wanted the gzip’d stuff at this point), and pipes it into gunzip, which unzips gzip content. The output was the html of the page I requested. So now I knew how to successfully retrieve and unencode gzip’d content.
curl -H "Accept-Encoding: gzip" http://jhoover.weblogdog.com/xml-rss2.php | gunzip -
Requests the problem feed, with gzip, no headers, and pumps it through gunzip. This should work if the feed is really gzip’d; I had already confirmed that the server claimed it was, in the second test above, when I looked at the headers. The output was “gunzip: stdin: not in gzip format”. Looks like we have a smoking gun: if gzip was requested by the user agent, the feed’s response headers claimed the content was gzip’d, but it wasn’t… it was really plain text. Just to be thorough, I checked Josh’s homepage next.
curl -i -H "Accept-Encoding: gzip" http://jhoover.weblogdog.com
Just like the first diveintomark.org test, the header indicated “Content-Encoding: gzip” and the content was binary. Good. So the final test:
curl -H "Accept-Encoding: gzip" http://jhoover.weblogdog.com | gunzip -
Worked fine… Josh’s home page appeared to be correctly gzip’d. It was only the rss feed that failed. I also did some checking with ‘deflate’ encoding (an alternative to gzip), and got the same results… the feed’s headers indicated deflate encoding was sent, but the content was plain text (uncompressed).
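The deflate check followed the same pattern; roughly (a sketch, since I didn’t keep the exact command):

curl -i -H "Accept-Encoding: deflate" http://jhoover.weblogdog.com/xml-rss2.php
The response headers claimed “Content-Encoding: deflate”, but the body was still plain-text RSS.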

At this point, I had found the bug, but it wasn’t fixed. I sent my findings to Josh (just once I wish one of my users at work – not you, dear reader – would send me this kind of bug report!). Josh took the correct next step: research. He checked the website for the software he uses (Nucleus), and found that the gzip plugin had recently been updated, with the following changelog info:

NP_Gzip v2.0 (December 28, 2003)
* RSS feeds were broken when this plugin was installed. This was fixed by no longer compressing those.

He installed the updated plugin, and now everything works.

Working on this was a lot of fun for me… which isn’t always the case with debugging. Collaborating via email required me to document a process I normally wouldn’t, and that got me thinking about the value of such an endeavor. Hopefully it will be valuable for others as well. As always, comments are welcome. And thanks again to Joshua Hoover for allowing me to publish all of this.

Musical Chairs

Recently I wanted to move some MP3s from my work PC to my Powerbook. My Powerbook was not on the network, and I didn’t really want to connect it. I did have a 128MB USB thumb drive handy, and decided to move the files in batches. But how best to collect them? I use iTunes on both machines, but iTunes organizes the files into per-artist and per-album directories. Since I was going to be making a few trips (multiple batches), I didn’t want to try to copy the directories; once you run out of space, it’s a pain figuring out what was copied and what wasn’t.

Playing a hunch, I fired up iTunes on the Windows PC, and opened the thumb drive in Explorer. All of the songs I wanted to copy were in one playlist, so finding them was easy. I then selected a bunch, and dragged them from iTunes and dropped them in Explorer. iTunes Did The Right Thing and copied files onto the drive. When the drive was full, I pulled it out and plugged it into the Powerbook. I opened the mounted drive in Finder, selected all the MP3s, and dragged them onto the Library icon in iTunes. Again, iTunes copied the files… this time, into the iTunes directory tree.

A few more round-trips, and all my music was moved. With a bigger device, I could have made it in one trip. And everything Just Worked the way you would expect (and want). Very Nice.