Archive for the 'Programming' Category

Note: I've reorganized this site to use tags; the category archive remains to support old links. Only posts prior to April 2006 are categorized.

Wisdom of the Documentation

Today, I spent some time staring at an old piece of code that I had written at least a year ago. It’s been in testing several times, but never put into production (the project it is tied to has been bumped on several occasions). Today, it was back in testing.

The code is a fairly simple web service, written in Perl. I like Perl. I have no illusions that I’m a fantastic Perl hacker, but I know the language well, through both experience and reading. I’ve read most of the O’Reilly Perl titles, including Programming Perl (“The Camel”), which I’ve read cover to cover at least three times. I still find myself looking things up, usually to refresh my memory about something I can remember reading, or some syntax detail I can’t get right (one of the perils of working in multiple languages). At least I generally know where to look.

So this web service has been tested before. It works in a browser, and it works when called by my test client. It’s been tested with a third-party bit of code. Today, it was tested by Dave, using some custom client code he had written in C#. And it worked… if he told his HTTP library to ignore HTTP protocol errors. If he didn’t, his library complained.

And so I stared at the code for a while. By coincidence, I’d been reading the HTTP spec over the weekend (yes, I’m a geek), and was pretty sure my response was good: a bare-minimum response, along the lines of:

HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8

Single Line Response

I double checked the spec anyway, and kept staring at the code. I was about to start grasping at straws and adding additional entity headers to the response (such as Content-Length), when I finally stared at the code long enough. I saw something like this:

print "HTTP/1.1 $status\n";
print "Content-Type: text/plain; charset=utf-8\n";

Then it hit me: "\n" in Perl is a “magic newline”; it conforms to the newline convention on the system in question. HTTP, on the other hand, requires ASCII CR+LF (Carriage Return + Line Feed, or "\r\n" in C) as a line terminator. Apparently all of the code thrown at the service before today was a bit forgiving. I changed the strings to send CRLF using octal escape sequences ("\015\012"), and everything was fine. I was a bit ticked about the mistake… I knew both the HTTP requirement for CRLF and the Perl treatment of "\n" when I originally wrote the code; it was a dumb mistake. It was also aggravating that it took so long to spot.
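For the record, here is a minimal sketch of the corrected approach. It is not the actual service code: build_response() is a made-up helper name, and the status and body are just the sample values from above. The point is only that every line terminator is spelled as octal bytes, never as "\n".

```perl
use strict;
use warnings;

# CRLF written with octal escapes, so it is always bytes 13 and 10,
# whatever "\n" happens to mean on the local platform.
my $CRLF = "\015\012";

# build_response() is a hypothetical helper, not part of the real service.
sub build_response {
    my ($status, $body) = @_;
    return join($CRLF,
        "HTTP/1.1 $status",
        "Content-Type: text/plain; charset=utf-8",
        "",        # the empty line that ends the header block
        $body,
    );
}

my $response = build_response("200 OK", "Single Line Response");

binmode(STDOUT);   # keep Perl from translating line endings on output
print $response;
```

The binmode call matters on platforms whose text mode rewrites line endings; without it the carefully built CRLF pairs could be altered on the way out.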

And there my tale should end. But this evening, I started wondering if the octal sequence was the most Perlish way to send a CRLF. I knew that "\r" is fine for the CR, but you can’t use "\n" for the LF - it’s magic in perl, and behaves differently on different platforms. I began to wonder if Perl has a backslash-escape for LF that is always LF. Eventually, I had to check for myself, so I referred to the Quote and Quote-like Operators section of the perlop man page. (Sadly, I knew right where to look, right down to the name of the section. Geek, remember?)

Turns out the manpage specifically recommends the octal form for networking applications (at least I got that right), but then it twists the knife:

If you get in the habit of using "\n" for networking, you may be burned some day.

D’oh.

HOWTO See what’s changed in the file you’re editing in Emacs

Warning: Geek threshold exceeded. If you don’t know what Emacs is, this post will mean nothing to you.

I use Emacs as my primary editor these days, and I tend to have lots of buffers open at once. Every so often, I’ll go to close Emacs or just close some buffers, only to be alerted “Buffer XYZ modified; kill anyway? (y or n)”.

What? I opened that buffer 3 days ago. I don’t know if I should save those changes… what changed? Now, my copy of XEmacs has ediff, a very nice interactive diff tool. You can diff two buffers, two files, three files, buffers against revisions (if the file is under source code control), etc. What you can’t do is diff a buffer against the underlying file on the file system.

Now, I could save the buffer in question to a temporary location, and then ediff that against the original. But where’s the elegance in that? I wanted a better solution, so I asked Google.

I found the answer (well, most of it) in a 1996 post to comp.emacs by Larry Rosenberg. He offered a function for doing a quick diff of the current buffer against the underlying file system version. I bound this to C-c d in Emacs. I then expanded it just a bit. I often use context diffs (diff -c), so I wanted the option, but for long lists of diffs, sometimes it’s just too much. So, I made it an option of the command. Invoked normally, it shows a plain diff. When prefixed with C-u (using my binding, this becomes C-u C-c d), it runs a context diff.

Just to be clear - Larry did all the real work back in 1996. But considering my understanding of elisp is only slightly better than my understanding of Sanskrit, I’m pleased with my modification.

Here’s the function definition, along with the keybinding, from my emacs init file:

(defun diff-buffer-against-file (context)
    "Diff the current [edited] buffer against the file of the same name.
With a prefix argument (C-u), run a context diff (diff -c)."
    (interactive "P")
    (let (  ($file buffer-file-name)
            ($tempFile "/tmp/emacs.diff")
            ($tempBuffer "emacs.diff"))
        (delete-other-windows)
        (push-mark (point) t)
        ;; scratch buffer that receives a copy of the edited text
        (get-buffer-create $tempBuffer)
        (copy-to-buffer $tempBuffer (point-min) (point-max))
        (set-buffer $tempBuffer)
        (write-file $tempFile)
        (shell-command (concat (if context "diff -c " "diff ") $file " " $tempFile))
        ;; kill the scratch buffer by its buffer name, not by the file path
        (kill-buffer $tempBuffer)
        (pop-mark)
    )
)

(global-set-key "\C-cd" 'diff-buffer-against-file)

Stupid Perl Debugger Tricks

I’ve been playing with Python a lot lately in my spare time, but I still use mostly Perl at work. One of the handy things about Python is the interactive mode; I like it so much I even cloned it for Perl some time ago. Even without my Perlthon script, you can get a quick approximation in Perl using the Perl debugger and a command-line script:

perl -de1

(That’s a one, not an el.) The above will invoke perl with the debugger (-d), debugging a very simple script (-e1, which is to say -e '1;'). Once the debugger starts, you can just type perl statements, and can use x <expr> to inspect values.

Whichever way you play with interactive Perl, testing regular expressions can be a pain. It’s not too bad under Python, given Python’s use of match objects:

import re
re.search('regex', 'string').group(0)

The second line runs the regex against the string, and prints the entire match. If your regex doesn’t match at all, you get an error, but that’s fine (and self-explanatory). If your regex doesn’t perform as expected, repeated attempts make it easy to triangulate. If you have Python’s readline module installed, you can just hit UpArrow after the test, tweak the regex, lather, rinse, repeat. I wanted the same flexibility with interactive Perl; it turns out to be trivial:

# perl debugger version
x ('string' =~ /regex/, $&)[-1]

# perlthon version
('string' =~ /regex/, $&)[-1];

In list context, the regex match operation returns the list of captured groups, which can be handy at times. For testing a complicated regex, I often just want to see the whole match to be sure I’m getting what I expect. Appending $& to the list and taking the last element with [-1] accomplishes this nicely.
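The same trick works in an ordinary script. The sketch below is only an illustration: whole_match() is a hypothetical helper name, and the strings and patterns are made up. One caveat worth knowing: after a failed match, $& still holds the result of an earlier successful match, so the helper tests the match before trusting it.

```perl
use strict;
use warnings;

# whole_match() is a hypothetical helper name, not part of Perl or Perlthon.
sub whole_match {
    my ($string, $regex) = @_;
    # After a failed match $& would still hold an earlier match, so test first.
    return $string =~ $regex ? $& : undef;
}

# The list-slice trick itself, written out in a script:
my $quick = ('hello world' =~ /o w\w+/, $&)[-1];
print "$quick\n";    # prints "o world"

# The guarded version:
my $match = whole_match('hello world', qr/l+o/);
print defined $match ? "$match\n" : "no match\n";    # prints "llo"
```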

Python unittest Greenbar Without the GUI

Lately I’ve been learning Python (again), and trying out Unit Testing using Python’s unittest module. In many unit testing tools, each time the test suite is run, a progress bar is shown, which is green as long as the tests pass, but turns red when a test fails. Passing your tests is called “getting a greenbar,” and is the mark of success.

I’ve been working on my Powerbook, using the default Python install. I work with two terminal windows open; one running Emacs, and the other to run my tests in. The unittest module provides a command line method of running all tests, with output appearing on the terminal. I wanted to have a greenbar/redbar indicator, but none is present.

There is a GUI version, unittestgui, available which uses tkinter, but doesn’t ship with OS X. Google turned up a Cocoa app for OS X for running Python unit tests, but it hasn’t been updated in a few years, and just crashes on my Powerbook.

Since I prefer to develop in a terminal session anyway, I decided I wanted a way to run my unit tests from the shell, and see the red/green effect right in the terminal. Thus I created utest, a Bash script that runs a unittest, and changes the display text color using ANSI escape codes based on the output of the test. Since unittest prints all testing output to STDERR, I redirect it to STDOUT and pipe the whole shooting match into a Perl one-liner that parrots everything, while setting colors as appropriate. Output starts green, but turns red if the output contains FAIL or ERROR. It also unbuffers standard out, so the results appear correctly.

Here it is in action, when all tests pass:

[Image: utest with passed tests]

And here it is, with a failing test:

[Image: utest with failed test]

If you can’t see these images, you’re probably using IE, so don’t come crying to me. You know what to do. If you’re wondering about my tabbed terminal, it’s iTerm.
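The coloring step described above can be sketched roughly as follows. This is not the actual utest script: colorize_stream() is a hypothetical name, and the sample test output is invented so the example runs without a real test suite. It reads from and writes to in-memory filehandles so the result can be inspected.

```perl
use strict;
use warnings;

# ANSI escape codes: green text, red text, and reset to the default color.
my ($GREEN, $RED, $RESET) = ("\e[32m", "\e[31m", "\e[0m");

# colorize_stream() is a hypothetical helper, not the real utest one-liner.
# It parrots every line, starting green and switching to red at the first
# line that contains FAIL or ERROR.
sub colorize_stream {
    my ($in, $out) = @_;
    print {$out} $GREEN;
    while (my $line = <$in>) {
        print {$out} $RED if $line =~ /FAIL|ERROR/;
        print {$out} $line;
    }
    print {$out} $RESET;
}

# Invented sample output standing in for a unittest run.
my $sample  = "test_ok ... ok\ntest_bad ... FAIL\n";
my $colored = '';
open(my $in,  '<', \$sample)  or die "Cannot open in-memory input: $!";
open(my $out, '>', \$colored) or die "Cannot open in-memory output: $!";
colorize_stream($in, $out);
close $out;

print $colored;
```

In real use the helper would be called as colorize_stream(\*STDIN, \*STDOUT) with $| = 1 set first, and the test run piped into it with 2>&1 so that unittest's STDERR output passes through the filter.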

Hackers and Painters

Via Tim Bray, via Tim Bray, via, well, you know: Go and read Hackers and Painters, an essay by Paul Graham. This is a must-read for anyone who considers themselves a hacker, especially those like me who make a living at it. A few choice quotes:

I tended to just spew out code that was hopelessly broken, and gradually beat it into shape. Debugging, I was taught, was a kind of final pass where you caught typos and oversights. The way I worked, it seemed like programming consisted of debugging.

For a long time I felt bad about this, just as I once felt bad that I didn’t hold my pencil the way they taught me to in elementary school. If I had only looked over at the other makers, the painters or the architects, I would have realized that there was a name for what I was doing: sketching. As far as I can tell, the way they taught me to program in college was all wrong. You should figure out programs as you’re writing them, just as writers and painters and architects do.

And also:

In hacking, like painting, work comes in cycles. Sometimes you get excited about some new project and you want to work sixteen hours a day on it. Other times nothing seems interesting.

Amen, Brother.

Authentication Headers and IIS

After spending more time than I care to admit working on a problem I actually solved in May, I decided I’d better blog this before I forget again.

Using ActivePerl, it is possible to run Perl CGI scripts under Microsoft IIS. In a default installation, the extension to use is .plx (not .cgi), which is mapped to run under PerlIS.dll, the “Perl for ISAPI” implementation. Works pretty well. The issue I had was with HTTP Authentication. If you want to handle your own authentication in a CGI script, you can check the environment variable HTTP_AUTHORIZATION. For example:

use MIME::Base64;  #provides decode_base64

binmode(STDOUT, ":utf8");  #you know you should

my $have_authinfo = (defined($ENV{HTTP_AUTHORIZATION})
    and (substr($ENV{HTTP_AUTHORIZATION},0,6) eq 'Basic '));

my ($user, $pass) = ('','');
if ($have_authinfo) {
    my $decoded = decode_base64(substr($ENV{HTTP_AUTHORIZATION},6));

    if ($decoded =~ /:/) {
        #limit of 2: the password itself may contain colons
        ($user, $pass) = split(/:/, $decoded, 2);
    } else {
        $have_authinfo = 0;
    }
}

if (!$have_authinfo or !Authorize($user, $pass)) {
    print << "EOF";
HTTP/1.1 401 Authorization Required
WWW-Authenticate: Basic realm="Example.com"
Content-Type: text/plain; charset=utf-8

You must supply valid credentials to access this resource

EOF
    close STDOUT;
    exit 0;
}

#Authorized, continue with web page....

sub Authorize {
    my ($user, $pass) = @_;
    #Do something to authenticate, return true/false
}

Of course, I’m relying on Basic authentication, which you should only do if your script will only be available via HTTPS (as is mine). The whole thing is dependent on $ENV{HTTP_AUTHORIZATION}, which by default won’t actually get passed to your script under IIS and PerlIS.

Fortunately, fixing this is simple, if a bit non-evident. In the IIS Management Console, navigate to the folder containing your script, and select the script. Right-click and choose Properties. On the File Security tab of the properties dialog, click the Edit button under “Anonymous Access”. On the next dialog, make sure that “Anonymous Access” is checked and that no other authentication method is checked. By default, Windows Integrated Authentication is selected, which makes IIS snoop around the header, and apparently lose it.

For multiple scripts, put them all in one location (go on, call it cgi-bin), and make the same changes above to the whole folder. New scripts created in this folder should inherit the settings.

Perlthon

I’ve been writing a lot more Perl at work lately, which suits me fine. With my Perl still a bit rusty, however, I found myself dashing off lots of -e one-liners to test various bits. After a while, I wanted a quicker way to test things…an interactive mode such as Visual Basic’s immediate window or Python’s interactive mode.

Perl being perl, this sort of thing is crazy easy. As most perl programmers know, or can quickly figure out, the fastest way to get an interactive perl session is: perl -ne eval;

which will eval every line of input until input ends (CTRL+D on *nix systems) or you type exit (as exit is a perl builtin which does just that). This command has no explicit output; you’ll need to include your own print calls to see output. Still… quick, dirty, and handy. A slight tweak if you’ll be printing most everything you feed it: perl -ne 'print eval; print "\n"'

This autoprints everything. Note the use of two print statements; this is on purpose. If instead we use print eval() . "\n", the eval will be called in scalar context. Try printing localtime; if you see a nicely formatted date, scalar context is to blame. If you want such a thing, just print scalar localtime. Best of both worlds. The other choice for combining print statements is print eval, "\n". This calls eval in list context; however, it also passes "\n" to print as part of the list. This means that using: $,=","; localtime;

to print the localtime array with commas will add a comma to the end of the line as well (before the "\n").
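To make the context business concrete, here is a small sketch. It uses localtime(0) rather than the current time so the example does not depend on when it runs, and it captures the print output in an in-memory filehandle (my choice, for inspection only) to show the separator landing before the newline.

```perl
use strict;
use warnings;

my $code = 'localtime(0)';   # fixed timestamp instead of "now"

# List context: eval hands back the nine-element list from localtime.
my @fields = eval $code;

# Scalar context: the concatenation forces eval (and so localtime)
# to return a single formatted date string.
my $string = eval($code) . "";

# With $, set, print puts the separator between ALL items in its list,
# including between the last field and the "\n".
my $captured = '';
{
    open(my $out, '>', \$captured) or die "Cannot open in-memory output: $!";
    local $, = ",";
    print {$out} eval($code), "\n";
    close $out;
}

print scalar(@fields), " fields\n";   # prints "9 fields"
print "$string\n";                    # a formatted date; exact text depends on time zone
print $captured;                      # comma-separated fields, with a comma before the newline
```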

Never content to leave well enough alone, however, I wanted more. Command history and in-line editing. Multi-line command entry (a la Python). A help function. A quit function that doesn’t (a la Python). You know, toys. The result is Perlthon, an interactive Perl session that works like Python’s interactive mode. It’s easier to show than to tell, so here’s a sample session:

$> perlthon
Perlthon running Perl 5.8.0
Type "help;" for help, "exit;" or press CTRL+D to exit
>>> help;
Perlthon, the Interactive Perl Interpeter v0.1
by Jason Clark  <jason@jclark.org>
This is Free Software.

Enter commands for perl to evaluate, a la interactive Python.
Lines without a ; are continued on next input.
By default, the result of each evauation is printed.  To disable, 
use this: "$AUTOPRINT=0;"

Use "exit;" or CTRL+D to exit.

Prompts are controlled by $PROMPT1 and $PROMPT2, which you can change.

>>> localtime;
59371729810432721
>>> $, = ",";
,
>>> localtime;
10,38,17,29,8,104,3,272,1
>>> scalar(
...    localtime
... );
Wed Sep 29 17:38:30 2004
>>> quit;
Use "exit;" or CTRL+D to exit.
1
>>> exit;
$>

Bugs: Oh, you betcha. Weird behavior when Term::ReadLine has to fake it and you’ve changed $,. Also, because Perlthon looks for ; at the end of line to know when it’s time to eval, entering a multiline sub is a pain. You can beat it by ending each line with a comment (#). Of course, this could be considered another bug… semi-coloned lines ending with comments don’t run unless the comment ends with a semicolon. I’d like to have a block mode (if inside {} then ; doesn’t end multiline input), but this is trickier than it looks. Consider: do { #comment with a } foo; }

But hey, works for now.
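The core loop is simple enough to sketch. What follows is not Perlthon's actual source, just a stripped-down illustration of the "keep reading until the input ends in a semicolon, then eval" idea; run_session() is a hypothetical name, and it reads from an in-memory filehandle instead of a terminal so it can run unattended. It has no prompts, history, or help.

```perl
use strict;
use warnings;

# run_session() is a hypothetical stand-in, not Perlthon's real main loop.
sub run_session {
    my ($in, $out) = @_;
    my $buffer = '';
    while (my $line = <$in>) {
        $buffer .= $line;
        next unless $buffer =~ /;\s*$/;   # no trailing semicolon yet: keep reading
        my @result = eval $buffer;        # list context, as with "print eval"
        if ($@) {
            print {$out} "ERROR: $@";
        } else {
            print {$out} @result, "\n";
        }
        $buffer = '';
    }
}

# Two commands, each spread over several lines, as in the session above.
my $input  = "1 +\n2;\nscalar(\n  reverse 'abc'\n);\n";
my $output = '';
open(my $in,  '<', \$input)  or die "Cannot open in-memory input: $!";
open(my $out, '>', \$output) or die "Cannot open in-memory output: $!";
run_session($in, $out);
close $out;

print $output;    # prints "3" and "cba" on separate lines
```

The semicolon test is exactly the weak spot described in the bugs list: a line that ends in a comment never looks finished.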

Howto treat a Perl Scalar like a Filehandle

This morning, for the second time in as many weeks, I banged my head against a Perl problem I thought should be simple: I wanted to open a filehandle against a scalar (or perform some other chicanery) so that if I have a scalar $text full of text, I could do this:

while (<$fh>) {
    #do something with $_
}

Where the filehandle $fh would refer to the contents of the scalar $text. This seems like an obvious thing someone may want to do. While you could just as easily do something like foreach (split /\n/,$text) { #... }, I had a situation where I might have data in a scalar or in a file (or even STDIN) and I wanted to treat them all the same. I expected this would be covered in the Camel and/or the Cookbook, but I couldn’t find any such thing. I didn’t have much luck on the web either. In the end, I took the low road and faked it with a system call and a pipe, since the script I was working on is infrequently used. Here’s that version:

open(FH, qq[echo "$text" |]) or die "Can't pipe.";
while (<FH>) {
    #...
}

When the same problem came up this morning in a web service I’m working on, I decided to try researching it again, since I really didn’t like the pipe solution. After a lot of digging, I came across the Perl module IO::Scalar, which lets you do exactly what I wanted. The IO::Scalar version of the code looks like this:

use IO::Scalar;
my $fh = IO::Scalar->new(\$text);
while (<$fh>) {
    #...
}

My test app worked nicely on the Unix development server, but then I realized I still had a problem. For some period of time, this web service will be running on a Windows 2000 web server (don’t ask). IO::Scalar is not part of the standard Perl distro. I’m using ActiveState’s ActivePerl, which makes installing modules via perl -MCPAN -e shell, well, challenging. ActiveState has a nice Perl Package Manager; unfortunately I could find no ready-made package for IO::Scalar, so I was stuck. While stumbling around the ActiveState site looking for inspiration, I found PerlIO::scalar. Now I was on to something.

(A moment for a side note here. I would write far fewer lines of Perl code per hour if not for the fantastic Perldoc.com maintained by Carlos Ramirez. It is an absolutely indispensable resource for me. However, it’s a bit buggy. I can’t get it to let me search the perl 5.8.0 docs, and the 5.8.4 docs appear incomplete (missing standard modules). After finding the info on ActiveState’s site, I figured out how to get to it on Perldoc.com.)

According to the docs for PerlIO::scalar:

PerlIO::scalar only exists to use XSLoader to load C code that provides support for treating a scalar as an “in memory” file.

The docs on the ActiveState site also note that it’s not necessary to use PerlIO::scalar explicitly. This reduces the code to the following:

open(my $fh, "<", \$text) or die "Can't open in-memory filehandle: $!";
while (<$fh>) {
    #...
}

Excellent. Not only is it concise and easy to use, it’s part of the standard distro. I’ve documented this not only for my own future reference, but to add my bit to the world’s largest help database. I just hope I wasn’t being totally obtuse, to later find that this is Camel page 52 material.
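For future me, a complete runnable sketch of the in-memory approach (it needs Perl 5.8 or later). The helper name collect_lines() and the sample text are invented for illustration; the point is that the helper neither knows nor cares whether its filehandle is backed by a file, STDIN, or a scalar.

```perl
use strict;
use warnings;

# collect_lines() is a hypothetical helper: it works on any readable
# filehandle, whatever is behind it.
sub collect_lines {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

my $text = "alpha\nbeta\ngamma\n";

# Passing a reference to a scalar opens an in-memory filehandle on it.
open(my $fh, '<', \$text) or die "Cannot open in-memory filehandle: $!";
my @lines = collect_lines($fh);
close $fh;

print scalar(@lines), " lines, last is '$lines[-1]'\n";   # prints "3 lines, last is 'gamma'"
```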

Perl Module Tips

This is a bit of pre-emptive blogging, for the next time I forget. Those of you with stronger Perl-fu than I already know these things.

How to check if a module is installed:

No output indicates success : perl -MMODULENAME -e1

For example: perl -MHTML::Embperl -e1

How to check the version number of an installed module:

(Assumes the module uses $VERSION, but then, most CPAN modules do) : perl -MMODULENAME -e'print "$MODULENAME::VERSION\n";'

For example: perl -MHTML::Embperl -e'print "$HTML::Embperl::VERSION\n";'

How to determine where a module is installed:

Lists every dir used by the module, including man pages, etc. Should be entered on one line. :

perl -MExtUtils::Installed 
  -e'$,="\n";print ExtUtils::Installed->new()->directories("MODULENAME")," "'

For example:

perl -MExtUtils::Installed
  -e'$,="\n";print ExtUtils::Installed->new()->directories("HTML::Embperl")," "'

This technique relies on ExtUtils::Installed, which is part of the standard Perl distro these days.
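The first two checks can also be done from inside a script. This is a rough sketch under my own naming: module_file(), module_installed() and module_version() are hypothetical helpers, and the second module name in the loop is deliberately bogus. It reports where the .pm file itself was loaded from via %INC, which is narrower than the ExtUtils::Installed directory listing.

```perl
use strict;
use warnings;

# module_file(), module_installed() and module_version() are hypothetical
# helper names, not part of any standard module.
sub module_file {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return $file;
}

sub module_installed {
    my ($module) = @_;
    my $file = module_file($module);
    return eval { require $file; 1 } ? 1 : 0;
}

sub module_version {
    my ($module) = @_;
    return unless module_installed($module);
    return $module->VERSION;   # undef if the module sets no $VERSION
}

for my $module ('File::Spec', 'No::Such::Module::Here') {
    if (module_installed($module)) {
        my $version = module_version($module);
        my $path    = $INC{ module_file($module) };
        $version = 'unknown' unless defined $version;
        print "$module $version loaded from $path\n";
    } else {
        print "$module is not installed\n";
    }
}
```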

It’s Never too Late to Fix Bugs

Time once again to mention my dirty little secret… yes, I’m a Visual Basic programmer. Of course, I also do a lot of Perl, and I’ve done my share of Java, but much of the software I’ve written in the past 10 years has been VB, starting with VB3.0 (I did play with VB 1.0, but not for work). My current major project at work is mostly implemented in VB6.0 (service pack 5) because of the requirement to interact with the Excel and Word application binaries.

Why do I bring this up now? Because Microsoft has released Service Pack 6 for Visual Studio, which includes updates for VB6, VC++ 6, and SourceSafe. I took a look at the list of VB bugs addressed by this update, and found a couple that make this upgrade worthwhile for me.

Knowledge Base article 297112, “BUG: Visual Basic Compiler Pads Embedded Resources to Align on 32-Bit Dword Boundaries”, is an old friend of mine. I have an application that makes use of a number of XSLT stylesheets and W3C Schema Defs (XSD). In order to ease deployment issues I decided to load these documents into the app’s DLL as resources. When I tested this, some documents would work, but others would not. At the time (a couple of years ago), this article was not in the KB (or at least, I was unable to find it). Fortunately, I was able to work out the solution on my own. Every now and then, I forget to check the file sizes when I build a maintenance release, and run into the problem again. Nice to see it’s finally being addressed.

The other fixed bug that caught my attention is KB Article 312218, “BUG: Deadlock in Multithreaded Process If You Use Declare Statements for APIs in Visual Basic ActiveX .dll Files or .ocx Files.” The short version is this: VB6-authored DLLs which use the Declare statement to access API functions can deadlock things like IIS and MTX. If you’re using such a DLL under IIS, it’s running under one of these two executables. I don’t know how long this problem has been documented, but it may be the answer to many untraceable, non-reproducible issues I’ve had. The problems occur on a production IIS webserver that uses a custom COM component written in VB6 for mainframe access. I think I’ll be putting a new build into production soon. This one really ticks me off, since I’ve been having these problems for a long time, and since it affects such core technologies… the ones Microsoft spent so many years convincing me to use (before .NET came along and changed the rules. Again.)

If you, like me, have any ongoing interaction with VB6 or VC++6, give this service pack a look. If not, well, congratulations.