Archive for the 'Perl' Category

Note: I've reorganized this site to use tags; the category archive remains to support old links. Only posts prior to April, 2006 are categorized.

Wisdom of the Documentation

Today, I spent some time staring at an old piece of code that I had written at least a year ago. It’s been in testing several times, but never put into production (the project it is tied to has been bumped on several occasions). Today, it was back in testing.

The code is a fairly simple web service, written in Perl. I like Perl. I have no illusions that I’m a fantastic Perl hacker, but I know the language well, through both experience and reading. I’ve read most of the O’Reilly Perl titles, including Programming Perl (“The Camel”), which I’ve read cover to cover at least three times. I still find myself looking things up, usually to refresh my memory about something I can remember reading, or some syntax detail I can’t get right (one of the perils of working in multiple languages). At least I generally know where to look.

So this web service has been tested before. It works in a browser, and it works when called by my test client. It’s been tested with a third-party bit of code. Today, it was tested by Dave, using some custom client code he had written in C#. And it worked… if he told his http library to ignore HTTP protocol errors. If he didn’t, his library complained.

And so I stared at the code for a while. By coincidence, I’d been reading the HTTP Spec over the weekend (yes, I’m a geek), and was pretty sure my response was good: a bare-minimum response, along the lines of:

HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8

Single Line Response

I double checked the spec anyway, and kept staring at the code. I was about to start grasping at straws and adding additional entity headers to the response (such as Content-Length), when I finally stared at the code long enough. I saw something like this:

print "HTTP/1.1 $status\n";
print "Content-Type: text/plain; charset=utf-8\n";

Then it hit me: "\n" in perl is a “magic newline”; it conforms to the newline convention on the system in question. HTTP, on the other hand, requires ASCII CR+LF (Carriage Return + Line Feed, or "\r\n" in C) as a line terminator. Apparently all of the code thrown at the service before today was a bit forgiving. I changed the strings to send CRLF using octal escape sequences ("\015\012"), and everything was fine. I was a bit ticked about the mistake… I knew both the HTTP requirement for CRLF and the Perl treatment of "\n" when I originally wrote the code; it was a dumb mistake. It was also aggravating that it took so long to spot.
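A minimal sketch of the fix described above, assuming the response is assembled as one string before printing. The helper name http_response() is hypothetical and is not taken from the original service; only the octal escapes and the header lines come from the post.

```perl
use strict;
use warnings;

# Explicit CR+LF bytes; "\015\012" means the same thing on every platform.
my $CRLF = "\015\012";

# http_response() is a hypothetical helper, not code from the original service.
# It joins the status line, one header, an empty line, and the body with CRLF.
sub http_response {
    my ($status, $body) = @_;
    return join($CRLF,
        "HTTP/1.1 $status",
        "Content-Type: text/plain; charset=utf-8",
        "",                 # empty line separates headers from body
        $body,
    );
}

binmode(STDOUT);            # keep the I/O layer from translating line endings
print http_response("200 OK", "Single Line Response");
```

Keeping the terminator in one variable means a stray "\n" is easier to spot in review than it was in the original print statements.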

And there my tale should end. But this evening, I started wondering if the octal sequence was the most Perlish way to send a CRLF. I knew that "\r" is fine for the CR, but you can’t use "\n" for the LF – it’s magic in perl, and behaves differently on different platforms. I began to wonder if Perl has a backslash-escape for LF that is always LF. Eventually, I had to check for myself, so I referred to the Quote and Quote-like Operators section of the perlop man page. (Sadly, I knew right where to look, right down to the name of the section. Geek, remember?)

Turns out the manpage specifically recommends the octal form for networking applications (at least I got that right), but then it twists the knife:

If you get in the habit of using "\n" for networking, you may be burned some day.

D’oh.

Stupid Perl Debugger Tricks

I’ve been playing with Python a lot lately in my spare time, but I still use mostly Perl at work. One of the handy things about Python is the interactive mode; I like it so much I even cloned it for Perl some time ago. Even without my Perlthon script, you can get a quick approximation in perl using the perl debugger and a command-line script:

perl -de1

(That’s a one, not an el.) The above will invoke perl with the debugger (-d), debugging a very simple script (-e1, which is to say -e '1;'). Once the debugger starts, you can just type perl statements, and can use x <expr> to inspect values.

Whichever way you play with interactive Perl, testing regular expressions can be a pain. It’s not too bad under Python, given Python’s use of match objects:

import re
re.search('regex', 'string').group(0)

The second line runs the regex against the string, and prints the entire match. If your regex doesn’t match at all, you get an error, but that’s fine (and self explanatory). If your regex doesn’t perform as expected, repeated attempts make it easy to triangulate. If you have Python’s readline module installed, you can just hit UpArrow after the test, tweak the regex, lather, rinse, repeat. I wanted the same flexibility with interactive Perl; it turns out to be trivial:

# perl debugger version
x ('string' =~ /regex/, $&)[-1]

# perlthon version
('string' =~ /regex/, $&)[-1];

The regex match operation will return a list of matched groups, which can be handy at times. For testing a complicated regex, I often just want to see the whole match to be sure I’m getting what I expect. The array notation accomplishes this nicely.
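The same check can be wrapped up for use in an ordinary script. This is only a sketch: whole_match() is a hypothetical helper name, and the sample string and pattern are made up for illustration.

```perl
use strict;
use warnings;

# whole_match() is a hypothetical helper: it returns the entire matched
# substring ($&), or undef when the regex does not match at all.
sub whole_match {
    my ($string, $regex) = @_;
    return $string =~ $regex ? $& : undef;
}

# Made-up sample data: pull the time out of a timestamp.
print whole_match('2004-09-29 17:38', qr/\d+:\d+/), "\n";
```

Run as a script, this prints 17:38, the whole match, regardless of how many capture groups the pattern contains.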

Authentication Headers and IIS

After spending more time than I care to admit working on a problem I actually solved in May, I decided I’d better blog this before I forget again.

Using ActivePerl, it is possible to run perl CGI scripts under Microsoft IIS. In a default installation, the extension to use is .plx (not .cgi), which is mapped to run under PerlIS.dll, the “Perl for ISAPI” implementation. Works pretty well. The issue I had was with HTTP Authentication. If you want to handle your own authentication in a CGI script, you can check the environment variable HTTP_AUTHORIZATION. For example:

use MIME::Base64;  #provides decode_base64
binmode(STDOUT, ":utf8");  #you know you should

my $have_authinfo = (defined($ENV{HTTP_AUTHORIZATION}) 
    and (substr($ENV{HTTP_AUTHORIZATION},0,6) eq 'Basic '));

my ($user, $pass) = ('','');
if ($have_authinfo) {
    my $decoded = decode_base64(substr($ENV{HTTP_AUTHORIZATION},6));

    if ($decoded =~ /:/) {
        ($user, $pass) = split(/:/, $decoded, 2);  #limit 2: the password may contain ':'
    } else {
        $have_authinfo = 0;
    }
}

if (!$have_authinfo or !Authorize($user, $pass)) {
    print << "EOF";
HTTP/1.1 401 Authorization Required
WWW-Authenticate: Basic realm="Example.com"
Content-Type: text/plain; charset=utf-8

You must supply valid credentials to access this resource

EOF
    close STDOUT;
    exit 0;
}

#Authorized, continue with web page....

sub Authorize {
    my ($user, $pass) = @_;
    #Do something to authenticate, return true/false
}

Of course, I’m relying on Basic authentication, which you should only do if your script will only be available via HTTPS (as is mine). The whole thing is dependent on $ENV{HTTP_AUTHORIZATION}, which by default won’t actually get passed to your script under IIS and PerlIS.
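To see what actually arrives in that header, here is a small round-trip sketch. It assumes MIME::Base64 is available; parse_basic_auth() is a hypothetical helper name, and the user name and password are made-up values.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# parse_basic_auth() is a hypothetical helper: it takes the raw header value
# and returns (user, pass), or an empty list if the header is not usable.
sub parse_basic_auth {
    my ($header) = @_;
    return unless defined $header && $header =~ /^Basic (.+)$/;
    my $decoded = decode_base64($1);
    return unless $decoded =~ /:/;
    return split(/:/, $decoded, 2);   # limit 2: the password may contain ':'
}

# What a client would send for user "alice", password "s3cr:et" (made up).
# The second argument to encode_base64 suppresses the trailing newline.
my $header = 'Basic ' . encode_base64('alice:s3cr:et', '');
my ($user, $pass) = parse_basic_auth($header);
print "$user / $pass\n";
```

The credentials are only base64-encoded, not encrypted, which is why the HTTPS-only caveat matters.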

Fortunately, fixing this is simple, if a bit non-obvious. In the IIS Management Console, navigate to the folder containing your script, and select the script. Right-click and choose properties. On the File Security tab of the properties dialog, click the Edit button under “Anonymous Access”. On the next dialog, make sure that “Anonymous Access” is checked and that no other authentication method is checked. By default, Windows Integrated Authentication is selected, which makes IIS snoop around the header, and apparently lose it.

For multiple scripts, put them all in one location (go on, call it cgi-bin), and make the same changes above to the whole folder. New scripts created in this folder should inherit the settings.

Perlthon

I’ve been writing a lot more Perl at work lately, which suits me fine. With my Perl still a bit rusty, however, I found myself dashing off lots of -e one-liners to test various bits. After a while, I wanted a quicker way to test things…an interactive mode such as Visual Basic’s immediate window or Python’s interactive mode.

Perl being perl, this sort of thing is crazy easy. As most perl programmers know, or can quickly figure out, the fastest way to get an interactive perl session is:

perl -ne 'eval;'

which will eval every line of input until input ends (CTRL+D on *nix systems) or you type exit (as exit is a perl builtin which does just that). This command has no explicit output; you’ll need to include your own print calls to see output. Still… quick, dirty, and handy. A slight tweak if you’ll be printing most everything you feed it:

perl -ne 'print eval; print "\n"'

This autoprints everything. Note the use of two print statements; this is on purpose. If instead we use print eval() . "\n", the eval will be called in scalar context. Try printing localtime; if you see a nicely formatted date, scalar context is to blame. If you want such a thing, just print scalar localtime. Best of both worlds. The other choice for combining print statements is print eval, "\n". This calls eval in list context; however, it also passes "\n" to print as part of the list. This means that using:

$,=","; localtime;

to print the localtime array with commas will add a comma to the end of the line as well (before the "\n").
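The context difference can be reproduced in a standalone script. This is a sketch using a hypothetical stand-in, context_demo(), with fixed values so the output is repeatable; it mimics localtime only in that it returns a list in list context and a single string in scalar context.

```perl
use strict;
use warnings;

# context_demo() is a hypothetical stand-in for localtime: a list in list
# context, a single string in scalar context.
sub context_demo {
    return wantarray ? (10, 38, 17) : "scalar result";
}

# The "." operator forces scalar context on the eval.
my $dotted = eval('context_demo()') . "\n";

# A comma keeps list context, but "\n" becomes one more list element.
my @listed = (eval('context_demo()'), "\n");

{
    local $, = ",";     # output field separator, as in the post
    print $dotted;      # prints: scalar result
    print @listed;      # prints: 10,38,17, followed by the newline
}
```

The second print shows the stray comma before the newline, which is the reason for using two separate print statements in the one-liner.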

Never content to leave well enough alone, however, I wanted more. Command history and in-line editing. Multi-line command entry (a la Python). A help function. A quit function that doesn’t (a la Python). You know, toys. The result is Perlthon, an interactive Perl session that works like Python’s interactive mode. It’s easier to show than to tell, so here’s a sample session:

$> perlthon
Perlthon running Perl 5.8.0
Type "help;" for help, "exit;" or press CTRL+D to exit
>>> help;
Perlthon, the Interactive Perl Interpeter v0.1
by Jason Clark  <jason@jclark.org>
This is Free Software.

Enter commands for perl to evaluate, a la interactive Python.
Lines without a ; are continued on next input.
By default, the result of each evauation is printed.  To disable, 
use this: "$AUTOPRINT=0;"

Use "exit;" or CTRL+D to exit.

Prompts are controlled by $PROMPT1 and $PROMPT2, which you can change.

>>> localtime;
59371729810432721
>>> $, = ",";
,
>>> localtime;
10,38,17,29,8,104,3,272,1
>>> scalar(
...    localtime
... );
Wed Sep 29 17:38:30 2004
>>> quit;
Use "exit;" or CTRL+D to exit.
1
>>> exit;
$>

Bugs: Oh, you betcha. Weird behavior when Term::ReadLine has to fake it and you’ve changed $,. Also, because Perlthon looks for ; at the end of line to know when it’s time to eval, entering a multiline sub is a pain. You can beat it by ending each line with a comment (#). Of course, this could be considered another bug… semi-coloned lines ending with comments don’t run unless the comment ends with a semicolon. I’d like to have a block mode (if inside {} then ; doesn’t end multiline input), but this is trickier than it looks. Consider: do { #comment with a } foo; }

But hey, works for now.
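For illustration, the buffer-until-semicolon idea can be sketched in a few lines. This is not the Perlthon source: run_session() is a hypothetical name, there is no prompt, history, or help, and the input comes from a scripted string rather than a terminal.

```perl
use strict;
use warnings;

# run_session() is a hypothetical sketch, not the Perlthon source.
# It reads lines from $in, buffers them until a line ends with ";",
# then evals the buffer and collects each result as a string.
sub run_session {
    my ($in) = @_;
    my ($buffer, @results) = ('');
    while (my $line = <$in>) {
        $buffer .= $line;
        next unless $line =~ /;\s*$/;   # no trailing ";": keep reading
        my @value = eval $buffer;       # list context, like "print eval"
        push @results, $@ ? "ERROR: $@" : join('', @value);
        $buffer = '';
    }
    return @results;
}

# Feed it a scripted "session" through an in-memory filehandle.
my $input = "1 +\n2;\nuc(\n  'perl'\n);\n";
open my $in, '<', \$input or die "Can't open in-memory file: $!";
print "$_\n" for run_session($in);
```

The end-of-line test is the same one that causes the comment bug described above: a line such as foo; #comment does not end with a semicolon, so the buffer keeps growing.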

Howto treat a Perl Scalar like a Filehandle

This morning, for the second time in as many weeks, I banged my head against a Perl problem I thought should be simple: I wanted to open a filehandle against a scalar (or perform some other chicanery) so that if I have a scalar $text full of text, I could do this:

while (<$fh>) {
    #do something with $_
}

Where the filehandle $fh would refer to the contents of the scalar $text. This seems like an obvious thing someone may want to do. While you could just as easily do something like foreach (split /\n/,$text) { #... }, I had a situation where I might have data in a scalar or in a file (or even STDIN) and I wanted to treat them all the same. I expected this would be covered in the Camel and/or the Cookbook, but I couldn’t find any such thing. I didn’t have much luck on the web either. In the end, I took the low road and faked it with a system call and a pipe, since the script I was working on is infrequently used. Here’s that version:

open(FH, qq[echo "$text" |]) or die "Can't pipe.";
while (<FH>) {
    #...
}

When the same problem came up this morning in a web service I’m working on, I decided to try researching it again, since I really didn’t like the pipe solution. After a lot of digging, I came across the perl module IO::Scalar, which lets you do exactly what I wanted. The IO::Scalar version of the code looks like this:

use IO::Scalar;
my $fh = IO::Scalar->new(\$text);
while (<$fh>) {
    #...
}

My test app worked nicely on the Unix development server, but then I realized I still had a problem. For some period of time, this web service will be running on a Windows 2000 web server (don’t ask). IO::Scalar is not part of the standard Perl distro. I’m using ActiveState’s ActivePerl, which makes installing modules via perl -MCPAN -e shell, well, challenging. ActiveState has a nice Perl Package Manager; unfortunately I could find no ready-made package for IO::Scalar, so I was stuck. While stumbling around the ActiveState site looking for inspiration, I found PerlIO::scalar. Now I was on to something.

(A moment for a side note here. I would write far fewer lines of Perl code per hour if not for the fantastic Perldoc.com maintained by Carlos Ramirez. It is an absolutely indispensable resource for me. However, it’s a bit buggy. I can’t get it to let me search the perl 5.8.0 docs, and the 5.8.4 docs appear incomplete (missing standard modules). After finding the info on ActiveState’s site, I figured out how to get to it on Perldoc.com.)

According to the docs for PerlIO::scalar:

PerlIO::scalar only exists to use XSLoader to load C code that provides support for treating a scalar as an “in memory” file.

The docs on the ActiveState site also note that it’s not necessary to load PerlIO::scalar explicitly with use. This reduces the code to the following:

open my $fh, "<", \$text or die "Can't open scalar: $!";
while (<$fh>) {
    #...
}

Excellent. Not only is it concise and easy to use, it’s part of the standard distro. I’ve documented this not only for my own future reference, but to add my bit to the world’s largest help database. I just hope I wasn’t being totally obtuse, to later find that this is Camel page 52 material.
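A short sketch of the original goal, treating a scalar and a file the same way. The helper names open_source() and count_nonblank() are hypothetical, as is the sample text; the only technique relied on is the three-argument open with a scalar reference shown above.

```perl
use strict;
use warnings;

# open_source() is a hypothetical helper. Given a scalar reference, open
# creates an in-memory filehandle; given a plain string, it opens that file.
sub open_source {
    my ($source) = @_;
    open my $fh, '<', $source or die "Can't open source: $!";
    return $fh;
}

# count_nonblank() is a hypothetical consumer: it only sees a filehandle and
# does not care where the data lives. \*STDIN could be passed in as well.
sub count_nonblank {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ /\S/;
    }
    return $count;
}

# Made-up sample data held in a scalar.
my $text = "first line\n\nthird line\n";
print count_nonblank(open_source(\$text)), "\n";
```

With the sample text this prints 2, since the middle line is blank. Passing a filename instead of \$text would run the same loop over a file on disk.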