Archive for the 'Perl' Category

Note: I've reorganized this site to use tags; the category archive remains to support old links. Only posts prior to April 2006 are categorized.

Perl Module Tips

This is a bit of pre-emptive blogging, for the next time I forget. Those of you with stronger Perl-fu than I already know these things.

How to check if a module is installed:

No output indicates success : perl -MMODULENAME -e1

For example: perl -MHTML::Embperl -e1

How to check the version number of an installed module:

(Assumes the module uses $VERSION, but then, most CPAN modules do): perl -MMODULENAME -e'print "$MODULENAME::VERSION\n";'

For example: perl -MHTML::Embperl -e'print "$HTML::Embperl::VERSION\n";'
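If you'd rather do both checks from inside a script, something like the sketch below should work. The helper name module_version is my own invention, and it leans on the VERSION method that every package inherits from UNIVERSAL, so it makes the same $VERSION assumption as the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns undef if the
# module cannot be loaded, otherwise its $VERSION, or 'unknown' when the
# module loads but sets no $VERSION.
sub module_version {
    my ($module) = @_;
    # require wants a file name when handed a string: Foo::Bar -> Foo/Bar.pm
    (my $file = "$module.pm") =~ s{::}{/}g;
    return undef unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

# Report on every module named on the command line.
for my $module (@ARGV) {
    my $version = module_version($module);
    print "$module: ", (defined $version ? $version : 'not installed'), "\n";
}
```

Save it under any name you like and pass module names as arguments, e.g. HTML::Embperl.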

How to determine where a module is installed:

Lists every dir used by the module, including man pages, etc. Should be entered on one line. :

perl -MExtUtils::Installed 
  -e'$,="\n";print ExtUtils::Installed->new()->directories("MODULENAME")," "'

For example:

perl -MExtUtils::Installed
  -e'$,="\n";print ExtUtils::Installed->new()->directories("HTML::Embperl")," "'

This technique relies on ExtUtils::Installed, which is part of the standard Perl distro these days.
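If all you need is the path of the .pm file itself, there is a lighter-weight alternative that doesn't involve ExtUtils::Installed at all: Perl records every file it loads in the %INC hash. A minimal sketch, where module_path is a name I made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name and return the path of the
# file it was loaded from, or undef if it cannot be loaded.
sub module_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # HTML::Embperl -> HTML/Embperl.pm
    return undef unless eval { require $file; 1 };
    return $INC{$file};                       # %INC maps file name to full path
}

for my $module (@ARGV) {
    my $path = module_path($module);
    print "$module: ", (defined $path ? $path : 'not installed'), "\n";
}
```

Unlike the directories() call, this only tells you about the one file; man pages and friends are not listed.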

Senseless Acts of Perl

At work I spend a lot of time working on one of our Solaris dev servers via xterm. Via many xterms simultaneously, most of the time. Since I run a local X server on my PC under Cygwin, I have a shell script that I run locally that connects to the dev box and launches three xterms in pre-determined screen locations, setting DISPLAY along the way.

Over the course of a busy morning, this number can grow. Since I’m still on a Windows PC, however, I do tend to use my task bar to find windows. Having six or more taskbar buttons that all say “xterm” isn’t very helpful. For a while I tried setting my titles to reflect what I’m doing in each xterm, but this was futile. Partially because I often create, destroy, or repurpose xterms on a whim; but largely because I’m lazy.

A while ago, I updated my launch script to label my initial three windows Alpha, Beta, and Gamma. While the names aren’t very descriptive, it does differentiate the windows, and I can usually remember what each window is being used for. When I start launching additional xterms, things can get confusing; I try to remember to add a -title and pick a Greek letter not in use, but I did mention I’m lazy, right? So today, I decided to do something about it.

The result is addterm, one of the more senseless Perl scripts I’ve ever bothered with. When run, it creates a new xterm with the title set to the name of the first Greek letter not currently in use. If all 24 Greek letters are in use, an error message is printed and no xterm is launched. This is a feature, not a bug. Close some windows! The version below is my OS X port:

#!/usr/bin/perl -w

chomp(my $user = `whoami`);   # strip the trailing newline
my @ps = split("\n", `ps -o command -U $user`);
my @alpha = qw/Alpha Beta Gamma Delta Epsilon
               Zeta Eta Theta Iota Kappa Lambda
               Mu Nu Xi Omicron Pi Rho Sigma
               Tau Upsilon Phi Chi Psi Omega/;
my $k=0;
my %greek = map {$_=>$k++} @alpha;

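# Blank out every letter that is already the title of a running xterm.
for (@ps) {
    my ($title) = /^xterm\s+-title\s+([^\s]+)/ or next;
    next unless exists $greek{$title};   # ignore non-Greek titles
    $alpha[$greek{$title}] = '';
}

# The first letter still standing is the next one to use.
my $next;
for (@alpha) {
    if ($_ ne '') {
        $next = $_;
        last;
    }
}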

if (defined $next) {
    open STDERR, '>/dev/null'; #discard xterm's whining
    system("xterm -title $next & ");
} else {
    print STDERR "ERROR: No greek letters free!\n";
}

This required a port from the original Solaris version because the script uses ps to look for running xterms. The Solaris version uses ps -o args -u $user. The command should list (only) the full command + args for every process for the username $user. If you want to use this on another *nix, just test your ps command first and adjust accordingly. You could also change the Greek letters to another finite set, just remember to update the error message.
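One way to avoid keeping two copies of the script would be to pick the ps invocation from $^O, which Perl sets to 'darwin' on OS X and 'solaris' on Solaris. This is only a sketch; the %ps_for table and the ps_command name are mine, and any other platform would need its own entry after testing ps by hand.

```perl
use strict;
use warnings;

# ps invocations taken from the two versions of addterm; %s is the user.
my %ps_for = (
    darwin  => 'ps -o command -U %s',
    solaris => 'ps -o args -u %s',
);

# Hypothetical helper: build the ps command line for a given OS and user.
sub ps_command {
    my ($os, $user) = @_;
    my $template = $ps_for{$os}
        or die "No ps command known for '$os'; test one by hand and add it\n";
    return sprintf $template, $user;
}

# Only touch the system if we recognise the platform.
if (exists $ps_for{$^O}) {
    chomp(my $user = `whoami`);
    print "Would run: ", ps_command($^O, $user), "\n";
}
```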

Of dubious interest is the fact that I used an array to keep the letters in order and a hash to allow quick indexing into the array. I dislike having to store the letters twice, but this seemed the best solution. I have a vague sense that some kind of tied vars may do this more elegantly, but my Perl-fu isn’t quite that strong without cracking the Camel; did I mention I’m lazy? Perhaps tomorrow. Improvements welcomed.
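For what it's worth, one possible way around the double storage (no tied variables required) is to turn it inside out: keep the letters once, in an array, and build a throwaway hash of the titles that are in use. The sketch below is not what addterm does; next_free_title is a made-up name, and it takes the ps output as a list of lines so it can be tried without any xterms running.

```perl
use strict;
use warnings;
use List::Util qw(first);

my @alpha = qw/Alpha Beta Gamma Delta Epsilon
               Zeta Eta Theta Iota Kappa Lambda
               Mu Nu Xi Omicron Pi Rho Sigma
               Tau Upsilon Phi Chi Psi Omega/;

# Hypothetical helper: given an array ref of candidate titles and the
# lines of ps output, return the first title no xterm is using, or
# undef if every title is taken.
sub next_free_title {
    my ($letters, @ps_lines) = @_;
    my %in_use;
    for my $line (@ps_lines) {
        my ($title) = $line =~ /^xterm\s+-title\s+(\S+)/ or next;
        $in_use{$title} = 1;
    }
    return first { !$in_use{$_} } @$letters;
}

my $next = next_free_title(\@alpha, 'xterm -title Alpha', 'xterm -title Gamma');
print defined $next ? "$next\n" : "No Greek letters free!\n";
```

The letters live in one place, and the hash only ever holds what ps reported.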

rtf2html.pl

I recently needed to convert some RTF stored in a database to html (xhtml)… or at least into xhtml fragments that could be wrapped inside a tag. I only needed to support bold, italic, underline, and paragraphs; fonts, page layout, etc. could just get chucked. The result is rtf2html.pl. Be sure to read the disclaimer at the top.

It’s a quick-and-dirty hack. It’s probably too verbose, and misses common Perl idioms. On the plus side, it works (always a plus). If you’re an experienced Perl guru and see anywhere I should have used a standard Perl idiom, please drop a comment. I’m not looking for obfuscation-contest entries, just things I’m doing the hard (or verbose) way.

Prior Art

I’ve been working on an idea for a new plugin for Blosxom. Along the way I’ve learned a few things about prior art and code re-use. The idea for the plugin is simple. Next to the date banner above each day’s set of posts (generated by the ‘date’ flavour component), I’d like to add text denoting if the day is a holiday, observance, etc.

Of course, following that age-old Programmer’s virtue of Laziness, I don’t want to have to maintain the list of dates and observances if I don’t have to. This just screams of the need for prior art… I need to find an existing format for calendar-type data, preferably a format with lots of existing data already, well, formatted and ready for consumption. I’m ashamed to say it took a bit of digging around the web before I came upon the perfect thing… which was sitting in my Mac’s dock the whole time.

Apple’s iCal, a free download for OS X 10.2+ users, features the ability to subscribe to calendars other people publish. There are calendars of movie releases, professional sports schedules, holidays and religious observances from around the globe; you name it. iCalShare has hundreds of calendars freely available. Stands to reason the spec is open, right?

Is it ever. Much to my glee, I find that iCal uses iCalendar, also known as RFC 2445, the Internet Calendaring and Scheduling Core Object Specification (catchy, n’est-ce pas?). Not only that, but other clients exist, such as Mozilla Calendar. Life is good.

However, things are about to veer off course a bit. In looking for a Perl module to read the iCal format (mime type text/calendar, or *.ics), I found a few options. Date::ICal seemed perfect at first, until I found that it only handles iCal’s date/time format (e.g. 20030921T235900) and duration format (e.g. P2DT1H30M). It doesn’t actually parse the files, extract events, etc. Net::iCal, and a host of related files, seem to fit that purpose. However, there are some issues here as well:

  1. Version is 0.15
  2. Listed as ‘PRE-ALPHA’
  3. No activity in about 2 years
  4. More prerequisite modules than I can count

Brief side note: I tried to make my life easy by using the CPAN module to grab the modules I wanted. It’s supposed to make life easy by handling the build process, prerequisites, etc. However, every time I tried to configure and run it, it would beg for files. “Please install Net::FTP quickly!” it would shout. I tried to give it what it wanted, but every time I’d start installing a module, the prerequisite processing would end up trying to build and install perl5.8. Say What?? I’m running 5.6; that’s the latest for OS X from Apple, and Fink doesn’t offer 5.8 either. I’m happy with 5.6 for now. And yet, no matter what I tried, CPAN kept trying to build perl5.8. I followed the prompts for a bit before aborting; it was really going to build it from scratch.

The moral of the side note? I ended up installing each module I needed manually, i.e. perl Makefile.PL, (go get a bunch of prerequisites and install ’em) make, make test, make install. The whole prerequisite experience made for a very recursive exercise. I eventually came to the conclusion that even if the code worked perfectly, the raft of prerequisite modules made it inappropriate for use in a Blosxom plugin.

The Net::iCal family of modules was the product of a project called Reefknot. The list archive was dead for about 6 months, but I took a shot and mailed the dev list, looking for some info on the project’s status and future. I did get a reply, pointing me to datetime.perl.org for current work on Date/Time handling in perl (including iCal formats), and to Net::vFile and related modules for handling ‘vFile’, the meta-format of iCalendar, vCard, etc.

I spent some time playing with Net::vFile. I didn’t do too poorly; admittedly my weak OO Perl skills slowed me down. I eventually got some simple test code almost-working; it appears that iCalendar uses nesting within the vFile format in a way which is not yet fully implemented by vFile.

At this point, I decided to roll my own simple ics file parser. My needs are simple: I just want the start and end dates for ‘events’, as they are called in iCalendar, and the summary (description). Other calendar objects, like to-dos, I can ignore; other event properties, like UID and DTSTAMP, I can likewise ignore. It didn’t take too long to come up with some code to extract a list of holidays from a US Holiday file published by Apple. Well, it did take a while, but only because I’m a numbskull; see the prior post for details. I even tossed back in use of Date::ICal, to parse the date format for me.
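To give a flavour of how little is needed, here is a rough sketch of that kind of parser. It is not the code I actually wrote; parse_events is a name made up for this example, and it cuts corners: it keeps only four properties, throws away property parameters such as VALUE=DATE, and doesn't try to cope with quoted colons or nested components like alarms.

```perl
use strict;
use warnings;

# Hypothetical minimal parser: returns one hash ref per VEVENT holding
# whichever of DTSTART, DTEND, SUMMARY and RRULE the event has.
sub parse_events {
    my ($ics) = @_;
    # Long lines are folded: a line break followed by a space or tab
    # continues the previous line, so glue those back together first.
    $ics =~ s/\r?\n[ \t]//g;

    my (@events, $event);
    for my $line (split /\r?\n/, $ics) {
        if ($line eq 'BEGIN:VEVENT') { $event = {}; next; }
        if ($line eq 'END:VEVENT') {
            push @events, $event if $event;
            undef $event;
            next;
        }
        next unless $event;   # skip to-dos and everything else outside events

        # NAME, optional ;parameters, a colon, then the value.
        my ($name, $value) = $line =~ /^([A-Za-z0-9-]+)(?:;[^:]*)?:(.*)$/
            or next;
        $name = uc $name;
        $event->{$name} = $value
            if $name =~ /^(?:DTSTART|DTEND|SUMMARY|RRULE)$/;
    }
    return @events;
}

# Usage: read a whole .ics file and list what was found.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Can't open $ARGV[0]: $!\n";
    my $ics = do { local $/; <$fh> };
    close $fh;
    for my $ev (parse_events($ics)) {
        my $start   = defined $ev->{DTSTART} ? $ev->{DTSTART} : '?';
        my $summary = defined $ev->{SUMMARY} ? $ev->{SUMMARY} : '(no summary)';
        print "$start  $summary\n";
    }
}
```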

Once I could extract the events I wanted from an ics file, I ran into (yet) another snag: RRULEs. An RRULE is a Recurrence Rule. Most of the Holidays in my file were listed with 2002 dates, and with RRULEs describing how to calculate dates in successive years. Date::ICal doesn’t do RRULEs. A couple of sample RRULEs (paired with the SUMMARY of the event):

SUMMARY:Daylight Saving Time Ends
RRULE:FREQ=YEARLY;INTERVAL=1;BYDAY=-1SU;BYMONTH=10

SUMMARY:Halloween
RRULE:FREQ=YEARLY;INTERVAL=1;BYMONTH=10
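Pulling an RRULE apart is the easy half: it is just NAME=value pairs separated by semicolons. The harder half is acting on each combination. The sketch below handles exactly one case, a yearly rule with BYDAY=-1SU (the last Sunday of the month), using only Time::Local from the core; parse_rrule and last_sunday are names I made up, and every other kind of rule would need its own code.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Split "RRULE:FREQ=YEARLY;BYDAY=-1SU" into a hash of its parts.
sub parse_rrule {
    my ($rrule) = @_;
    $rrule =~ s/^RRULE://;
    my %rule = map { split /=/, $_, 2 } split /;/, $rrule;
    return \%rule;
}

# Day of the month of the last Sunday in $month (1-12) of $year.
sub last_sunday {
    my ($year, $month) = @_;
    # Noon on the first of the following month, minus one day, lands on
    # the last day of $month.
    my ($next_y, $next_m) = $month == 12 ? ($year + 1, 1) : ($year, $month + 1);
    my $last = timegm(0, 0, 12, 1, $next_m - 1, $next_y) - 24 * 60 * 60;
    my ($mday, $wday) = (gmtime $last)[3, 6];   # wday 0 is Sunday
    return $mday - $wday;
}

my $rule = parse_rrule('RRULE:FREQ=YEARLY;INTERVAL=1;BYDAY=-1SU;BYMONTH=10');
if ($rule->{FREQ} eq 'YEARLY' && $rule->{BYDAY} eq '-1SU') {
    for my $year (2002 .. 2005) {
        printf "%04d-%02d-%02d\n",
            $year, $rule->{BYMONTH}, last_sunday($year, $rule->{BYMONTH});
    }
}
```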

These things aren’t rocket science, but there’s enough variation that I’d prefer to use a library (read: code I don’t have to write). I poked around datetime.perl.org, and found that DateTime::iCal will not only read ical-formatted date strings, it will also handle RRULEs, creating a DateTime::Set. Of course, I’ll also need the original DateTime module. Each of these has a few sub-modules. I’m nervous now; again, this is for a Blosxom plugin and so should have minimal dependencies. Throwing caution to the wind, I grab all three downloads and begin to install. My first try, DateTime, stops me with no fewer than 4 dependencies that I don’t have. Two of these are part of the DateTime family, but there is no bundle available yet.

So now I’m back on familiar turf. There are modules to do what I need, but the number of prerequisite modules is becoming prohibitive. But what are prerequisite modules? The use of prior art.

I’m all for code reuse. Really I am. But it seems that in Perl these days, deciding to use just one module that doesn’t ship with Perl can mean a landslide of prerequisites. For use in a hosted environment such as many of us use for Blosxom, this can be a real problem.

For now, I’m undecided. I hate to keep reinventing wheels; but on the other hand I hate to keep installing high-performance racing wheels with custom rims on a Yugo. The DateTime family of modules looks very well thought out, and very well implemented, especially given its youth. Hopefully, an install bundle will come along soon. But, just as with the Net::iCal family of modules (and all of its prerequisites), there’s a whole lot of functionality I just don’t need. I feel like I’ve gone through a lot of options just to implement a small bit of functionality. Maybe this is the curse of prior art… specs tend to be big, even when what you need is only a small part of one.

Abort

I’ve categorized this as Programming::Perl but I could just as easily have put it under Mistakes::Beginner::Really Stupid.

I was experimenting with a bit of Perl code to process a text file. The whole thing was only maybe 25 lines or so. Every time I’d run the code, the only output I’d get was a single line:

Abort.

I scratched my head and perused my code for quite a while. I didn’t see any place I was explicitly causing this, and it seemed like the most cryptic error I’d seen. A cursory search of perldoc.com and Perl Monks failed to shed any light. I assumed I must be abusing my I/O in some fashion (I am still a rookie in Perldom).

My Perl debugging knowledge is even more limited than my Perl knowledge. At work I’ve used a TK-based debugger (something like ptkdbg, can’t remember), which is GUI-fied and fairly straightforward, but which I don’t seem to have on my Powerbook (note to self: find out what that is, and find a copy for home). Falling back on that most time-honored of debugging methods, the manual trace (read: liberal (ab)use of print statements), I found that my I/O was fine. The abort occurred at the very end, when I tried to output (via Data::Dumper) the data structure I’d built from the file. Those of you with more Perl-fu than I will see the problem immediately:

dump(\%ev);

I didn’t see it, so I searched perldoc.com for dump(). And then I learned. What I should have done was this:

print Dumper(\%ev);

This would have pretty-printed my hash full of hashes in all its nested glory. The call to dump(), on the other hand, asks Perl to immediately dump core and abort. Perl obliged.
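For the record, a complete working version looks something like this. The contents of %ev are a made-up stand-in for the hash I built from the file; the Sortkeys setting is optional and only there to keep the output in a stable order.

```perl
use strict;
use warnings;
use Data::Dumper;   # exports Dumper(), which is the function I wanted

# Hypothetical stand-in for the hash of hashes built from the file.
my %ev = (
    halloween => { summary => 'Halloween', start => '20021031' },
    dst_ends  => { summary => 'Daylight Saving Time Ends', start => '20021027' },
);

$Data::Dumper::Sortkeys = 1;   # sort hash keys so runs are comparable
print Dumper(\%ev);            # Dumper returns a string; print shows it
```

Note that Dumper() only builds the string; forgetting the print gets you no output at all, which is a much quieter mistake than dump().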

At least now I know what “Abort.” in my perl output means.