Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Perl Journal magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Perl Journal Column 12 (May 2004)
[Suggested title: ``Eight million ways to die'']
What is that old saying? ``The best-laid plans of mice and men...'' And like that old saying, sometimes your programs don't go the way you expect.
For example, a user might not enter a number between one and five, even though your prompt carefully suggests that they do. Or maybe the file you expected to create in that directory can't be created. Or the database connection fails to connect (and was it because the system was down, or because you were given a bad password?). Or the module you needed for a particular part of the application failed to load (maybe because it was never installed).
Usually, when things go wrong, you'll want to know about it, and do something else in your program in response. For example, consider a program that updates a data file, incrementing a value:
  open OLD, "counter";
  open NEW, ">counter.tmp";
  print NEW <OLD> + 1;
  close OLD;
  close NEW;
  rename "counter.tmp", "counter";
Note that we have six lines of code, any of which could fail. Let's take the simplest ones first. If the first open fails, we'll be using a closed filehandle in the third line, which will look like a 0 to the ``add 1'' operation, and we'll get a ``1'' value in the final file. Now, this might actually make sense for this application: the first invocation of the program yields a 1 value. But, if we have warnings enabled, we'll get a warning when we attempt to read from a closed filehandle on the third line, because that's generally considered bad style, if not a more serious error. We could notice the return value from that open, and rewrite the code like this:
  my $old_value = 0;
  if (open OLD, "counter") {
    $old_value = <OLD>;
    close OLD;
  }
  open NEW, ">counter.tmp";
  print NEW $old_value + 1;
  close NEW;
  rename "counter.tmp", "counter";
And this does eliminate the extraneous warning: we now take an alternate execution path if the file is not initially present, thus nicely sidestepping the read from a closed filehandle.
But what if the open failure is from something more serious than ``file not found''? My Unix open(2) manpage lists about a dozen different reasons for a failure, including esoteric things like ``a symbolic link loops back onto itself''. How do we distinguish those? The error variable $! starts to look pretty interesting. For example, we can distinguish between ``good'', ``file not found'', and ``everything else'' with a three-way branch:
  if (open OLD, "counter") {
    # good
  } elsif ($! =~ /file.*not found/) {
    # not found, default to 0
  } else {
    # everything else
  }
Because I'm using $! in a string context, I get to see the string-ish error message. This is fairly operating-system specific, but if you're not trying to be portable across a wide variety of systems, you can get away with such matches.
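As a small sketch of $! in its two contexts: in string context we get the human-readable message, and in numeric context we get the raw errno value. (Note the portability warning above: on a typical Linux box, the message for a missing file is actually ``No such file or directory'', not ``file not found''. The path below is purely illustrative.)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Force a failure by opening a file that shouldn't exist,
# then examine $! both ways.
if (!open(OLD, "/no/such/counter")) {
    my $message = "$!";    # string context: the human-readable text
    my $errno   = $! + 0;  # numeric context: the raw errno value
    print "message: $message\n";
    print "errno: $errno\n";
}
```

The numeric form is handy for comparing against the symbolic constants from the POSIX or Errno modules, rather than matching message text.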
Note that I'm testing $! only when I've had a failure, and immediately afterward. This is the only time I can be sure that there's really an error in there, because although an operating system request failure sets $!, nothing normally resets it. Thus, this code is broken:
  ## BAD CODE DO NOT USE
  open OLD, "counter";
  if ($! =~ /file.*not found/) {
    # file not found ..
  } elsif ($!) {
    # other error ..
  } else {
    # everything OK ..
  }
We're not necessarily testing the failed open call here; any prior failed call might give us a false positive.
But now, we need to decide what to do if we get that unexpected error. There's an old joke amongst programmers: ``don't test for anything you aren't willing to handle'', but we can no longer plead ignorance here.
The most common solution is to abort the entire program, and let the sysadmin on duty take care of it, and that's easy with die. Let's redesign our program so that a missing counter file is considered a bad, bad thing, and abort the program if that first open fails:
  unless (open OLD, "counter") {
    die "Cannot open counter";
  }
In this case, a false return value (for any reason) from open triggers the die, which aborts our program immediately. The error message is sent to STDERR (rather than STDOUT) to ensure that the message is not lost in a typical redirection to a file or pipe. In addition, the filename and the line number are automatically appended to the message, unless the message string ends in a newline. This helps us find the source of the die amongst many modules and files.
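We can see both behaviors by trapping die with eval (more on eval shortly) and printing what landed in $@:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Without a trailing newline, Perl appends " at FILE line NNN."
eval { die "no trailing newline" };
print $@;

# With a trailing newline, the message is used verbatim.
eval { die "trailing newline\n" };
print $@;
```

The first message comes out with the file and line appended; the second comes out exactly as written.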
Note that the error message contains the attempted operation as well. Again, this helps the debugging a bit more than the cryptic ``died at line 14'' of the default message. This is especially handy when the filename for the operation might have come from another source:
  chomp(my $filename = <SOMEOTHERFILE>);
  unless (open OLD, $filename) {
    die "Cannot open $filename";
  }
Before making it a habit to include such information in my die messages, I was occasionally confused about why my program was failing, because I had presumed that a variable contained something other than what it actually did. Always echo the input parameters in the error message!
Another thing to include is that $! I mentioned earlier. That can help us figure out what kind of failure occurred:
  unless (open OLD, "counter") {
    die "Cannot open counter: $!";
  }
And finally, this is too much typing. The or operator executes its right operand only when the left operand is false, so we can shorten this to the traditional:
  open OLD, "counter" or die "Cannot open counter: $!";
So, to fully instrument my original program, I could add or die ... to each of the steps that might fail:
  open OLD, "counter" or die;
  open NEW, ">counter.tmp" or die;
  print NEW <OLD> + 1 or die;
  close OLD or die;
  close NEW or die;
  rename "counter.tmp", "counter" or die;
Wait a second: why am I checking the return value from print? And from close? Those can't fail, can they? Certainly they can, although this is probably one of the few times you'll see any program that tests for them. The print can fail if the filehandle is closed, or if there's an I/O error, like a disk being full. And the close can fail if the filehandle is closed, or if the final buffer being flushed at the time of the close couldn't be written (again, typically from a full disk).
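A full disk is hard to arrange on demand, but the closed-filehandle case is easy to demonstrate, and shows that print really does return false on failure. The temporary filename here is just for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Create and close a scratch file, then try to print to the
# now-closed filehandle and check the return value.
my $tmp = "counter.demo.$$";
open my $fh, ">", $tmp or die "Cannot create $tmp: $!";
close $fh or die "Cannot close $tmp: $!";

my $ok;
{
    no warnings 'closed';          # we want the failure, not the warning
    $ok = print $fh "one more\n";  # filehandle is closed: this fails
}
print "print returned ", ($ok ? "true" : "false"), "\n";
unlink $tmp;
```
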
This seems like a lot of typing. Can we reduce this? Sure, with the Fatal module, part of the Perl core for recent versions of Perl. We simply list the subroutines that should have an automatic or die added, and away we go:
  use Fatal qw(open close rename);
  open OLD, "counter";
  open NEW, ">counter.tmp";
  print NEW <OLD> + 1;
  close OLD;
  close NEW;
  rename "counter.tmp", "counter";
Now we have (nearly) the same program with a lot less typing. The downside to this approach is that we don't really get to say what the error message is, other than the default Died. To get a bit more control, I could add :void to that argument list, and then any of those calls that have an explicit test of the return value will no longer be fatal:
  use Fatal qw(:void open close rename);
  open OLD, "counter" or warn "old value unavailable, presuming 0\n";
  open NEW, ">counter.tmp";
  print NEW <OLD> + 1;
  close OLD or "ignore";
  close NEW;
  rename "counter.tmp", "counter";
Why didn't I list print here? Well, Fatal uses some magic behind the scenes, and print resists this magic. Oops. We'll have to do that one by hand.
The die operator is fatal to the program, unless it is enclosed within an eval block (or intercepted by a __DIE__ handler, but I digress). Once safely within the eval block, any die aborts the block, not the program. Immediately following the block, we check the $@ variable, which is guaranteed to be empty if the block executed to completion, or else contains the text message that would have been sent to STDERR had we otherwise aborted. Time for an example:
  use Fatal qw(:void open close rename);

  for my $file (qw(counter1 counter2 counter3)) {
    eval {
      open OLD, "$file" or warn "old value unavailable, presuming 0\n";
      open NEW, ">$file.tmp";
      print NEW <OLD> + 1;
      close OLD or "ignore";
      close NEW;
      rename "$file.tmp", "$file";
    };
    if ($@) {
      print "ignored error on $file (continuing): $@";
    }
  }
Here, I've put the previous code inside the eval block, using $file in place of the literal filenames. If any of the steps within the eval block fail, we skip immediately to the end of the block. The message ends up in $@. If the message is present, we note it on STDOUT. Whether there was an error or not, we continue the loop.
Now suppose we conclude that any permission denied message inside the eval block is likely to mean we're not going to get much further on the rest of the program. We can take different actions based on the value within $@. For example:
  use Fatal qw(:void open close rename);

  for my $file (qw(counter1 counter2 counter3)) {
    eval {
      open OLD, "$file" or warn "old value unavailable, presuming 0\n";
      open NEW, ">$file.tmp";
      print NEW <OLD> + 1;
      close OLD or "ignore";
      close NEW;
      rename "$file.tmp", "$file";
    };
    if ($@ =~ /permission denied/i) {
      die $@; # rethrow $@
    } elsif ($@) {
      print "ignored error on $file (continuing): $@";
    }
  }
If the message in $@ after the eval block matches permission denied, we rethrow the error. In this case, there's no outer eval block, so the program aborts. However, had there been an outer eval block, we'd simply pop out one more level. In turn, that outer block could handle the error, or rethrow it again to the next level (if any), and so on.
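That popping-out behavior is easy to see with two nested eval blocks. In this sketch, the inner handler deals with errors it recognizes and rethrows everything else to the outer level:

```perl
#!/usr/bin/perl
use strict;
use warnings;

eval {
    eval {
        die "permission denied on counter\n";
    };
    if ($@ =~ /file.*not found/i) {
        warn "handled at the inner level: $@";
    } elsif ($@) {
        die $@;    # rethrow: pop out to the outer eval block
    }
};
print "outer level caught: $@" if $@;
```

The outer block could in turn handle the error, or rethrow it yet again if there were still more levels above it.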
Matching the specific text of error messages can be a bit problematic, especially when you have to change the text for internationalization of your program. Fortunately, modern versions of Perl permit the die parameter to be an object, not just a text message. When an object value is thrown with die, the $@ value contains that object as well. Not only does this let us pass structured data up the exception-handling logic, we can also create hierarchies of error classifications to quickly sort entire groups of errors apart.
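Before bringing in a full framework, here's a minimal sketch of the underlying mechanism: die with a plain hash reference (the code and reason keys are invented for illustration), then examine $@ with ref:

```perl
#!/usr/bin/perl
use strict;
use warnings;

eval {
    die { code => 13, reason => "permission denied" };
};
if (ref $@ eq "HASH") {
    my $err = $@;   # copy $@ before doing anything that might reset it
    print "error $err->{code}: $err->{reason}\n";
}
```

No message-text matching needed; the handler looks at structured data instead.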
The best framework I've seen for creating such error categories is Exception::Class, found in the CPAN. Let's restructure our program to use exception objects rather than text testing:
  use Exception::Class
    (
     E => { description => "my base error class" },
     E::User => { description => "user-related errors", isa => qw(E) },
     E::File => { description => "file-related errors", isa => qw(E) },
     E::Open => { description => "cannot open", isa => qw(E::File) },
     E::Create => { description => "cannot create", isa => qw(E::File) },
     E::Rename => { description => "cannot rename", isa => qw(E::File) },
     E::IO => { description => "other IO", isa => qw(E::File) },
    );

  for my $name (@ARGV) {
    eval {
      $name =~ /^\w+$/ or E::User->throw("bad file name for $name");
      open IN, $name or E::Open->throw("reading $name");
      open OUT, ">$name.tmp" or E::Create->throw("creating $name.tmp");
      print OUT <IN> + 1 or E::IO->throw("writing $name.tmp");
      close IN or E::IO->throw("closing $name");
      close OUT or E::IO->throw("closing $name.tmp");
      rename "$name.tmp", $name
        or E::Rename->throw("renaming $name.tmp to $name");
    };
    if (UNIVERSAL::isa($@, "E")) { # an object error from my tree
      if ($@->isa("E::User")) {
        warn "Pilot error: $@"; # warn and continue
      } elsif ($@->isa("E::Create")) {
        $@->rethrow; # same as die $@
      } elsif ($@->isa("E::File")) { # other IO errors
        warn "File error: $@: $!";
      } else {
        warn "Uncategorized error: $@"; # warn and continue
      }
    } elsif ($@) { # a legacy die error
      die $@; # abort (possibly caught by outer eval)
    }
    # else everything went OK
  }
The first lines (invoking Exception::Class with parameters) create a hierarchy of classes, starting with my E class (selected because the name is short). From E, I break errors into two categories: user-related errors and file-related errors. File-related errors are further categorized into various file operations. The isa parameter defines the base class for each derived class, permitting the use of normal isa tests for quick categorization.
Now, inside the eval, instead of a simple die, I use the throw method of an appropriate error class, with a specific error message. I won't need to include $! here, because I'll know that every error in the E::File category was system-call related, and I can put that just once in the error handler.
Finally, the error-handling logic just past the end of the eval block is also changed. If $@ is an object derived from my E class, then I sort out what kind of error it might be. Note that I've chosen to handle all E::Create errors as relatively ``fatal'' to my loop (although they might in turn be caught by some outer eval block not shown here). User errors are distinguished from E::File errors, with the latter displaying the $! value automatically. Also note that any ``legacy'' errors (from an ordinary die, or maybe a reference or object not within my hierarchy of classes) simply get rethrown as well.
This framework is actually quite flexible, permitting additional structured attributes to be carried along in the error object, as well as having objects inherit from multiple class hierarchies to distinguish multiple traits (file versus database, fatal versus recoverable, and so on). If you're building a complex application, you should definitely look into using Exception::Class, or something similar. Until next time, enjoy!