Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Linux Magazine Column 98 (Oct 2007)
[suggested title: ``Many happy returns with Contextual::Return'']
In this column two months ago, I mentioned the Contextual::Return
module to create a dual variable:
    use Contextual::Return;
    my $result = NUM { 13 } STR { "Permission Denied" };
    if ($result == 13) { ... }        # true
    if ($result =~ /denied/i) { ... } # also true!
Such a result is fascinating to me on a couple of levels. First, I had not put together all the pieces of how this kind of a value was possible using relatively straightforward Perl. Second, the syntax for specifying the behavior is rather remarkable at first glance, and yet understandable once I spent a bit of time pawing through the code.
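For comparison, core Perl can already build a simpler, fixed dual value with dualvar from Scalar::Util. This is only a sketch of the general idea, not how Contextual::Return does it: here both values are computed up front rather than supplied as blocks.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# One scalar carrying both a numeric and a string value.
my $result = dualvar(13, "Permission Denied");

print "numeric match\n" if $result == 13;
print "string match\n"  if $result =~ /denied/i;
```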
But before we drill down into the implementation, let's back up a
minute, and look at the kinds of problems that Contextual::Return
was designed to solve.
Many built-in Perl functions have related-but-distinct return values
when invoked in either a scalar context or a list context. For
example, the grep function returns a count of successful items in a
scalar context, but the items themselves in a list context. And
localtime returns a time string in a scalar context, or the
elements that make up the time in a list context. I can write Perl
subroutines that emulate this behavior by paying attention to
wantarray (which really should be called ``wantlist''):
    sub my_funky_func {
      ...;
      return @some_list if wantarray; # list context
      return $some_scalar if defined wantarray and not wantarray; # scalar context
      ## "not defined wantarray" is "void context", don't return anything
      print "funky_func is done!\n"
    }
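To see the three states of wantarray in isolation, here is a small self-contained sketch. The subroutine name note_context and the @seen array are made up for illustration.

```perl
use strict;
use warnings;

my @seen;   # contexts observed, in call order

# wantarray is true in list context, false-but-defined in scalar
# context, and undef in void context.
sub note_context {
    my $want = wantarray;
    push @seen, $want ? 'list' : defined $want ? 'scalar' : 'void';
    return;
}

my @list   = note_context();   # list context
my $scalar = note_context();   # scalar context
note_context();                # void context

print "@seen\n";               # list scalar void
```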
The trouble is that the tri-state return of wantarray is a bit
obscure, so a quick use of Contextual::Return provides names for
these three states:
    use Contextual::Return;

    sub my_funky_func {
      ...;
      return @some_list if LIST;
      return $some_scalar if SCALAR;
      # VOID would be true here
      print "funky_func is done!\n";
    }
But the module does so much more. With a bit of syntax magic, we can create the multiway branch automatically:
    sub my_funky_func {
      ...;
      return
        LIST { @some_list }
        SCALAR { $some_scalar }
        VOID { print "funky_func is done!\n" }
      ;
    }
It looks like we have invented some entirely new syntax here, but in
fact, it's really just a matter of properly prototyped subroutines. A
subroutine that is prototyped as ;&$ will accept an optional block
of code, and another optional argument after that:
    sub versive (;&$) { ... }
    versive;                               # no args
    versive { some code block };           # code ref
    versive sub { some code block };       # same thing
    versive { some code block } $scalar;   # code ref + scalar
The Contextual::Return module uses such prototyped subroutines in a
nested way. In that earlier code, the VOID subroutine is evaluated
first, and returns an object that the SCALAR subroutine sees as its
second argument. Each subroutine in the list modifies the return
object so that it works appropriately with every new attribute, and
then returns the updated object. Very cool.
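The nesting is easier to believe once you watch it run. The following is a toy sketch in plain Perl, not the module's code: FIRST and SECOND are hypothetical specifiers I made up, each prototyped as ;&$, and each simply adds its block to a hash of handlers passed along the chain.

```perl
use strict;
use warnings;

my @order;   # records the order in which the subs actually run

# Each toy specifier takes an optional block and an optional
# "rest of the chain" hashref, and returns a hashref with its block added.
sub FIRST (;&$) {
    my ($block, $chain) = @_;
    push @order, 'FIRST';
    my %merged = (%{ $chain || {} }, FIRST => $block);
    return \%merged;
}

sub SECOND (;&$) {
    my ($block, $chain) = @_;
    push @order, 'SECOND';
    my %merged = (%{ $chain || {} }, SECOND => $block);
    return \%merged;
}

# Parsed as FIRST(sub { "one" }, SECOND(sub { "two" }))
my $handlers = FIRST { "one" } SECOND { "two" };

print "call order: @order\n";          # call order: SECOND FIRST
for my $name (sort keys %$handlers) {
    print "$name => ", $handlers->{$name}->(), "\n";
}
```

The rightmost specifier runs first, and its return value arrives as the second argument of the one to its left, which is the same shape as the LIST/SCALAR/VOID chain.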
At least, that explains how the nesting works. But how does the right
value get selected? Well, even after staring at the code for a while,
all I can say is my head hurts. It has something to do with scalar
context being passed down to every subsequent element of the chain,
and in those cases, a smart object is returned that does the right
thing in the right context. Only the head of the chain might be in
list or void context, and the smart object is evaluated in the proper
fashion. If the head of the chain is evaluated in a scalar context,
then the smart object (of type Contextual::Return::Value) itself is
returned, to ``discover'' in the caller exactly how it is needed
(boolean, number, string, or some kind of reference, as we'll see shortly).
The scalar return value is unique for an additional reason: the code
block associated with SCALAR isn't executed until it is needed.
This creates a lazy invocation. For example:
my $x = SCALAR { print "executed\n"; 3 };
The block with print isn't executed immediately. In fact, the $x
value can be passed around the program at will, as long as it is
never needed as a boolean, numeric, string, or reference value. But
once it is, the block is executed, and the return value is cached.
As a reminder of this, we can use LAZY in place of SCALAR with
no change in meaning.
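One plausible way to get this kind of lazy, cached value in plain Perl is an object with overloaded conversions. The sketch below is my own toy (the Toy::Lazy package and its _force method are hypothetical names), and I am not claiming it mirrors the module's internals.

```perl
use strict;
use warnings;

package Toy::Lazy;   # hypothetical class, for illustration only

use overload
    '""'     => sub { $_[0]->_force },           # string use
    '0+'     => sub { $_[0]->_force },           # numeric use
    'bool'   => sub { $_[0]->_force ? 1 : 0 },   # boolean use
    fallback => 1;

sub new {
    my ($class, $code) = @_;
    return bless { code => $code }, $class;
}

# Run the block on first use only, then reuse the cached value.
sub _force {
    my $self = shift;
    $self->{value} = $self->{code}->() unless exists $self->{value};
    return $self->{value};
}

package main;

my $runs = 0;
my $x    = Toy::Lazy->new(sub { $runs++; 3 });
my $copy = $x;                         # passing it around runs nothing
print "runs so far: $runs\n";          # runs so far: 0

my $sum = $x + 1;                      # first real use runs the block
my $str = "$x";                        # second use hits the cache
print "sum=$sum str=$str runs=$runs\n";   # sum=4 str=3 runs=1
```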
We can choose to avoid the caching of the value by flagging the
result as ACTIVE:
my $now = ACTIVE SCALAR { localtime };
Each time $now is evaluated in a scalar context, the block is
re-executed, returning a different time string. This can be used
for logging, for example:
warn "$now: approaching memory limits";
Although the LIST and VOID values are executed immediately
during evaluation, any scalar-context usage can further be
distinguished into separate types. For example, we can return
separate values depending on whether the scalar is used
in a boolean, numeric, or string context:
    my $x = BOOL { 0 } NUM { 3.14 } STR { "pi" };
    unless ($x) { warn "it is false!" } # executed!
    print "The value is $x\n";          # The value is pi\n
    my $y = $x + 0;                     # $y = 3.14
All three of these examples can be executed in sequence, because the
value in $x remains an object even after being examined. Each
block is executed lazily, however, caching the result of that type of
return value. I can disable the caching with ACTIVE as before,
causing the block to be repeatedly executed. For example:
my $now = ACTIVE NUM { time } STR { localtime };
Now, each time I use $now in a string context, it's the current
interpreted localtime string, but when I use it in a numeric context,
it's the Unix epoch time value. Cool.
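The per-type, re-executed behavior can be imitated with separate overload handlers that call their block on every use. Again, this is a toy of my own (Toy::Active is a hypothetical name) and not the module's implementation. I use a counter instead of the clock so the result is predictable.

```perl
use strict;
use warnings;

package Toy::Active;   # hypothetical class, for illustration only

use overload
    '""'     => sub { $_[0]{str}->() },   # re-run the string block each time
    '0+'     => sub { $_[0]{num}->() },   # re-run the numeric block each time
    fallback => 1;

sub new {
    my ($class, %handlers) = @_;
    return bless { %handlers }, $class;
}

package main;

my $n    = 0;
my $tick = Toy::Active->new(
    num => sub { ++$n },
    str => sub { "tick #" . ++$n },
);

my $first  = $tick + 0;    # numeric context: runs the num block
my $label  = "$tick";      # string context: runs the str block
my $second = $tick + 0;    # numeric again: nothing was cached
print "$first / $label / $second\n";   # 1 / tick #2 / 3
```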
If I'd rather ``lock in'' the type, so it's not some sort of Schroedinger's
Cat that may or may not be a numeric value, and collapses instead to the
first observation, I can add FIXED:
my $x = FIXED BOOL { 0 } NUM { 3.14 } STR { "pi" };
Now, the moment I use it in one of the three contexts, the other two contexts are revectored to the new value. For example:
my $three = FIXED BOOL { 3 } STR { "three" };
If I use this first in a boolean context, it locks in as 3.
Otherwise, it locks in as three. And by ``lock in'', I mean that it
no longer has any magical properties... in the Harry Potter universe,
it is now a muggle.
Besides boolean, numeric, and string contexts, we can also distinguish between various types of reference contexts. For example, I could create a value that acts appropriately whether I'm dereferencing it as a hashref or arrayref. Consider a file-stat subroutine that returns an array ref of stat values in an arrayref context, or can be used as a hashref to pick out stat items by name:
    my @STAT = qw(dev ino mode nlink uid gid rdev size
                  atime mtime ctime blksize blocks);
    sub statify {
      my @stat = stat(shift) or return FAIL;
      return
        ARRAYREF { \@stat }
        HASHREF { my %stat; @stat{@STAT} = @stat; \%stat; }
      ;
    }
Now I can call this as:
    my $passwd = statify("/etc/passwd");
    my $n1 = $passwd->{size}; # select size via hashref
    my $n2 = $passwd->[7];    # select size via arrayref
And both work equally well. (If I had included FIXED, the second
one would have failed, as the first use would have already locked in
the hashref nature of the return value.)
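A value that answers to both kinds of dereference can be sketched in plain Perl by overloading @{} and %{}. The Toy::Record class below is hypothetical and uses made-up numbers rather than a real stat call. The object is a blessed reference to a scalar, so that the overloaded hash dereference never has to call itself.

```perl
use strict;
use warnings;

package Toy::Record;   # hypothetical class, for illustration only

use overload
    '@{}' => sub { ${ $_[0] }->{values} },
    '%{}' => sub {
        my $data = ${ $_[0] };
        my %by_name;
        @by_name{ @{ $data->{names} } } = @{ $data->{values} };
        return \%by_name;
    },
    fallback => 1;

sub new {
    my ($class, $names, $values) = @_;
    my $data = { names => $names, values => $values };
    return bless \$data, $class;   # blessed ref to a scalar holding the data
}

package main;

my $rec = Toy::Record->new([qw(dev ino mode)], [10, 20, 30]);
print $rec->{ino}, " ", $rec->[1], "\n";   # 20 20
```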
The FAIL line illustrates another feature of a
Contextual::Return-enhanced subroutine. If the FAIL result is used in a
boolean context, it acts as a false value (undef). However, if it is
used in any other context, an exception is thrown. We can
enhance the message of the exception with a parameter to FAIL:
my @stat = stat(shift) or return FAIL { "stat failed: $!" };
This default behavior is heavily configurable: see the manpage for more details.
What if we had invoked statify in a list context, even though there's
no LIST block? In a nice interpretation of ``do what I mean'', the
ARRAYREF value is automatically dereferenced for us:
    my @full_stat = statify("/etc/passwd");
    print $full_stat[7];
In fact, there are many automatic defaults for the various types
of return values. I could have just as easily spelled out the LIST
return:
    my @STAT = qw(dev ino mode nlink uid gid rdev size
                  atime mtime ctime blksize blocks);
    sub statify {
      my @stat = stat(shift) or return FAIL;
      return
        LIST { @stat }
        HASHREF { my %stat; @stat{@STAT} = @stat; \%stat; }
      ;
    }
This will result in an arrayref of the LIST return when used
in an arrayref context. The full list of return values includes:
DEFAULT VOID NONVOID LIST SCALAR VALUE STR NUM BOOL REF SCALARREF ARRAYREF CODEREF HASHREF GLOBREF OBJREF
This list is nested from the most general to the most specific, although you should consult the manpage for some interesting exceptions. The general routines will be used when a more specific routine is not available.
One problem that I've run into from time to time that is solved easily
by Contextual::Return is how to perform an evaluation in the
context of the caller, but still execute code after that evaluation.
If a RESULT block is present, it overrides the normal ``last
expression evaluated'' logic for subroutine return values, and also is
executed in the context of the caller.
For example, suppose we want to execute a coderef as a transaction against a database handle:
    sub perform {
      my $dbh = shift;
      my $code = shift;

      return
        VALUE {
          $dbh->start_work;
          RESULT { $code->() };
          $dbh->commit;
        };
    }
In this case, the VALUE block is selected regardless of scalar or
list context. The RESULT block is executed in the context of the
caller, and sets the result value. If the result block throws an
exception, the commit is skipped. Simple, and elegant. But not quite
complete: we should probably revert the transaction. The presence
of a RECOVER block causes all other blocks to act as if they had an
eval wrapper. If any exception occurs, the $@ variable is set,
and we end up in the RECOVER block, which can force a RESULT
if needed:
    sub perform {
      my $dbh = shift;
      my $code = shift;

      return
        VALUE {
          $dbh->start_work;
          $code->();
        }
        RECOVER {
          if ($@) {           # exception
            $dbh->revert;     # roll back
            RESULT { FAIL };  # exception
          } else {
            $dbh->commit;     # all good!
          }
        };
    }
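To appreciate what the module saves, here is roughly what the same idea looks like by hand in core Perl. This is only a sketch under my own assumptions: Toy::Handle is a hypothetical stand-in that merely logs calls, its method names (begin_work, commit, rollback) are borrowed from DBI's conventions, and on failure this version re-throws the exception rather than returning a FAIL value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database handle: it only logs the calls made.
package Toy::Handle;
sub new        { return bless { log => [] }, shift }
sub begin_work { push @{ $_[0]{log} }, 'begin' }
sub commit     { push @{ $_[0]{log} }, 'commit' }
sub rollback   { push @{ $_[0]{log} }, 'rollback' }

package main;

sub perform_by_hand {
    my ($dbh, $code) = @_;
    my $want = wantarray;          # remember the caller's context
    my @result;
    $dbh->begin_work;
    my $ok = eval {
        if    ($want)         { @result    = $code->() }   # list context
        elsif (defined $want) { $result[0] = $code->() }   # scalar context
        else                  { $code->() }                # void context
        1;
    };
    if (!$ok) {
        my $error = $@;
        $dbh->rollback;
        die $error;                # re-throw after rolling back
    }
    $dbh->commit;
    return $want ? @result : $result[0];
}

my $dbh      = Toy::Handle->new;
my @rows     = perform_by_hand($dbh, sub { wantarray ? (1, 2, 3) : "scalar" });
my $survived = eval { perform_by_hand($dbh, sub { die "boom\n" }); 1 };

print "rows: @rows\n";                 # rows: 1 2 3
print "log: @{ $dbh->{log} }\n";       # log: begin commit begin rollback
```

All of the context bookkeeping in the middle is what the VALUE, RESULT, and RECOVER blocks let me leave out.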
I didn't have room to talk about the LVALUE support here, but hopefully
I've whetted your appetite enough for you to consider Contextual::Return
on your next tricky return value problem. Until next time, enjoy!