Copyright Notice

This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Linux Magazine Column 98 (Oct 2007)

[suggested title: ``Many happy returns with Contextual::Return'']

In this column two months ago, I mentioned using the Contextual::Return module to create a dual variable:

  use Contextual::Return;
  my $result = NUM { 13 } STR { "Permission Denied" };
  if ($result == 13) { ... } # true
  if ($result =~ /denied/i) { ... } # also true!

Such a result is fascinating to me on a couple of levels. First, I had not put together all the pieces of how this kind of a value was possible using relatively straightforward Perl. Second, the syntax for specifying the behavior is rather remarkable at first glance, and yet understandable once I spent a bit of time pawing through the code.

But before we drill down into the implementation, let's back up a minute, and look at the kinds of problems that Contextual::Return was designed to solve.

Many built-in Perl functions have related-but-distinct return values depending on whether they are invoked in a scalar context or a list context. For example, the grep function returns a count of successful items in a scalar context, but the items themselves in a list context. And localtime returns a time string in a scalar context, or the elements that make up the time in a list context. I can write Perl subroutines that emulate this behavior by paying attention to wantarray (which really should be called ``wantlist''):

  sub my_funky_func {
    ...;
    return @some_list if wantarray; # list context
    return $some_scalar
      if defined wantarray and not wantarray; # scalar context
    ## "not defined wantarray" is "void context", don't return anything
    print "funky_func is done!\n";
  }
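
To see the three states in action, here is a minimal runnable sketch using only core Perl. The subroutine name note_context and the @seen array are made up for illustration; only wantarray itself comes from the discussion above.

```perl
use strict;
use warnings;

my @seen;    # collects the name of each context we were called in

# Hypothetical helper: records the caller's context, then returns
# something appropriate for it.
sub note_context {
    my $want = wantarray;
    my $name = $want         ? 'list'      # true: list context
             : defined $want ? 'scalar'    # false but defined: scalar context
             :                 'void';     # undef: void context
    push @seen, $name;
    return $want ? (1, 2, 3) : 'just one';
}

my @items = note_context();    # list context
my $item  = note_context();    # scalar context
note_context();                # void context

print "@seen\n";    # prints "list scalar void"
```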

The trouble is that the tri-state return of wantarray is a bit obscure, so a quick use of Contextual::Return provides names for these three states:

  use Contextual::Return;

  sub my_funky_func {
    ...;
    return @some_list if LIST;
    return $some_scalar if SCALAR;
    # VOID would be true here
    print "funky_func is done!\n";    
  }

But the module does so much more. With a bit of syntax magic, we can create the multiway branch automatically:

  sub my_funky_func {
    ...;
    return
      LIST { @some_list }
      SCALAR { $some_scalar }
      VOID { print "funky_func is done!\n" }
    ;
  }

It looks like we have invented some entirely new syntax here, but in fact, it's really just a matter of properly prototyped subroutines. A subroutine that is prototyped as ;&$ will accept an optional block of code, and another optional argument after that:

  sub versive (;&$) { ... }
  versive; # no args
  versive { some code block }; # code ref
  versive sub { some code block }; # same thing
  versive { some code block } $scalar; # code ref + scalar

The Contextual::Return module uses such prototyped subroutines in a nested way. In that earlier code, the VOID subroutine is evaluated first, and returns an object that the SCALAR subroutine sees as its second argument. Each subroutine in the list modifies the return object so that it works appropriately with every new attribute, and then returns the updated object. Very cool.
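
The right-to-left evaluation can be demonstrated with a small core-Perl sketch. The names ON_LIST, ON_SCALAR, ON_VOID, and add_link are hypothetical, and the chain here is just a plain hashref. This is not how Contextual::Return builds its object; it only shows how (;&$) prototypes let each link receive the link to its right as its second argument.

```perl
use strict;
use warnings;

my @order;    # records the order in which the links actually run

# Shared helper: file the block under a name in the chain's hashref.
sub add_link {
    my ($name, $block, $chain) = @_;
    push @order, $name;
    $chain ||= {};                # the rightmost link starts a fresh chain
    $chain->{$name} = $block;
    return $chain;
}

# Each link takes an optional block and an optional "rest of the chain".
sub ON_LIST   (;&$) { return add_link('list',   @_) }
sub ON_SCALAR (;&$) { return add_link('scalar', @_) }
sub ON_VOID   (;&$) { return add_link('void',   @_) }

my $chain = ON_LIST { (1, 2, 3) } ON_SCALAR { 'one' } ON_VOID { 'none' };

print "ran in order: @order\n";    # prints "ran in order: void scalar list"
print "list block gives: ", join(',', $chain->{list}->()), "\n";
```

Note that none of the blocks has run by the time the chain is built; each is only stored as a code reference until something asks for it.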

At least, that explains how the nesting works. But how does the right value get selected? Well, even after staring at the code for a while, all I can say is my head hurts. It has something to do with scalar context being passed down to every subsequent element of the chain, and in those cases, a smart object is returned that does the right thing in the right context. Only the head of the chain might be in list or void context, and there the smart object is evaluated in the proper fashion. If the head of the chain is evaluated in a scalar context, then the smart object (of type Contextual::Return::Value) itself is returned, to ``discover'' in the caller exactly how it is needed (boolean, number, string, or some kind of reference, as we'll see shortly).
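
One plausible way for an object to ``discover'' how it is being used is Perl's core overload pragma. The sketch below is an assumption about the general technique, not a copy of the module's internals, and the SmartValue class is invented for illustration. It holds one code block per kind of use and runs the matching one on demand.

```perl
use strict;
use warnings;

package SmartValue;    # hypothetical class, for illustration only

use overload
    'bool'   => sub { $_[0]{bool}->() },    # used as a truth value
    '0+'     => sub { $_[0]{num}->()  },    # used as a number
    '""'     => sub { $_[0]{str}->()  },    # used as a string
    fallback => 1;                          # derive ==, =~, and so on

sub new {
    my ($class, %blocks) = @_;
    return bless({%blocks}, $class);
}

package main;

my $result = SmartValue->new(
    bool => sub { 1 },
    num  => sub { 13 },
    str  => sub { 'Permission Denied' },
);

print "numeric test passed\n" if $result == 13;
print "string test passed\n"  if $result =~ /denied/i;
```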

The scalar return value is unique for an additional reason: the code block associated with SCALAR isn't executed until it is needed. This creates a lazy invocation. For example:

  my $x = SCALAR { print "executed\n"; 3 };

The block with print isn't executed immediately. In fact, the $x value can be passed around the program at will, as long as it is never needed as a boolean, numeric, string, or reference value. But once it is, the block is executed, and the return value is cached. As a reminder of this, we can use LAZY in place of SCALAR with no change in meaning.

We can choose to avoid the caching of the value by flagging the result as ACTIVE:

  my $now = ACTIVE SCALAR { localtime };

Each time $now is evaluated in a scalar context, the block is re-executed, returning a different timestring. This can be used for logging, for example:

  warn "$now: approaching memory limits";
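
The difference between the cached and the ACTIVE behavior can be sketched in core Perl. The LazyScalar class and its active option are hypothetical names chosen for this example, and the real module surely differs in detail. The block runs on first use and is cached, unless the active flag asks for a fresh run every time.

```perl
use strict;
use warnings;

package LazyScalar;    # hypothetical class, for illustration only

use overload
    '""'     => sub { $_[0]->value },
    '0+'     => sub { $_[0]->value },
    'bool'   => sub { $_[0]->value },
    fallback => 1;

sub new {
    my ($class, $code, %opt) = @_;
    return bless({ code => $code, active => $opt{active} ? 1 : 0 }, $class);
}

sub value {
    my ($self) = @_;
    return $self->{code}->() if $self->{active};    # re-run on every use
    $self->{cache} = $self->{code}->()              # run once, then reuse
        unless exists $self->{cache};
    return $self->{cache};
}

package main;

my $runs = 0;
my $lazy = LazyScalar->new(sub { $runs++; 3 });
# Nothing has been executed yet: $runs is still 0 here.
my $sum = $lazy + $lazy;    # two uses, but the block runs only once

my $ticks  = 0;
my $active = LazyScalar->new(sub { ++$ticks }, active => 1);
my $first  = "$active";     # block runs
my $second = "$active";     # block runs again

print "lazy: sum=$sum after $runs run(s)\n";      # sum=6 after 1 run(s)
print "active: first=$first second=$second\n";    # first=1 second=2
```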

Although the LIST and VOID values are executed immediately during evaluation, any scalar-context usage can further be distinguished into separate types. For example, we can return separate values depending on whether the scalar is used in a boolean, numeric, or string context:

  my $x = BOOL { 0 } NUM { 3.14 } STR { "pi" };
  unless ($x) { warn "it is false!" } # executed!
  print "The value is $x\n"; # The value is pi\n
  $y = $x + 0; # $y = 3.14

All three of these examples can be executed in sequence, because the value in $x remains an object even after being examined. Each block is executed lazily, however, caching the result of that type of return value. I can disable the caching with ACTIVE as before, causing the block to be repeatedly executed. For example:

  my $now = ACTIVE NUM { time } STR { localtime };

Now, each time I use $now in a string context, it's the current interpreted localtime string, but when I use it in a numeric context, it's the Unix epoch time value. Cool.

If I'd rather ``lock in'' the type, so it's not some sort of Schroedinger's Cat that may or may not be a numeric value, and collapses instead to the first observation, I can add FIXED:

  my $x = FIXED BOOL { 0 } NUM { 3.14 } STR { "pi" };

Now, the moment I use it in one of the three contexts, the other two contexts are revectored to the new value. For example:

  my $three = FIXED BOOL { 3 } STR { "three" };

If I use this first in a boolean context, it locks in as 3. Otherwise, it locks in as three. And by ``lock in'', I mean that it no longer has any magical properties... in the Harry Potter universe, it is now a muggle.

Besides boolean, numeric, and string contexts, we can also distinguish between various types of reference contexts. For example, I could create a value that acts appropriately whether I'm dereferencing it as a hashref or arrayref. Consider a file-stat subroutine that returns an array ref of stat values in an arrayref context, or can be used as a hashref to pick out stat items by name:

  my @STAT = qw(dev ino mode nlink uid gid
                rdev size atime mtime ctime blksize blocks);
  sub statify {
    my @stat = stat(shift) or return FAIL;
    return
      ARRAYREF { \@stat }
      HASHREF {
        my %stat;
        @stat{@STAT} = @stat;
        \%stat;
      }
    ;
  }

Now I can call this as:

  my $passwd = statify("/etc/passwd");
  my $n1 = $passwd->{size}; # select size via hashref
  my $n2 = $passwd->[7]; # select size via arrayref

And both work equally well. (If I had included FIXED, the second one would have failed, as the first use would have already locked in the hashref nature of the return value.)
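
For the curious, a value that answers to both kinds of dereference can be built in core Perl by overloading the @{} and %{} operators. The sketch below is only an illustration of that technique: the DualRef class is invented, the three field names and values are placeholders rather than real stat output, and nothing here is taken from the module's source. The object is a blessed reference to a scalar, so that the handlers can reach the underlying data without triggering themselves.

```perl
use strict;
use warnings;

package DualRef;    # hypothetical class, for illustration only

use overload
    '@{}' => sub { ${ $_[0] }->{values} },          # used as an arrayref
    '%{}' => sub {                                  # used as a hashref
        my $data = ${ $_[0] };
        my %by_name;
        @by_name{ @{ $data->{names} } } = @{ $data->{values} };
        return \%by_name;
    },
    fallback => 1;

sub new {
    my ($class, $names, $values) = @_;
    my $data = { names => $names, values => $values };
    return bless(\$data, $class);    # bless a ref to the scalar holding the data
}

package main;

# Placeholder names and numbers, not real stat results.
my $info = DualRef->new([qw(dev ino mode)], [10, 20, 30]);

print $info->[1],    "\n";    # prints 20, selected by position
print $info->{mode}, "\n";    # prints 30, selected by name
```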

The FAIL line illustrates another feature of a Contextual::Return-enhanced subroutine. If FAIL is returned in a boolean context, an undef is substituted. However, if FAIL is returned in any other context, an exception is thrown. We can enhance the message of the exception with a parameter to FAIL:

  my @stat = stat(shift) or return FAIL { "stat failed: $!" };

This default behavior is heavily configurable: see the manpage for more details.

What if we had invoked statify in a list context, given that there's no LIST block? In a nice interpretation of ``do what I mean'', the ARRAYREF value is automatically dereferenced for us:

  my @full_stat = statify("/etc/passwd");
  print $full_stat[7];

In fact, there are many automatic defaults for the various types of return values. I could have just as easily spelled out the LIST return:

  my @STAT = qw(dev ino mode nlink uid gid
                rdev size atime mtime ctime blksize blocks);
  sub statify {
    my @stat = stat(shift) or return FAIL;
    return
      LIST { @stat }
      HASHREF {
        my %stat;
        @stat{@STAT} = @stat;
        \%stat;
      }
    ;
  }

This will result in an arrayref of the LIST return when used in an arrayref context. The full list of return values includes:

    DEFAULT
        VOID
        NONVOID
            LIST
            SCALAR
                VALUE
                    STR
                    NUM
                    BOOL
                REF
                    SCALARREF
                    ARRAYREF
                    CODEREF
                    HASHREF
                    GLOBREF
                    OBJREF

This list is nested from the most general to the most specific, although you should consult the manpage for some interesting exceptions. The general routines will be used when a more specific routine is not available.
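
The general-to-specific fallback can be pictured as a walk up the tree above. The following core-Perl sketch encodes the tree exactly as listed and deliberately ignores the exceptions the manpage describes. The %parent table and the find_handler subroutine are names invented here, not part of the module.

```perl
use strict;
use warnings;

# Each name points at the more general name one level up in the tree.
my %parent = (
    VOID      => 'DEFAULT', NONVOID  => 'DEFAULT',
    LIST      => 'NONVOID', SCALAR   => 'NONVOID',
    VALUE     => 'SCALAR',  REF      => 'SCALAR',
    STR       => 'VALUE',   NUM      => 'VALUE',  BOOL    => 'VALUE',
    SCALARREF => 'REF',     ARRAYREF => 'REF',    CODEREF => 'REF',
    HASHREF   => 'REF',     GLOBREF  => 'REF',    OBJREF  => 'REF',
);

# Hypothetical lookup: return the name of the most specific handler that
# was supplied for the requested context, or nothing if none applies.
sub find_handler {
    my ($handlers, $context) = @_;
    for (my $name = $context; defined $name; $name = $parent{$name}) {
        return $name if exists $handlers->{$name};
    }
    return;
}

my %handlers = (NUM => sub { 3.14 }, SCALAR => sub { 'pi' });

print find_handler(\%handlers, 'NUM'),     "\n";    # prints NUM
print find_handler(\%handlers, 'STR'),     "\n";    # prints SCALAR
print find_handler(\%handlers, 'HASHREF'), "\n";    # prints SCALAR
```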

One problem that I've run into from time to time that is solved easily by Contextual::Return is how to perform an evaluation in the context of the caller, but still execute code after that evaluation. If a RESULT block is present, it overrides the normal ``last expression evaluated'' logic for subroutine return values, and also is executed in the context of the caller.

For example, suppose we want to execute a coderef as a transaction against a database handle:

  sub perform {
    my $dbh = shift;
    my $code = shift;

    return
      VALUE {
        $dbh->begin_work;
        RESULT { $code->() };
        $dbh->commit;
      };
  }

In this case, the VALUE block is selected regardless of scalar or list context. The RESULT block is executed in the context of the caller, and sets the result value. If the result block throws an exception, the commit is skipped. Simple, and elegant. But not quite complete: we should probably revert the transaction. The presence of a RECOVER block causes all other blocks to act as if they had an eval wrapper. If any exception occurs, the $@ variable is set, and we end up in the RECOVER block, which can force a RESULT if needed:

  sub perform {
    my $dbh = shift;
    my $code = shift;

    return
      VALUE {
        $dbh->begin_work;
        $code->();
      }
      RECOVER {
        if ($@) { # exception
          $dbh->rollback; # roll back
          RESULT { FAIL }; # exception
        } else {
          $dbh->commit; # all good!
        }
      };
  }
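
For comparison, here is roughly what the same job looks like by hand, using only wantarray and eval. This is a sketch under assumptions: FakeHandle is a made-up stand-in that merely logs method calls so the example runs without a database, and perform_by_hand is a hypothetical name. The bookkeeping it needs is a fair picture of what RESULT and RECOVER save us from writing.

```perl
use strict;
use warnings;

{
    package FakeHandle;    # hypothetical stand-in for a database handle
    sub new        { my ($class) = @_; return bless({ log => [] }, $class) }
    sub begin_work { push @{ $_[0]{log} }, 'begin';    return 1 }
    sub commit     { push @{ $_[0]{log} }, 'commit';   return 1 }
    sub rollback   { push @{ $_[0]{log} }, 'rollback'; return 1 }
}

sub perform_by_hand {
    my ($dbh, $code) = @_;
    my $want = wantarray;    # capture the caller's context up front
    my @result;

    $dbh->begin_work;
    my $ok = eval {
        if    ($want)         { @result    = $code->() }    # list context
        elsif (defined $want) { $result[0] = $code->() }    # scalar context
        else                  { $code->() }                 # void context
        1;
    };
    if (!$ok) {
        my $error = $@;
        $dbh->rollback;      # undo the partial work
        die $error;          # then re-throw to the caller
    }
    $dbh->commit;
    return $want ? @result : $result[0];
}

my $handle = FakeHandle->new;
my @rows = perform_by_hand($handle, sub { (1, 2, 3) });
eval { perform_by_hand($handle, sub { die "boom\n" }) };

print scalar(@rows), " rows; log: @{ $handle->{log} }\n";
# prints "3 rows; log: begin commit begin rollback"
```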

I didn't have room to talk about the LVALUE support here, but hopefully I've whetted your appetite enough for you to consider Contextual::Return on your next tricky return value problem. Until next time, enjoy!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.