Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Unix Review Column 62 (Jan 2006)

[suggested title: ``Generating object accessors'']

The traditional Perl object model is rather simple. Create a hash, use a reference to that hash as the object (appropriately blessed so that we can find the associated package of methods), and use the hash elements as the member variables. So for a class Rectangle, we write the constructor as:

  package Rectangle;
  sub new {
    my $class = shift;
    die if @_ % 2;
    my %args = @_;
    my $self = {
      width => $args{width} || 0,
      height => $args{height} || 0,
    };
    return bless $self, $class;
  }

We can invoke this (probably buggy) constructor as:

  my $small_rect = Rectangle->new; # 0 by 0

to use the default values for width and height, or as:

  my $square = Rectangle->new(width => 10, height => 10);

to override those to set our own height and width. To this class, we can add methods for computing the total area, for example:

  sub area {
    my $self = shift;
    return $self->{width} * $self->{height};
  }

So far, so good. Typically, we'll also want a way to fetch the current object attributes (here, height and width), and perhaps update them if it's appropriate.

At first, it looks tempting to write code outside the class that functions similarly to the code inside the class:

  my $w = $square->{width}; # bad idea

But this is a bad idea, because it breaks the encapsulation of the object. What if the author of the class decides to switch to an array representation instead of a hash? Or wants to store the width with some sort of unit scaling (12 inches instead of 1 foot)?

Instead of directly accessing the internals of the object, the general rule is that the class author provides getters and setters (often called accessors) to fetch and store the values, like so:

  sub width {
    my $self = shift;
    return $self->{width};
  }
  sub set_width {
    my $self = shift;
    $self->{width} = shift;
  }
  sub height {
    my $self = shift;
    return $self->{height};
  }
  sub set_height {
    my $self = shift;
    $self->{height} = shift;
  }

Now I can fetch the width as:

  my $w = $square->width;

Even if the Rectangle implementation is later changed to be an array-based class (or an inside-out class, or a prototyped object, and so on), we presume the class author will maintain a sensible width method to do the right thing for the external protocol.
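To make that concrete, here is a minimal sketch of an array-based rectangle that keeps the same external protocol. The package name Rectangle::ArrayBased and the WIDTH and HEIGHT constants are assumptions made for illustration, not part of the class shown above:

```perl
package Rectangle::ArrayBased;   # hypothetical name, for illustration only
use strict;
use warnings;

# Slot numbers inside the array that backs each object.
use constant { WIDTH => 0, HEIGHT => 1 };

sub new {
  my $class = shift;
  die "odd number of arguments" if @_ % 2;
  my %args = @_;
  my $self = [];
  $self->[WIDTH]  = $args{width}  || 0;
  $self->[HEIGHT] = $args{height} || 0;
  return bless $self, $class;
}

# Same method names as the hash-based class; only the storage differs.
sub width      { my $self = shift; return $self->[WIDTH] }
sub set_width  { my $self = shift; $self->[WIDTH] = shift }
sub height     { my $self = shift; return $self->[HEIGHT] }
sub set_height { my $self = shift; $self->[HEIGHT] = shift }
sub area       { my $self = shift; return $self->width * $self->height }

package main;

my $square = Rectangle::ArrayBased->new(width => 10, height => 10);
my $w = $square->width;    # the caller's code is unchanged
$square->set_width(12);
print $square->area, "\n";
```

Code that had reached into $square->{width} directly would now fail with a "Not a HASH reference" error, while callers that stuck to the accessors keep working.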

But take a look at that code! It's repetitious. In fact, to write that code, I grabbed the first chunk, copied it and pasted it, and changed all width to height. If you ever find yourself using cut-n-paste in your editor, you're probably programming incorrectly.

But of course, this is Perl. Perl is good at string manipulation. Is there some way we can get Perl to do the right thing? Certainly: there are a few ways to do it, keeping with Perl traditions.

One straightforward way that requires very little imagination is to let Perl do the cutting and pasting at compile time:

  BEGIN {
    for my $member (qw(width height)) {
      eval qq{
        sub $member {
          my \$self = shift;
          return \$self->{$member};
        }
        sub set_$member {
          my \$self = shift;
          \$self->{$member} = shift;
        }
      }; die $@ if $@;
    }
  }

First note that the entire thing is wrapped within a BEGIN block. This means that this code will be executed at compile time, as if we had literally stuffed the subroutine definitions into the source text. The loop evaluates two strings, treating them as Perl code. The $member variable's value establishes the right subroutine names and hash element accesses.

Note that I had to backslash the other double-quote significant items here (all dollars, at-signs, and backslashes), because they would otherwise be expanded while building the string, instead of being treated as source text. Also note that if the eval fails, I want the compilation to die, because that's one of those ``should never happen'' problems.

While this is a workable solution, it's easy to make a mistake getting all the backslashes just right in the double-quoted string, leading to obscure errors. Also, because the code is compiled by an eval, any error messages will report rather useless eval-relative line numbers instead of a location in your source file.
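One way to soften the line-number problem, offered here as a sketch and not as something the approach above requires, is to begin each generated string with a #line directive so that errors name the generated code. The label accessor_for_..., the package name Rectangle::Labeled, and the extra argument check in the setter are all made up for illustration:

```perl
package Rectangle::Labeled;   # hypothetical package, for illustration only
use strict;
use warnings;

BEGIN {
  for my $member (qw(width height)) {
    # The "#line" directive must begin a line of the evaluated text;
    # it renames the file and resets the line count for what follows.
    my $code = qq{#line 1 "accessor_for_$member"\n} . qq{
      sub $member {
        my \$self = shift;
        return \$self->{$member};
      }
      sub set_$member {
        my \$self = shift;
        die "no value given" unless \@_;   # illustrative check only
        \$self->{$member} = shift;
      }
    };
    eval $code; die $@ if $@;
  }
}

package main;

my $rect = bless { width => 3, height => 4 }, 'Rectangle::Labeled';
eval { $rect->set_width };   # deliberately omit the value
print $@;                    # the message names accessor_for_width
```

The backslash problem remains, of course; the directive only makes the resulting messages easier to trace.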

Another solution involves a bit more trickiness, but doesn't require the use of eval. We can construct a coderef that does the right thing, and then install it into the symbol table as if it were a named subroutine:

  BEGIN {
    for my $member (qw(width height)) {
      no strict 'refs';
      *{$member} = sub {
        my $self = shift;
        return $self->{$member};
      };
      *{"set_$member"} = sub {
        my $self = shift;
        $self->{$member} = shift;
      };
    }
  }

Without eval, we can now write the code as normal Perl code, without worrying about the extra required level of backslashes. Once we've turned off the use strict checking for symbolic references, we can use the $member variable as a symbolic glob ref (the star syntax), and update the coderef portion of that symbol with our constructed coderef.

OK, so maybe it's a bit more magic, but it's actually more straightforward in Perl technology. We're not firing up the compiler repeatedly to evaluate a templated piece of text. Instead, we're installing a series of coderefs as named subroutines in the current package.
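Putting the pieces together, a minimal runnable sketch of the glob-assignment approach might look like the following. The constructor is along the lines of the earlier one, and the can() check at the end is only there to show that the installed coderefs behave like ordinary named methods:

```perl
package Rectangle;
use strict;
use warnings;

sub new {
  my $class = shift;
  die "odd number of arguments" if @_ % 2;
  my %args = @_;
  return bless {
    width  => $args{width}  || 0,
    height => $args{height} || 0,
  }, $class;
}

BEGIN {
  for my $member (qw(width height)) {
    no strict 'refs';   # allow a string to be used as a glob name
    # Each closure captures its own $member for this iteration.
    *{$member}       = sub { my $self = shift; return $self->{$member} };
    *{"set_$member"} = sub { my $self = shift; $self->{$member} = shift };
  }
}

package main;

my $square = Rectangle->new(width => 10, height => 10);
$square->set_height(20);
print $square->width, " by ", $square->height, "\n";   # 10 by 20

# The installed coderefs are ordinary methods, so can() finds them.
print "has set_width\n" if Rectangle->can('set_width');
```

Note that the no strict 'refs' is confined to the loop body, so the rest of the package still gets full strict checking.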

What if our object has a few dozen member variables, and we generally use only a few of the getters and setters in a given program? For this, I might choose a different approach, using AUTOLOAD.

When a method is called and cannot be found, Perl searches for a method named AUTOLOAD along the same inheritance path. If one is found, it is invoked with the original arguments, and the original method name (fully qualified with a package name) is placed in the $AUTOLOAD variable of the package where that AUTOLOAD was defined.
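The lookup just described can be seen in a small sketch. The Shape and Circle classes here are hypothetical and exist only to show which name ends up in $AUTOLOAD when the AUTOLOAD method is inherited:

```perl
package Shape;     # hypothetical parent class, for illustration only
use strict;
use warnings;

sub new { my $class = shift; return bless {}, $class }

sub AUTOLOAD {
  my $self = shift;
  our $AUTOLOAD;                         # this is $Shape::AUTOLOAD
  return if $AUTOLOAD =~ /::DESTROY$/;   # object destruction also lands here
  return "asked for $AUTOLOAD";
}

package Circle;    # hypothetical subclass with no methods of its own
our @ISA = ('Shape');

package main;

my $circle = Circle->new;
print $circle->radius, "\n";   # asked for Circle::radius
```

The name is qualified with the class where the method lookup started (Circle), even though the AUTOLOAD that handled it lives in Shape. The DESTROY check is there because Perl also looks for a DESTROY method when an object goes away, and an AUTOLOAD will catch that call as well.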

For example, if we didn't define the getters and setters as above, but instead defined:

  sub AUTOLOAD {
    my $self = shift;
    my ($method) = (our $AUTOLOAD) =~ /^.*::(\w+)$/
      or die "Weird: $AUTOLOAD";
    ...
  }

then we now have $method as the original method name. From here, we can either (1) act like the getter/setter, or (2) install a getter/setter, and execute it. Let's look at the first option:

  my %MEMBERS = map { $_ => 1 } qw(width height);
  sub AUTOLOAD {
    my $self = shift;
    my ($method) = (our $AUTOLOAD) =~ /^.*::(\w+)$/
      or die "Weird: $AUTOLOAD";
    if ($MEMBERS{$method}) { # getter
      return $self->{$method};
    }
    if ((my $m = $method) =~ s/^set_//) { # might be setter
      if ($MEMBERS{$m}) { # yes, it's a member
        return $self->{$m} = shift; # act like a setter
      }
    }
    ...
  }

Here, with the help of the %MEMBERS hash, we validate possible method names against our known member variables. If we have a getter method, the right value is returned. For a setter method (after we strip set_ from the method name), we update the right member variable.

The last step is deciding what to do if we fall all the way through this chain. Because we're in an AUTOLOAD, we might also be trapping methods that aren't implemented. So, we have to give the rest of the system a chance to respond to the method (perhaps with a parent-class AUTOLOAD). Probably the simplest way is to execute a SUPER:: call, so at the end of the subroutine, we add:

    $method = "SUPER::$method";
    return $self->$method(@_);

If there's no AUTOLOAD higher in the chain, we'll get blamed for trying to call something that didn't exist, but hopefully that will be shaken out in testing. At least we won't die silently.
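The second option mentioned earlier, installing a getter or setter on first use and then executing it, might be sketched as follows. This is one plausible arrangement under the same %MEMBERS convention, not the only way to do it; the DESTROY guard and the use of goto to re-dispatch are choices made for this sketch:

```perl
package Rectangle;
use strict;
use warnings;

my %MEMBERS = map { $_ => 1 } qw(width height);

sub new {
  my $class = shift;
  die "odd number of arguments" if @_ % 2;
  my %args = @_;
  return bless {
    width  => $args{width}  || 0,
    height => $args{height} || 0,
  }, $class;
}

sub AUTOLOAD {
  my ($method) = (our $AUTOLOAD) =~ /^.*::(\w+)$/
    or die "Weird: $AUTOLOAD";
  return if $method eq 'DESTROY';   # destruction is routed here too

  my $code;
  if ($MEMBERS{$method}) {                     # getter
    $code = sub { my $self = shift; return $self->{$method} };
  }
  elsif ((my $m = $method) =~ s/^set_//) {     # might be a setter
    $code = sub { my $self = shift; return $self->{$m} = shift }
      if $MEMBERS{$m};
  }

  if ($code) {
    no strict 'refs';
    *{$method} = $code;   # later calls find this sub and skip AUTOLOAD
    goto &$code;          # rerun the call with the original @_
  }

  # Not one of ours: give a parent class the chance to respond.
  my $self = shift;
  $method = "SUPER::$method";
  return $self->$method(@_);
}

package main;

my $square = Rectangle->new(width => 10, height => 10);
print "not yet defined\n" unless Rectangle->can('width');
print $square->width, "\n";        # first call goes through AUTOLOAD
print "now defined\n" if Rectangle->can('width');
$square->set_width(12);
print $square->width, "\n";
```

Only the accessors a program actually calls are ever created, and each one pays the AUTOLOAD cost just once.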

If you don't want to do all this by hand, there are a few modules in the CPAN that can help, such as Class::AccessorMaker, Class::MethodMaker, accessors, Spiffy, Class::MakeMethods, and so on. Check those out to see if something might suit you better, but don't be afraid to roll your own. Until next time, enjoy!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
