Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Unix Review Column 62 (Jan 2006)
[suggested title: "Generating object accessors"]
The traditional Perl object model is rather simple. Create a hash,
use a reference to that hash as the object (appropriately blessed so
that we can find the associated package of methods), and use the hash
elements as the member variables. So for a class Rectangle, we write
the constructor as:
    package Rectangle;
    sub new {
        my $class = shift;
        die if @_ % 2;    # arguments must be key/value pairs
        my %args = @_;
        my $self = {
            width  => $args{width}  || 0,
            height => $args{height} || 0,
        };
        return bless $self, $class;
    }
We can invoke this (probably buggy) constructor as:
    my $small_rect = Rectangle->new; # 0 by 0
to use the default values for width and height, or as:
    my $square = Rectangle->new(width => 10, height => 10);
to override those with our own width and height. To this class, we can add methods, for example one that computes the total area:
    sub area {
        my $self = shift;
        return $self->{width} * $self->{height};
    }
So far, so good. Typically, we'll also want a way to fetch the
current object attributes (here, height and width), and perhaps
update them if it's appropriate.
At first, it looks tempting to write code outside the class that functions similarly to the code inside the class:
    my $w = $square->{width}; # bad idea
But this is a bad idea, because it breaks the encapsulation of the object. What if the author of the class decides to switch to an array representation instead of a hash? Or wants to store the width with some sort of unit scaling (12 inches instead of 1 foot)?
Instead of directly accessing the internals of the object, the general rule is that the class author provides getters and setters (often called accessors) to fetch and store the values, like so:
    sub width {
        my $self = shift;
        return $self->{width};
    }
    sub set_width {
        my $self = shift;
        $self->{width} = shift;
    }
    sub height {
        my $self = shift;
        return $self->{height};
    }
    sub set_height {
        my $self = shift;
        $self->{height} = shift;
    }
Now I can fetch the width as:
    my $w = $square->width;
Even if the Rectangle implementation is later changed to be an
array-based class (or an inside-out class, or a prototyped object, and
so on), we presume the class author will maintain a sensible width
method to do the right thing for the external protocol.
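To make that concrete, here is a minimal sketch of the same class rebuilt on an array, assuming the class author keeps the method names unchanged. The slot constants WIDTH and HEIGHT are names invented for this illustration, not part of the original class.

```perl
package Rectangle;
use strict;
use warnings;

# Hypothetical slot numbers for the array-based layout.
use constant { WIDTH => 0, HEIGHT => 1 };

sub new {
    my $class = shift;
    die "odd number of arguments" if @_ % 2;
    my %args = @_;
    my $self = [];
    $self->[WIDTH]  = $args{width}  || 0;
    $self->[HEIGHT] = $args{height} || 0;
    return bless $self, $class;
}

# Same external protocol as the hash-based version.
sub width      { my $self = shift; return $self->[WIDTH]; }
sub set_width  { my $self = shift; $self->[WIDTH] = shift; }
sub height     { my $self = shift; return $self->[HEIGHT]; }
sub set_height { my $self = shift; $self->[HEIGHT] = shift; }

sub area {
    my $self = shift;
    return $self->width() * $self->height();
}

package main;

my $square = Rectangle->new(width => 10, height => 10);
$square->set_width(12);
print $square->area(), "\n";    # 120
```

Code that calls the accessors keeps working, while code that reached inside with $square->{width} now dies, because the object is no longer a hash reference.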
But take a look at that code! It's repetitious. In fact, to write
that code, I grabbed the first chunk, copied it and pasted it, and
changed every width to height. If you ever find yourself using
cut-n-paste in your editor, you're probably programming incorrectly.
But of course, this is Perl. Perl is good at string manipulation. Is there some way we can get Perl to do the right thing? Certainly: there are a few ways to do it, keeping with Perl traditions.
One straightforward way that requires very little imagination is to let Perl do the cutting and pasting at compile time:
    BEGIN {
        for my $member (qw(width height)) {
            eval qq{
                sub $member {
                    my \$self = shift;
                    return \$self->{$member};
                }
                sub set_$member {
                    my \$self = shift;
                    \$self->{$member} = shift;
                }
            };
            die $@ if $@;
        }
    }
First note that the entire thing is wrapped within a BEGIN block.
This means that this code will be executed at compile time, as if we
had literally stuffed the subroutine definitions into the source text.
The loop evaluates one string per member, treating each as Perl code
that defines a getter and a setter. The $member variable's value
establishes the right subroutine names and hash element accesses.
Note that I had to backslash the other double-quote significant items
here (all dollars, at-signs, and backslashes), because they would
otherwise be expanded while building the string, instead of being
treated as source text. Also note that if the eval fails, I want
the compilation to die, because that's one of those "should never
happen" problems.
While this is a workable solution, it's easy to make a mistake getting all the backslashes just right in the double-quoted string, leading to obscure errors. Also, because it's an eval, error messages will point at rather useless line numbers within the evaluated string instead of lines in the source file.
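As a small illustration of the line-number problem, the following sketch deliberately evaluates a broken template and prints the resulting message. The template text, with its stray closing parenthesis, is invented for this demonstration.

```perl
use strict;
use warnings;

my $member = "width";

# A deliberately broken template: note the stray ")" in the return statement.
my $code = qq{
    sub $member {
        my \$self = shift;
        return \$self->{$member} );
    }
};

# Append "; 1" so that a successful eval returns a true value.
my $ok    = eval "$code; 1";
my $error = $@;
print "eval failed: $error" unless $ok;
```

The message locates the problem with a position of the form "(eval N) line M", counted from the start of the generated string, which says nothing about where the template lives in the source file.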
Another solution involves a bit more trickiness, but doesn't require
the use of eval. We can construct a coderef that does the right
thing, and then install it into the symbol table as if it were a named
subroutine:
    BEGIN {
        for my $member (qw(width height)) {
            no strict 'refs';
            *{$member} = sub {
                my $self = shift;
                return $self->{$member};
            };
            *{"set_$member"} = sub {
                my $self = shift;
                $self->{$member} = shift;
            };
        }
    }
Without eval, we can now write the code as normal Perl code, without
worrying about the extra required level of backslashes. Once we've
turned off the use strict checking for symbolic references, we can
use the $member variable as a symbolic glob ref (the star syntax),
and update the coderef portion of that symbol with our constructed
coderef.
OK, so maybe it's a bit more magic, but it's actually more straightforward in Perl technology. We're not firing up the compiler repeatedly to evaluate a templated piece of text. Instead, we're installing a series of coderefs as named subroutines in the current package.
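The same technique can be wrapped in a reusable routine. The following is a sketch under stated assumptions: make_accessors is a hypothetical helper name, and it installs the closures into whichever package it is given rather than into the current one.

```perl
use strict;
use warnings;

# Hypothetical helper: install a getter and a setter for each member
# into the named package.
sub make_accessors {
    my ($package, @members) = @_;
    for my $member (@members) {
        no strict 'refs';    # allow a string to name a glob
        # Each closure captures its own copy of $member.
        *{"${package}::$member"} = sub {
            my $self = shift;
            return $self->{$member};
        };
        *{"${package}::set_$member"} = sub {
            my $self = shift;
            $self->{$member} = shift;
        };
    }
}

package Rectangle;

sub new {
    my $class = shift;
    die "odd number of arguments" if @_ % 2;
    my %args = @_;
    my $self = {
        width  => $args{width}  || 0,
        height => $args{height} || 0,
    };
    return bless $self, $class;
}

# Install the accessors at compile time, as in the BEGIN block above.
BEGIN { main::make_accessors(__PACKAGE__, qw(width height)) }

package main;

my $rect = Rectangle->new(width => 3, height => 4);
$rect->set_height(5);
print $rect->width, " by ", $rect->height, "\n";    # 3 by 5
```

Because the installed coderefs are ordinary named subroutines as far as the rest of Perl is concerned, Rectangle->can('set_width') returns true here.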
What if our object has a few dozen member variables, and we generally
use only a few of the getters and setters in a given program? For
this, I might choose a different approach, using AUTOLOAD.
When a method is called and cannot be found, a search for a method
named AUTOLOAD is made using the same inheritance path. If found,
that method is invoked, with the original method name (fully
qualified with a package name) passed in the $AUTOLOAD variable of
the package that defines the AUTOLOAD.
For example, if we didn't define the getters and setters as above, but instead defined:
    sub AUTOLOAD {
        my $self = shift;
        my ($method) = (our $AUTOLOAD) =~ /^.*::(\w+)$/
            or die "Weird: $AUTOLOAD";
        ...
    }
then we now have $method as the original method name. From here,
we can either (1) act like the getter/setter, or (2) install a
getter/setter, and execute it. Let's look at the first option:
    my %MEMBERS = map { $_ => 1 } qw(width height);
    sub AUTOLOAD {
        my $self = shift;
        my ($method) = (our $AUTOLOAD) =~ /^.*::(\w+)$/
            or die "Weird: $AUTOLOAD";
        if ($MEMBERS{$method}) { # getter
            return $self->{$method};
        }
        if ((my $m = $method) =~ s/^set_//) { # might be setter
            if ($MEMBERS{$m}) { # yes, it's a member
                return $self->{$m} = shift; # act like a setter
            }
        }
        ...
    }
Here, with the help of the %MEMBERS hash, we validate possible
method names against our known member variables. If we have a getter
method, the right value is returned. For a setter method (after we
strip set_ from the method name), we update the right member
variable.
The last step is deciding what to do if we fall through this chain. Because
we're in an AUTOLOAD, we might also be trapping methods that aren't
implemented. So, we have to give the rest of the system a chance
to respond to the method (perhaps with a parent-class AUTOLOAD).
Probably the simplest way is to execute a SUPER:: call, so at the
end of the subroutine, we add:
    $method = "SUPER::$method";
    return $self->$method(@_);
If there's no AUTOLOAD higher in the chain, we'll get blamed for trying to call something that didn't exist, but hopefully that will be shaken out in testing. At least we won't die silently.
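The second option mentioned earlier, installing the getter or setter and then executing it, is not shown above. One way it might look is sketched here, combining AUTOLOAD with the glob assignment from the previous technique. The early return for DESTROY is an assumption added for this sketch (Perl also routes a missing DESTROY through AUTOLOAD), as is the choice of goto to rerun the call.

```perl
package Rectangle;
use strict;
use warnings;

my %MEMBERS = map { $_ => 1 } qw(width height);

sub new {
    my $class = shift;
    die "odd number of arguments" if @_ % 2;
    my %args = @_;
    my $self = {
        width  => $args{width}  || 0,
        height => $args{height} || 0,
    };
    return bless $self, $class;
}

sub AUTOLOAD {
    our $AUTOLOAD;
    my ($method) = $AUTOLOAD =~ /^.*::(\w+)$/
        or die "Weird: $AUTOLOAD";
    return if $method eq 'DESTROY';    # assumption: no destructor is needed

    # Member name with any leading set_ removed.
    (my $member = $method) =~ s/^set_//;

    unless ($MEMBERS{$member}) {
        # Not ours: give a parent class the chance to respond.
        my $self  = shift;
        my $super = "SUPER::$method";
        return $self->$super(@_);
    }

    my $code = $member eq $method
        ? sub { my $self = shift; return $self->{$member}; }            # getter
        : sub { my $self = shift; return $self->{$member} = shift; };   # setter

    no strict 'refs';
    *{$AUTOLOAD} = $code;    # install under the fully qualified name
    goto &$code;             # rerun the call with the original @_
}

package main;

my $rect   = Rectangle->new(width => 3, height => 4);
my $before = Rectangle->can('width') ? 1 : 0;    # not installed yet
my $w      = $rect->width;                       # triggers AUTOLOAD once
my $after  = Rectangle->can('width') ? 1 : 0;    # now a real method
$rect->set_width(7);
print "$before $after $w ", $rect->width, "\n";  # 0 1 3 7
```

After the first call, later calls to the same method find the installed subroutine directly and never reach AUTOLOAD, so the lookup cost is paid only once per method that a program actually uses.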
If you don't want to do all this by hand, there are a few modules in
the CPAN that can help, such as Class::Accessormaker,
Class::MethodMaker, accessors, Spiffy, Class::MakeMethods,
and so on. Check those out to see if something might suit you better,
but don't be afraid to roll your own. Until next time, enjoy!