Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Unix Review Column 63 (Mar 2006)
[suggested title: ``Inside-out Objects'']
In [my previous article], I created a traditional hash-based Perl object: a Rectangle with two attributes (width and height) using the constructor and accessors like so:
    package Rectangle;

    sub new {
        my $class = shift;
        my %args  = @_;
        my $self  = {
            width  => $args{width}  || 0,
            height => $args{height} || 0,
        };
        return bless $self, $class;
    }

    sub width {
        my $self = shift;
        return $self->{width};
    }

    sub set_width {
        my $self = shift;
        $self->{width} = shift;
    }

    sub height {
        my $self = shift;
        return $self->{height};
    }

    sub set_height {
        my $self = shift;
        $self->{height} = shift;
    }
I can construct a 3-by-4 rectangle easily:
my $r = Rectangle->new(width => 3, height => 4);
At this point, $r is an object of type Rectangle, but it's also simply a hashref. For example, the code in set_width merely dereferences a value like $r to gain access to the hash element with a key of width. But does Perl require such code to be located within the Rectangle package? No. As a user of the Rectangle class, I could easily say:
$r->{width} = 5;
and update the width from 3 to 5. This is ``peering inside the box'', and will lead to fragile code, because we've now exposed the implementation of the object, not just the interface.
For example, suppose we modify the set_width method to ensure that the width is never negative:
    use Carp qw(croak);

    sub set_width {
        my $self  = shift;
        my $width = shift;
        croak "$self: width cannot be negative: $width" if $width < 0;
        $self->{width} = $width;
    }
If the $width is less than 0, we croak, triggering a fatal exception, but blaming the caller of this method. (We don't blame ourselves, and croak is a great way to pass the blame.)
At this point, we'll trap erroneous settings:
$r->set_width(-3); # will die
But if someone has broken the box open, we get no fault:
$r->{width} = -3; # no death
This is bad: because the data implementation has been exposed, the author of the Rectangle class no longer controls the behavior of its objects.
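As a minimal, self-contained sketch of the problem just described, the following script trims the hash-based Rectangle down to its width attribute and shows the validating setter being bypassed. The variable $setter_ok and the print statements are mine, added only for illustration.

```perl
use strict;
use warnings;

{
    package Rectangle;
    use Carp qw(croak);

    sub new {
        my $class = shift;
        my %args  = @_;
        my $self  = {
            width  => $args{width}  || 0,
            height => $args{height} || 0,
        };
        return bless $self, $class;
    }

    sub width {
        my $self = shift;
        return $self->{width};
    }

    sub set_width {
        my $self  = shift;
        my $width = shift;
        croak "$self: width cannot be negative: $width" if $width < 0;
        $self->{width} = $width;
    }
}

my $r = Rectangle->new(width => 3, height => 4);

# Going through the interface: the setter croaks, and eval traps it.
my $setter_ok = eval { $r->set_width(-3); 1 };
print "setter refused: $@" unless $setter_ok;

# Going around the interface: nothing objects to the bad value.
$r->{width} = -3;
print "width is now ", $r->width, "\n";
```

The setter does its job, but only for callers polite enough to use it.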
Besides exposing the implementation, another problem is that I have to be careful of typos. Suppose in rewriting the set_width method, I accidentally transposed the last two letters of the hash key:
$self->{widht} = $width;
This is perfectly legal Perl, and would not throw any compile-time or run-time errors. Even use strict isn't helping here, because I'm not misspelling a variable name: just naming a ``new'' hash key. Without good unit tests and integration tests, I might not even catch this error. Yes, there are some solutions to ensure that a hash's keys come from only a particular set of permitted keys, but these generally slow down the hash access significantly.
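To make the typo problem concrete, here is a small sketch using a plain hash in place of the object. It also shows one way of restricting a hash to a fixed set of keys, lock_keys from the core Hash::Util module. That choice is my assumption for illustration: it may not be the kind of solution meant above, and this sketch says nothing about its cost.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

my %rect = (width => 3, height => 4);

# A plain hash accepts the misspelled key without complaint.
$rect{widht} = 5;
my $typo_accepted = exists $rect{widht};
delete $rect{widht};

# After lock_keys, only the keys present at lock time are permitted,
# so the same assignment now dies at run time.
lock_keys(%rect);
my $locked_ok = eval { $rect{widht} = 5; 1 };
my $error     = $@;

print "plain hash accepted the typo\n" if $typo_accepted;
print "locked hash refused the typo: $error" unless $locked_ok;
```

Note that even here the mistake is caught only when the offending statement actually runs, not when the program is compiled.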
We can solve both of these problems at once, without significantly impacting the performance of our programs, by using what's come to be known as an inside-out object. First popularized by Damian Conway in the neoclassic Object-Oriented Perl book, an inside-out object creates a series of parallel hashes for the attributes (much like we had to do back in the Perl4 days before we had hashrefs). For example, instead of creating a single object for a rectangle that is 3 by 4:
my $r = { width => 3, height => 4 };
we can record its attributes in two separate hashes, keyed by some unique string:
    my $r = "some unique string";
    $width{$r} = 3;
    $height{$r} = 4;
Now, to get the width of the rectangle, we use the unique string:
my $width = $width{$r};
and to update the height, we use that same string:
$height{$r} = 10;
When we turn on use strict, and declare the %width and %height attribute hashes, this will trap any typos related to attribute names:
    use strict;
    my %width;
    my %height;
    ...
    my $r = "another unique string";
    $height{$r} = 7; # ok
    $widht{$r} = 3; # won't compile!
The typo on the width is now caught, because we don't have a %widht hash. Hooray. That solves the second problem. But how do we solve the first problem, and where do we get this ``unique string'', and how do we get methods on our object?
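If you'd like to watch strict catch the typo without breaking your own script, one approach is to compile the misspelled statement with a string eval, as in this sketch. The $compiled and $error variables are mine, and the string eval is just a demonstration device, not part of the technique.

```perl
use strict;
use warnings;

my %width;
my %height;

my $r = "another unique string";
$height{$r} = 7;    # fine: %height is declared

# The string eval compiles its contents at run time under the same
# "use strict", so the undeclared %widht is reported in $@ instead of
# stopping this whole script from compiling.
my $compiled = eval q{ $widht{$r} = 3; 1 };
my $error    = $@;

print "typo rejected: $error" unless $compiled;
```

In a real program, of course, the statement would sit directly in the source file, and the whole program would refuse to compile.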
If I assign a blessed anonymous empty hash to $r:
my $r = bless {}, "Rectangle";
then when the value of $r is used as a string, I get a nice unique string:
Rectangle=HASH(0x400180FE)
where the number comes from the hex representation of the internal memory address of the object. As long as this reference is alive, that memory address will not be reused. Aha, there's our unique string:
    sub new_7_by_3 {
        my $self = bless {}, shift;
        $height{$self} = 7;
        $width{$self} = 3;
        return $self;
    }
And this is what our constructor does! By blessing the object, we'll return to the same package for methods. By having an anonymous hashref, we're guaranteed a unique number. And as long as the lexical %height and %width hashes are in scope, we can access and update the attributes.
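The uniqueness claim is easy to check for yourself. This sketch, with variable names of my own choosing, creates two blessed empty hashes and compares their string forms. The exact hex digits will differ from run to run and machine to machine.

```perl
use strict;
use warnings;

my $first  = bless {}, "Rectangle";
my $second = bless {}, "Rectangle";

# Interpolating a reference into a string yields its class, its
# underlying type, and its address in hex.
my $first_key  = "$first";
my $second_key = "$second";

print "$first_key\n$second_key\n";
print "the keys differ\n" if $first_key ne $second_key;
```

Both references are still alive when the strings are compared, so the two addresses cannot coincide.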
But what are we returning? Sure, it's a hashref, but it's empty. There's no code that we can use to get from $r to the attribute hashes:
my $r = Rectangle->new_7_by_3;
The only way we can get the height is to have code in the same scope as the definitions of the attribute hashes:
    sub height {
        my $self = shift;
        return $height{$self};
    }
And then we can use that code in our main program:
my $height = $r->height;
The first parameter is $r, which gets used only for its unique string value, as a key into the lexical %height hash! It all Just Works.
Well, for some meaning of Works. We still have a couple of things to fix. First, there's really no reason to make an anonymous hash, because we never put anything into it, so we might as well make it a scalar:
my $self = bless \(my $dummy), shift;
Because Perl doesn't have a primitive anonymous scalar constructor, I'm cheating by making a $dummy variable.
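Here's a minimal sketch of what that buys us, using only the width attribute and leaving out any cleanup. Wrapping the package in a bare block is my own choice, so that the lexical %width is invisible to the main program, and $poke_ok exists only to record the outcome.

```perl
use strict;
use warnings;

{
    package Rectangle;

    # Visible only inside this block; the main program cannot name it.
    my %width;

    sub new {
        my $class = shift;
        my %args  = @_;
        my $self  = bless \(my $dummy), $class;
        $width{$self} = $args{width} || 0;
        return $self;
    }

    sub width {
        my $self = shift;
        return $width{$self};
    }
}

my $r = Rectangle->new(width => 3);

# $r is a reference to a scalar, so treating it as a hashref dies.
my $poke_ok = eval { $r->{width} = -3; 1 };
print "poking inside failed: $@" unless $poke_ok;

print "width is still ", $r->width, "\n";
```

The box is now closed: the only route to the width is through the methods that share a scope with %width.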
Second, we've got some tidying up to do. When a value is no longer being referenced by any variable, we say it goes out of scope. When a traditional hashref based object goes out of scope, any elements of the hash are also discarded, usually causing the values to also go out of scope (unless they are also referenced by some other live value). This all happens quite automatically and efficiently.
When our inside-out object goes out of scope, however, it doesn't ``contain'' anything. Its address-as-a-string is still being used as a key in one or more attribute hashes, and we need to get rid of those entries to mimic the traditional object mechanism. So, we'll need to add a DESTROY method:
    sub DESTROY {
        my $dead_body = $_[0];
        delete $height{$dead_body};
        delete $width{$dead_body};
        my $super = $dead_body->can("SUPER::DESTROY");
        goto &$super if $super;
    }
Note that after deleting our attributes, we also call any superclass destructor, so that it has a chance to clean up too.
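To watch the cleanup happen, this sketch adds a live_count class method that reports how many entries remain in the attribute hash. That method is hypothetical, added purely for the demonstration, and for brevity the sketch tracks only the width and omits the superclass call.

```perl
use strict;
use warnings;

{
    package Rectangle;

    my %width;

    sub new {
        my $class = shift;
        my %args  = @_;
        my $self  = bless \(my $dummy), $class;
        $width{$self} = $args{width} || 0;
        return $self;
    }

    sub DESTROY {
        my $dead_body = $_[0];
        delete $width{$dead_body};
    }

    # Hypothetical helper, for this demonstration only.
    sub live_count {
        return scalar keys %width;
    }
}

my $before = Rectangle->live_count;
my $during;
{
    my $r = Rectangle->new(width => 3);
    $during = Rectangle->live_count;
}   # the last reference to the object disappears here
my $after = Rectangle->live_count;

print "entries before/during/after: $before/$during/$after\n";
```

Without the DESTROY method, the entry would linger in %width after the object is gone, and a later object that happened to land at the same address would inherit it.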
Let's put it all together:
    package Rectangle;

    my %width;
    my %height;

    sub new {
        my $class = shift;
        my %args  = @_;
        my $self  = bless \(my $dummy), $class;
        $width{$self}  = $args{width}  || 0;
        $height{$self} = $args{height} || 0;
        return $self;
    }

    sub DESTROY {
        my $dead_body = $_[0];
        delete $height{$dead_body};
        delete $width{$dead_body};
        my $super = $dead_body->can("SUPER::DESTROY");
        goto &$super if $super;
    }

    sub width {
        my $self = shift;
        return $width{$self};
    }

    sub set_width {
        my $self = shift;
        $width{$self} = shift;
    }

    sub height {
        my $self = shift;
        return $height{$self};
    }

    sub set_height {
        my $self = shift;
        $height{$self} = shift;
    }
Not bad! Only slightly more complex than a traditional hashref implementation, and a lot safer for the ``outside''. Of course, this is a lot of code to get right, so the best thing is to let someone else do the hard work. See Class::Std and Object::InsideOut for some budding frameworks to build these objects. Until next time, enjoy!