Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Unix Review Column 52 (May 2004)
[suggested title: Constructing Objects]
To construct an object in Perl, you need to select a valid package name for the object's class, populate that package with subroutines to define the methods, set the value of @ISA within that package to define the base (parent) classes for that class, and then create a blessed reference.
For example, we can make widgets that know how to say their name and take on a new name with the following code (I'll describe $self in a moment):
package Widget;
sub display { my $self = shift; print $self->{name}, "\n"; }
sub rename { my $self = shift; $self->{name} = shift; $self; }
A constructed object compatible with this definition has to be a hashref with at least a key of name holding the name of the object. We can construct such a hashref like so:
my $dog = { name => 'Spot' }; bless $dog, 'Widget';
The bless operation puts a little post-it note on the hash data structure (not the reference!) that says "I belong to Widget". Now, we can invoke the methods like so:
$dog->display;           # prints "Spot\n"
$dog->rename("Fido");
$dog->display;           # prints "Fido\n"
How does this work? To execute the rename call, for example, Perl constructs an argument list consisting of the object variable ($dog) plus any arguments given to the method, resulting in:
($dog, "Fido")
Next, Perl looks for a subroutine in the package given by the post-it note (the object's class) named the same as the method. The subroutine Widget::rename gets invoked, and the first argument ends up in $self. The second argument gets assigned as an element of the hash, and finally the subroutine returns $self (not a requirement, but handy for other operations).
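The dispatch just described can be checked directly. What follows is a minimal sketch, assuming no inheritance is involved: the method call and the fully qualified subroutine call run the same code with the same argument list. The name "Rex" is only an illustration.

```perl
use strict;
use warnings;

{
    package Widget;
    sub display { my $self = shift; print $self->{name}, "\n"; }
    sub rename  { my $self = shift; $self->{name} = shift; $self; }
}

my $dog = bless { name => 'Spot' }, 'Widget';

# Method-call syntax: Perl finds Widget::rename through the blessing.
$dog->rename("Fido");
print $dog->{name}, "\n";         # prints "Fido\n"

# The same subroutine called directly, with the object passed explicitly.
Widget::rename($dog, "Rex");
print $dog->{name}, "\n";         # prints "Rex\n"

# Because rename returns $self, calls can be chained.
$dog->rename("Spot")->display;    # prints "Spot\n"
```

The chained call at the end is one of the "other operations" that returning $self makes convenient.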
Normally, we wouldn't hand-construct the object. The lines to create the hash and bless the object will be found in a constructor within the class. We'll invoke the constructor as:
my $dog = Widget->named("Spot");
To execute this class method invocation, Perl again constructs an argument list, but this time puts the name of the package as the first element:
("Widget", "Spot")
And upon finding the Widget::named subroutine, invokes it:
package Widget;

sub named {
    my $class = shift;    # gets Widget (usually)
    my $self = { name => shift };
    bless $self, $class;
    $self;
}
Comparing this code to the code above, we see that we'll be returning $self, which is just like the $dog from before. (One common optimization is to know that bless also returns $self in this case, so we can leave that last line out with no change in result.)
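As a sketch of that optimization, here is the same constructor with the final line dropped, so that the value of bless becomes the return value of the subroutine:

```perl
use strict;
use warnings;

{
    package Widget;
    sub named {
        my $class = shift;
        my $self  = { name => shift };
        bless $self, $class;    # bless returns $self, which named then returns
    }
}

my $dog = Widget->named("Spot");
print ref($dog), " named $dog->{name}\n";    # prints "Widget named Spot\n"
```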
For a more detailed explanation of this process, please see my most recent tutorial book, Learning Perl Objects, References, and Modules, from O'Reilly and Associates.
Now, why didn't we just hardcode the Widget value into the bless, and what's up with that "usually" in the comment? The complication arises when we get to inheritance. Suppose we have a subclass called ColoredWidget that inherits from Widget and adds two methods to manage the color of the widget:
package ColoredWidget;
use base qw(Widget);    # sets @ISA
sub color { my $self = shift; $self->{color}; }
sub recolor { my $self = shift; $self->{color} = shift; $self; }
Calling color or recolor on a ColoredWidget uses the subroutines found in the ColoredWidget package, but calling named on ColoredWidget uses the @ISA to find the named routine from the base class, Widget. In this case, the argument list will look like:
("ColoredWidget", "Spot")
Because the first argument to named is shifted off into $class, and then used in the bless, we get an object of class ColoredWidget instead of Widget.
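A small sketch makes the effect visible. It assumes both classes live in one file, so @ISA is assigned directly rather than through use base; the name "Rover" and the color "brown" are illustrative.

```perl
use strict;
use warnings;

{
    package Widget;
    sub named {
        my $class = shift;    # whatever class the method was called on
        my $self  = { name => shift };
        bless $self, $class;
    }
}

{
    package ColoredWidget;
    our @ISA = ('Widget');    # the same relationship "use base qw(Widget)" sets up
    sub color   { my $self = shift; $self->{color}; }
    sub recolor { my $self = shift; $self->{color} = shift; $self; }
}

my $plain   = Widget->named("Spot");
my $colored = ColoredWidget->named("Rover")->recolor("brown");

print ref($plain),    "\n";    # prints "Widget\n"
print ref($colored),  "\n";    # prints "ColoredWidget\n"
print $colored->color, "\n";   # prints "brown\n"
```

Had named hardcoded 'Widget' in its bless, $colored would be a plain Widget and the call to recolor would fail.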
Our display method for ColoredWidget needs a bit more work now though, if we want the color as well. We can use overriding to handle that:
package ColoredWidget;

sub display {
    my $self = shift;
    # Widget defines no name accessor, so read the hash element directly
    print $self->{name}, ", colored ", $self->color, "\n";
}
Now, for ColoredWidget objects, this version of display is used in preference to the previous version. We can also extend rather than override by reusing the base class version of display:
package ColoredWidget;
sub display { my $self = shift; $self->SUPER::display; print "[color: ", $self->color, "]\n"; }
Now when we invoke display on a ColoredWidget, we first invoke the first display found in the base class (as if there were no definition in this class). That invocation produces the name by itself. Then control returns to this method, and we add the color in brackets below.
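Putting the pieces together, a self-contained sketch of the extended display might look like this. As before, @ISA is set directly because everything is in one file, and "brown" is an illustrative value.

```perl
use strict;
use warnings;

{
    package Widget;
    sub named   { my $class = shift; my $self = { name => shift }; bless $self, $class; }
    sub display { my $self = shift; print $self->{name}, "\n"; }
}

{
    package ColoredWidget;
    our @ISA = ('Widget');
    sub color   { my $self = shift; $self->{color}; }
    sub recolor { my $self = shift; $self->{color} = shift; $self; }
    sub display {
        my $self = shift;
        $self->SUPER::display;                    # runs Widget::display first
        print "[color: ", $self->color, "]\n";    # then adds the color line
    }
}

my $colored = ColoredWidget->named("Spot")->recolor("brown");
$colored->display;
# prints:
#   Spot
#   [color: brown]
```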
The constructor here is named named because it reads like what it does: give me a Widget named Spot. But for tradition's sake, I could also call the constructor new. In fact, I might make a constructor new that returns an unnamed Widget (the name left as undef if referenced). This'd look like:
package Widget;
sub new { my $class = shift; bless {}, $class; }
Here, a simple reference to an empty hash is generated, blessed into the right class, and returned. To make Spot, I can now say:
my $dog = Widget->new; $dog->rename("Spot");
That's a little more clumsy, but at least it gets the job done.
Another advantage to always naming your constructor as new is that you can easily create an object that is "like" another object. For example, if we have an unknown object $object, we can call ref $object to get its class, then create another object of the same class by calling new:
my $similar = (ref $object)->new;
But this works only if all of our possible classes of $object understand the same new method. Fortunately, for the times we're likely to do this, we've also made the classes work this way.
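Here is a short sketch of that idiom, assuming two classes that share the argument-free new shown earlier. The loop stands in for code that receives objects without knowing their classes.

```perl
use strict;
use warnings;

{
    package Widget;
    sub new    { my $class = shift; bless {}, $class; }
    sub rename { my $self = shift; $self->{name} = shift; $self; }
}

{
    package ColoredWidget;
    our @ISA = ('Widget');    # inherits new from Widget
}

# Pretend we do not know which class each object came from.
for my $object (Widget->new, ColoredWidget->new) {
    my $similar = (ref $object)->new;    # fresh, empty object of the same class
    print ref($similar), "\n";           # prints "Widget\n", then "ColoredWidget\n"
}
```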
Another common operation is cloning: making an object that is a copy of the current object. It's not enough to simply copy the reference:
my $puppy = $dog;
This action copies the reference to the data, but not the data itself. So if I rename the $puppy, the $dog changes its name as well!
Cloning is best handled by copying all of the data. A naive clone could be executed as:
package Widget;

sub clone {
    my $self = shift;
    my $clone = { %$self };     # copy keys/values one level deep
    bless $clone, ref $self;    # copy the object class, returning $clone
}
This will work properly for objects that do not have a deep structure, such as we've seen here so far. But what if one of the object attributes is a reference to yet another data structure? Again, we're copying the reference, and not the data, so the data will be shared amongst the clones. [See this column from February 2000 for more details on deep copying.]
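To see the sharing problem in action, consider this sketch. The tags attribute is hypothetical, added here only to give the object one level of nesting.

```perl
use strict;
use warnings;

{
    package Widget;
    sub clone {
        my $self  = shift;
        my $clone = { %$self };      # shallow copy: nested references are shared
        bless $clone, ref $self;
    }
}

# "tags" is a hypothetical attribute holding an array reference.
my $dog   = bless { name => 'Spot', tags => ['good'] }, 'Widget';
my $puppy = $dog->clone;

$puppy->{name} = 'Fido';             # top-level value: independent of $dog
push @{ $puppy->{tags} }, 'small';   # nested array: the very same array as $dog's

print $dog->{name}, "\n";            # prints "Spot\n"
print "@{ $dog->{tags} }\n";         # prints "good small\n"
```

The name survives the change to the clone, but the tag list does not, because both hashes hold a reference to one array.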
An alternative method of cloning is a more piecemeal approach: teach each class in the hierarchy to clone the attributes added by that class.
package Widget;

sub clone {
    my $self = shift;
    my $clone = (ref $self)->new;      # empty object of same class
    $clone->rename($self->{name});     # copy name attribute
    $clone;
}

package ColoredWidget;

sub clone {
    my $self = shift;
    my $clone = $self->SUPER::clone;   # clone base class stuff
    $clone->recolor($self->color);     # copy color attribute
    $clone;
}
Note the similarity of design. If a class knows that it adds a complex attribute (a reference to a deeper data structure), then it can add special copying instructions for that attribute to the new clone. This is a good OO design, because the information contained within each base and derived class is maintained closely with the logic it depends on.
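As a sketch of such special copying instructions, suppose a subclass adds an attribute that holds an array reference. The TaggedWidget class, its add_tag method, and the tags attribute are all hypothetical, invented for this illustration.

```perl
use strict;
use warnings;

{
    package Widget;
    sub new    { my $class = shift; bless {}, $class; }
    sub rename { my $self = shift; $self->{name} = shift; $self; }
    sub clone {
        my $self  = shift;
        my $clone = (ref $self)->new;     # empty object of same class
        $clone->rename($self->{name});    # copy name attribute
        $clone;
    }
}

{
    # TaggedWidget and its "tags" attribute are hypothetical.
    package TaggedWidget;
    our @ISA = ('Widget');
    sub add_tag { my $self = shift; push @{ $self->{tags} }, shift; $self; }
    sub clone {
        my $self  = shift;
        my $clone = $self->SUPER::clone;                  # base class copies the name
        $clone->{tags} = [ @{ $self->{tags} || [] } ];    # copy the array, not the reference
        $clone;
    }
}

my $dog   = TaggedWidget->new->rename("Spot")->add_tag("good");
my $puppy = $dog->clone->add_tag("small");

print "@{ $dog->{tags} }\n";      # prints "good\n"
print "@{ $puppy->{tags} }\n";    # prints "good small\n"
```

Only TaggedWidget needs to know that tags is an array reference; Widget::clone stays unchanged.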
And now, before I run out of space, let me touch on a hot-button for me. The perltoot manpage contains an archetypal new routine that looks like:
sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self = {};
    ...
}
The purpose of these few lines of extra code is to permit:
my $other = $dog->new;
to act like
my $other = (ref $dog)->new;
But here's the problem. When I survey experienced object-oriented programmers, and ask them what they expect new to mean when called on an instance (without looking at the implementation), the result usually divides rather equally into three camps: those that go "huh, why would you do that" and think it should throw an error, those that say that it would clone the object, and those that say it would copy the object's class but not the contents.
So, no matter what you intend if you make your new do one of those three things, two thirds of the people who look at it will be wrong. It's not intuitive. So, don't write code like that, and especially don't just cargo-cult that from the manpage into your code. If you want an object like another object, use ref explicitly, as shown above. If you want a clone, put cloning code into your package, and call clone, as we saw earlier.
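For comparison, here is a sketch of what happens with the plain constructor from earlier in this column when it is called on an instance. It assumes a perl that treats blessing into a reference as a fatal error, which current versions do; the variable names are illustrative.

```perl
use strict;
use warnings;

{
    package Widget;
    sub new   { my $class = shift; bless {}, $class; }
    sub clone { my $self = shift; my $clone = { %$self }; bless $clone, ref $self; }
}

my $dog = Widget->new;

# Called on an instance, the plain constructor receives a reference as
# $class, and bless refuses to use a reference as a class name.
my $other = eval { $dog->new };
my $error = $@;
print "instance call failed: $error" unless $other;

# Saying what we mean works, and leaves nothing for the reader to guess.
my $sibling = (ref $dog)->new;    # same class, empty contents
my $twin    = $dog->clone;        # same class, copied contents
```

So the simple constructor already lands in the first camp without any extra code, and the other two behaviors each have an explicit spelling.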
Hopefully, you've learned at least one or two things about objects that you might not have considered before. Until next time, enjoy!