Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Unix Review Column 52 (May 2004)

[suggested title: Constructing Objects]

To construct an object in Perl, you need to select a valid package name for the object's class, populate that package with subroutines to define the methods, set the value of @ISA within that package to define the base (parent) classes for that class, and then create a blessed reference.

For example, we can make widgets that know how to say their name and take on a new name with the following code (I'll describe $self in a moment here):

  package Widget;

  sub display {
    my $self = shift;
    print $self->{name}, "\n";
  }

  sub rename {
    my $self = shift;
    $self->{name} = shift;
    $self;
  }

A constructed object compatible with this definition has to be a hashref with at least a key of name holding the name of the object. We can construct such a hashref like so:

  my $dog = { name => 'Spot' };
  bless $dog, 'Widget';

The bless operation puts a little post-it note on the hash data structure (not the reference!) that says ``I belong to Widget''. Now, we can invoke the methods like so:

  $dog->display; # prints "Spot\n"
  $dog->rename("Fido");
  $dog->display; # prints "Fido\n"

How does this work? To execute the rename call, for example, Perl constructs an argument list consisting of the object variable ($dog) plus any arguments given to the method, resulting in:

  ($dog, "Fido")

Next, Perl looks for a subroutine in the package given by the post-it note (the object's class) named the same as the method. The subroutine Widget::rename gets invoked, and the first argument ends up in $self. The second argument gets assigned as an element of the hash, and finally the subroutine returns $self (not a requirement, but handy for other operations).
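
As a quick check of that description, here is a minimal sketch that invokes rename both ways: with the arrow, and as the plain subroutine call that the arrow resolves to for this particular object. The name Rex is just sample data of mine, and calling the subroutine directly is for illustration only, since it skips the method lookup (and therefore inheritance):

```perl
use strict;
use warnings;

package Widget;

sub rename {
  my $self = shift;       # the object arrives as the first argument
  $self->{name} = shift;  # the method argument follows it
  $self;
}

package main;

my $dog = bless { name => 'Spot' }, 'Widget';

# ref reports the class recorded by bless (the post-it note)
my $class = ref $dog;

# Method-call syntax: Perl builds the list ($dog, "Fido") for us
$dog->rename("Fido");
my $after_method_call = $dog->{name};

# The equivalent plain subroutine call, with the list spelled out
Widget::rename($dog, "Rex");
my $after_sub_call = $dog->{name};

print "$class: $after_method_call, then $after_sub_call\n";
# prints "Widget: Fido, then Rex\n"
```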

Normally, we wouldn't hand-construct the object. The lines to create the hash and bless the object will be found in a constructor within the class. We'll invoke the constructor as:

  my $dog = Widget->named("Spot");

To execute this class method invocation, Perl again constructs an argument list, but this time puts the name of the package as the first element:

  ("Widget", "Spot")

And upon finding the Widget::named subroutine, invokes it:

  package Widget;

  sub named {
    my $class = shift; # gets Widget (usually)
    my $self = { name => shift };
    bless $self, $class;
    $self;
  }

Comparing this code to the code above, we see that we'll be returning $self, which is just like the $dog from before. (One common optimization is to know that bless also returns $self in this case, so we can leave that last line out with no change in result.)
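
A sketch of that shortened form, assuming nothing beyond the named constructor shown above; the hash is built and blessed in one step, and the return value of bless becomes the return value of the constructor:

```perl
use strict;
use warnings;

package Widget;

sub named {
  my $class = shift;
  # bless returns the reference it was given, now blessed,
  # and that is the last value evaluated in the subroutine
  bless { name => shift }, $class;
}

package main;

my $dog = Widget->named("Spot");
print ref($dog), " named $dog->{name}\n";
# prints "Widget named Spot\n"
```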

For a more detailed explanation of this process, please see my most recent tutorial book, Learning Perl Objects, References, and Modules, from O'Reilly and Associates.

Now, why didn't we just hardcode the Widget value into the bless, and what's up with that ``usually'' in the comment? The complication arises when we get to inheritance. Suppose we have a subclass called ColoredWidget that inherits from Widget and adds two methods to manage the color of the widget:

  package ColoredWidget;
  use base qw(Widget); # sets @ISA

  sub color {
    my $self = shift;
    $self->{color};
  }

  sub recolor {
    my $self = shift;
    $self->{color} = shift;
    $self;
  }

Calling color or recolor on a ColoredWidget uses the subroutines found in the ColoredWidget package, but calling named on ColoredWidget uses the @ISA to find the named routine from the base class, Widget. In this case, the argument list will look like:

  ("ColoredWidget", "Spot")

Because the first argument to named is shifted off into $class, and then used in the bless, we get an object of class ColoredWidget instead of Widget.
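
Here is a small sketch to confirm that behavior. I set @ISA directly rather than with use base only so that both classes can sit in one file for the demonstration, and the name Rover is sample data of mine:

```perl
use strict;
use warnings;

package Widget;

sub named {
  my $class = shift;           # whatever class the method was called on
  my $self = { name => shift };
  bless $self, $class;
}

package ColoredWidget;
our @ISA = ('Widget');         # inherit named from Widget

package main;

my $plain = Widget->named("Spot");
my $fancy = ColoredWidget->named("Rover");  # found via @ISA

print ref($plain), " and ", ref($fancy), "\n";
# prints "Widget and ColoredWidget\n"
```

Had named hardcoded 'Widget' in the bless, $fancy would have come back as a plain Widget, and the color methods would not be found on it.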

Our display method for ColoredWidget needs a bit more work now though, if we want the color as well. We can use overriding to handle that:

  package ColoredWidget;

  sub display {
    my $self = shift;
    print $self->{name}, ", colored ", $self->color, "\n";
  }

Now, for ColoredWidget objects, this version of display is used in preference to the previous version. We can also extend rather than override by reusing the base class version of display:

  package ColoredWidget;

  sub display {
    my $self = shift;
    $self->SUPER::display;
    print "[color: ", $self->color, "]\n";
  }

Now when we invoke display on a ColoredWidget, we first invoke the first display found in the base class (as if there were no definition in this class). That invocation produces the name by itself. Then control returns to this method, and we add the color in brackets below.

The constructor here is named named because it reads like what it does: give me a Widget named Spot. But for tradition's sake, I could also call the constructor new. In fact, I might make a constructor new that returns an unnamed Widget (the name left as undef if referenced). This'd look like:

  package Widget;

  sub new {
    my $class = shift;
    bless {}, $class;
  }

Here, a simple reference to an empty hash is generated, blessed into the right class, and returned. To make Spot, I can now say:

  my $dog = Widget->new;
  $dog->rename("Spot");

That's a little more clumsy, but at least it gets the job done.

Another advantage to always naming your constructor as new is that you can easily create an object that is ``like'' another object. For example, if we have an unknown object $object, we can call ref $object to get its class, then create another object of the same class by calling new:

  my $similar = (ref $object)->new;

But this works only if all of our possible classes of $object understand the same new method. Fortunately, for the times we're likely to do this, we've also made the classes work this way.
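
A minimal sketch of that idiom, assuming both classes share the simple new from above (again with @ISA set directly so that everything fits in one file):

```perl
use strict;
use warnings;

package Widget;

sub new {
  my $class = shift;
  bless {}, $class;
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

package ColoredWidget;
our @ISA = ('Widget');

package main;

my $object = ColoredWidget->new;
$object->rename("Spot");

# Same class as $object, but a fresh and empty object
my $similar = (ref $object)->new;

print ref($similar), "\n";  # prints "ColoredWidget\n"
print defined $similar->{name} ? "named" : "unnamed", "\n";  # prints "unnamed\n"
```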

Another common operation is cloning: making an object that is a copy of the current object. It's not enough to simply copy the reference:

  my $puppy = $dog;

This action copies the reference to the data, but not the data itself. So if I rename the $puppy, the $dog changes its name as well! Cloning is best handled by copying all of the data. A naive clone could be executed as:

  package Widget;

  sub clone {
    my $self = shift;
    my $clone = { %$self }; # copy keys/values one level deep
    bless $clone, ref $self; # copy the object class, returning $clone
  }

This will work properly for objects that do not have a deep structure, such as we've seen here so far. But what if one of the object attributes is a reference to yet another data structure? Again, we're copying the reference, and not the data, so the data will be shared amongst the clones. [See this column from February 2000 for more details on deep copying.]
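
The three cases can be compared side by side in a short sketch. The tags attribute is hypothetical, invented here to give the object some nested data, and the use of dclone from the core Storable module is my choice for the illustration, not necessarily the approach taken in the earlier column:

```perl
use strict;
use warnings;
use Storable qw(dclone);  # core module, used here for the deep copy

package Widget;

sub clone {
  my $self = shift;
  my $clone = { %$self };   # copy keys/values one level deep
  bless $clone, ref $self;  # same class as the original
}

package main;

# "tags" is a hypothetical attribute holding a reference to an array
my $dog = bless { name => 'Spot', tags => ['friendly'] }, 'Widget';

# Copying the reference: both variables see the same hash
my $puppy = $dog;
$puppy->{name} = 'Fido';
my $dog_name = $dog->{name};            # Fido as well

# Shallow clone: top-level values are separate, nested data is shared
my $shallow = $dog->clone;
$shallow->{name} = 'Rex';               # $dog keeps Fido
push @{ $shallow->{tags} }, 'noisy';    # but $dog gets this tag too
my $tags_after_shallow = @{ $dog->{tags} };

# Deep copy: nested data is duplicated, and the blessing is kept
my $deep = dclone($dog);
push @{ $deep->{tags} }, 'sleepy';      # $dog is unaffected
my $tags_after_deep = @{ $dog->{tags} };

print "$dog_name $tags_after_shallow $tags_after_deep ", ref($deep), "\n";
# prints "Fido 2 2 Widget\n"
```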

An alternative method of cloning is a more piecemeal approach: teach each class in the hierarchy to clone the attributes added by that class.

  package Widget;

  sub clone {
    my $self = shift;
    my $clone = (ref $self)->new; # empty object of same class
    $clone->rename($self->{name}); # copy name attribute
    $clone;
  }

  package ColoredWidget;

  sub clone {
    my $self = shift;
    my $clone = $self->SUPER::clone; # clone base class stuff
    $clone->recolor($self->color); # copy color attribute
    $clone;
  }

Note the similarity of design. If a class knows that it adds a complex attribute (a reference to a deeper data structure), then it can add special copying instructions for that attribute to the new clone. This is a good OO design, because the information contained within each base and derived class is maintained closely with the logic it depends on.
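
To illustrate the complex-attribute case, here is a sketch with a hypothetical TaggedWidget subclass (not part of the classes above) that adds a tags attribute holding an array reference. Its clone lets the base class copy what the base class knows about, then makes a fresh array for the attribute it added:

```perl
use strict;
use warnings;

package Widget;

sub new {
  my $class = shift;
  bless {}, $class;
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

sub clone {
  my $self = shift;
  my $clone = (ref $self)->new;    # empty object of same class
  $clone->rename($self->{name});   # copy name attribute
  $clone;
}

package TaggedWidget;              # hypothetical subclass
our @ISA = ('Widget');

sub clone {
  my $self = shift;
  my $clone = $self->SUPER::clone; # clone base class stuff
  # copy the array itself, not just the reference to it
  $clone->{tags} = [ @{ $self->{tags} || [] } ];
  $clone;
}

package main;

my $dog = TaggedWidget->new;
$dog->rename("Spot");
$dog->{tags} = ['friendly'];

my $puppy = $dog->clone;
push @{ $puppy->{tags} }, 'noisy'; # does not touch $dog's tags

print ref($puppy), " $puppy->{name}: @{ $puppy->{tags} }\n";
# prints "TaggedWidget Spot: friendly noisy\n"
print "original: @{ $dog->{tags} }\n";
# prints "original: friendly\n"
```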

And now, before I run out of space, let me touch on a hot-button for me. The perltoot manpage contains an archetypal new routine that looks like:

  sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $self  = {};
    ...
  }

The purpose of these few lines of extra code is to permit:

  my $other = $dog->new;

to act like

  my $other = (ref $dog)->new;

But here's the problem. When I survey experienced object-oriented programmers, and ask them what they expect new means when called on an instance (without looking at the implementation), the result usually divides rather equally into three camps: those that go ``huh, why would you do that'' and think it should throw an error, those that say that it would clone the object, and those that say it would copy the object's class but not the contents.

So, no matter what you intend if you make your new do one of those three things, two thirds of the people who look at it will be wrong. It's not intuitive. So, don't write code like that, and especially don't just cargo-cult that from the manpage into your code. If you want an object like another object, use ref explicitly, as shown above. If you want a clone, put cloning code into your package, and call clone, as we saw earlier.
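
The two explicit spellings look like this in a minimal sketch, reusing the simple new and the naive clone from earlier. As a side note, with that simple new, calling new on an instance fails outright, because bless will not accept a reference as a class name:

```perl
use strict;
use warnings;

package Widget;

sub new {
  my $class = shift;
  bless {}, $class;
}

sub rename {
  my $self = shift;
  $self->{name} = shift;
  $self;
}

sub clone {
  my $self = shift;
  bless { %$self }, ref $self;  # naive one-level copy
}

package main;

my $dog = Widget->new;
$dog->rename("Spot");

my $blank = (ref $dog)->new;  # same class, no contents
my $twin  = $dog->clone;      # same class, same contents

# The plain constructor does not quietly guess what $dog->new means;
# the bless inside it dies, so the eval yields a false value
my $worked = eval { $dog->new; 1 } ? 1 : 0;

print ref($blank), " ", ref($twin), " $twin->{name} $worked\n";
# prints "Widget Widget Spot 0\n"
```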

Hopefully, you've learned at least one or two things about objects that you might not have considered before. Until next time, enjoy!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
