Copyright Notice

This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Linux Magazine Column 11 (Apr 2000)

[suggested title: Introductions to Objects]

In the past three columns, I looked at using ``references'' in Perl. References are an important part of capturing and reflecting the structure of real-world data -- a table of employees, each of which has various attributes, can be represented as an array of hashrefs, pointing at attribute hashes for each employee.

Now, let's turn to capturing and reflecting the real-world processes, in the form of ``objects''. Objects provide encapsulation (to control access to data), abstract data types (to let the data more closely model the real world), and inheritance (to reuse operations that are similar but have some variation).

The Perl distribution includes perlobj, a basic reference in using objects, and perltoot, which introduces readers to the peculiarities of Perl's object system in a tutorial way. However, I found that both of these documentation sections tend to be opaque to those of us with less experience with objects. And that seems to be the majority of users coming from the system administration or CGI web development background (Perl's core audience).

So I created some courseware for Stonehenge's Perl training classes that took a different approach to objects, presuming no prior exposure to objects. It goes something like this...

If we could talk to the animals...

Let's let the animals talk for a moment:

    sub Cow::speak {
      print "a Cow goes moooo!\n";
    }
    sub Horse::speak {
      print "a Horse goes neigh!\n";
    }
    sub Sheep::speak {
      print "a Sheep goes baaaah!\n";
    }

    Cow::speak;
    Horse::speak;
    Sheep::speak;

This results in:

    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!

Nothing spectacular here. Simple subroutines, albeit from separate packages, and called using the full package name. So let's create an entire pasture:

    # Cow::speak, Horse::speak, Sheep::speak as before
    @pasture = qw(Cow Cow Horse Sheep Sheep);
    foreach $animal (@pasture) {
      &{$animal."::speak"};
    }

This results in:

    a Cow goes moooo!
    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!
    a Sheep goes baaaah!

Wow. That symbolic coderef de-referencing there is pretty nasty. We're counting on no strict refs mode, certainly not recommended for larger programs. And why was that necessary? Because the name of the package seems to be inseparable from the name of the subroutine we want to invoke within that package.

Or is it?

Introducing the method invocation arrow

For now, let's say that Class->method invokes subroutine method in package Class. That's not completely accurate, but we'll do this one step at a time. Now let's use it like so:

    # Cow::speak, Horse::speak, Sheep::speak as before
    Cow->speak;
    Horse->speak;
    Sheep->speak;

And once again, this results in:

    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!

That's not fun yet. Same number of characters, all constant, no variables. And yet, the parts are separable now. Watch:

    $a = "Cow";
    $a->speak; # invokes Cow->speak

Ahh! Now that the package name has been parted from the subroutine name, we can use a variable package name. And this time, we've got something that works even when use strict refs is enabled.
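To see the difference under strict for yourself, here's a minimal sketch. It uses a lexical $animal rather than $a, and wraps the symbolic call in an eval so the script can report the failure instead of dying; both of those choices are mine, for illustration only.

```perl
use strict;
use warnings;

sub Cow::speak { print "a Cow goes moooo!\n" }

my $animal = "Cow";

# The method arrow is fine under strict: no symbolic reference is involved.
$animal->speak;

# The symbolic coderef form is refused at run time by "strict refs".
my $ok    = eval { &{ $animal . "::speak" }(); 1 };
my $error = $ok ? "" : $@;
print "symbolic call refused: $error" unless $ok;
```

The second call never reaches Cow::speak; Perl complains about using a string as a subroutine ref while strict refs is in use.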

Invoking a barnyard

Let's take that new arrow invocation and put it back in the barnyard example:

    sub Cow::speak {
      print "a Cow goes moooo!\n";
    }
    sub Horse::speak {
      print "a Horse goes neigh!\n";
    }
    sub Sheep::speak {
      print "a Sheep goes baaaah!\n";
    }

    @pasture = qw(Cow Cow Horse Sheep Sheep);
    foreach $animal (@pasture) {
      $animal->speak;
    }

There! Now we have the animals all talking, and safely at that, without the use of symbolic coderefs.

But look at all that common code. Each of the speak routines has a similar structure: a print operator and a string that contains common text, except for two of the words. It'd be nice if we could factor out the commonality, in case we decide later to change it all to says instead of goes.

And we actually have a way of doing that without much fuss, but we have to hear a bit more about what the method invocation arrow is actually doing for us.

The extra parameter of method invocation

The invocation of:

    Class->method(@args)

attempts to invoke subroutine Class::method as:

    Class::method("Class", @args);

(If the subroutine can't be found, ``inheritance'' kicks in, but we'll get to that later.) This means that we get the class name as the first parameter. So we can rewrite the Sheep speaking subroutine as:

    sub Sheep::speak {
      my $class = shift;
      print "a $class goes baaaah!\n";
    }

And the other two animals come out similarly:

    sub Cow::speak {
      my $class = shift;
      print "a $class goes moooo!\n";
    }
    sub Horse::speak {
      my $class = shift;
      print "a $class goes neigh!\n";
    }

In each case, $class will get the value appropriate for that subroutine. But once again, we have a lot of similar structure. Can we factor that out even further? Yes, by calling another method in the same class.
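If the extra parameter still feels like magic, it may help to look at the argument list directly. This is a small sketch, with a hypothetical show_args subroutine that isn't part of the barnyard, comparing the same subroutine called with and without the arrow:

```perl
use strict;
use warnings;

# Hypothetical helper: report exactly what arrives in @_.
sub Sheep::show_args {
    return join ", ", map { "'$_'" } @_;
}

# Called with the method arrow: the class name is prepended.
my $via_arrow = Sheep->show_args("loudly", 3);

# Called as a plain subroutine: only what we passed arrives.
my $via_plain = Sheep::show_args("loudly", 3);

print "arrow: $via_arrow\n";
print "plain: $via_plain\n";
```

The arrow form reports 'Sheep', 'loudly', '3' while the plain form reports only 'loudly', '3'.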

Calling a second method to simplify things

Let's call out from speak to a helper method called sound. This method provides the constant text for the sound itself.

    { package Cow;
      sub sound { "moooo" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Now, when we call Cow->speak, we get a $class of Cow in speak. This in turn selects the Cow->sound method, which returns moooo. But how different would this be for the Horse?

    { package Horse;
      sub sound { "neigh" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Only the name of the package and the specific sound change. So can we somehow share the definition for speak between the Cow and the Horse? Yes, with inheritance!

Inheriting the windpipes

We'll define a common subroutine package called Animal, with the definition for speak:

    { package Animal;
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Then, for each animal, we say it ``inherits'' from Animal, along with the animal-specific sound:

    { package Cow;
      @ISA = qw(Animal);
      sub sound { "moooo" }
    }

Note the added @ISA array. We'll get to that in a minute.

But what happens when we invoke Cow->speak now?

First, Perl constructs the argument list. In this case, it's just Cow. Then Perl looks for Cow::speak. But that's not there, so Perl checks for the inheritance array @Cow::ISA. It's there, and contains the single name Animal.

Perl next checks for speak inside Animal instead, as in Animal::speak. And that's found, so Perl invokes that subroutine with the already frozen argument list.

Inside the Animal::speak subroutine, $class becomes Cow (the first argument). So when we get to the step of invoking $class->sound, it'll be looking for Cow->sound, which gets it on the first try without looking at @ISA. Success!
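One way to watch that lookup happen is the can method, which every class gets for free and which returns a reference to the subroutine a method call would end up in. A minimal sketch follows; the our declaration is only there to keep use strict quiet about @ISA, and is my addition.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}
{ package Cow;
  our @ISA = qw(Animal);   # "our" is one way to declare @ISA under strict
  sub sound { "moooo" }
}

# can() returns a reference to the subroutine that a method call would reach.
my $speak = Cow->can('speak');
my $sound = Cow->can('sound');
print "speak comes from Animal\n" if $speak == \&Animal::speak;
print "sound comes from Cow\n"    if $sound == \&Cow::sound;

Cow->speak;
```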

A few notes about @ISA

This magical @ISA variable (pronounced ``is a'' not ``ice-uh'') has declared that Cow ``is a'' Animal. Note that it's an array, not a simple single value, because on rare occasions, it makes sense to have more than one parent class searched for the missing methods.

If Animal also had an @ISA, then we'd check there too. The search is recursive, depth-first, left-to-right in each @ISA.
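Here's a small sketch of what ``depth-first, left-to-right'' means in practice. The Calf and Pet classes and the name method are hypothetical, invented just for this illustration, and our is used only to satisfy use strict.

```perl
use strict;
use warnings;

# Hypothetical hierarchy, for illustration only:
#   Calf inherits from (Cow, Pet); Cow inherits from (Animal)
{ package Animal; sub name { "Animal::name" } }
{ package Pet;    sub name { "Pet::name" } }
{ package Cow;    our @ISA = qw(Animal); }
{ package Calf;   our @ISA = qw(Cow Pet); }

# Search order is Calf, Cow, Animal, and only then Pet.
my $found = Calf->name;
print "Calf->name resolved to $found\n";
```

Even though Pet is a direct parent of Calf, the search goes all the way up through Cow to Animal first, so Animal::name wins.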

When we turn on use strict, we'll get complaints about @ISA, since it's not a variable containing an explicit package name, nor is it a lexical (``my'') variable. We can't make it a lexical variable though, so there are a couple of straightforward ways to handle that.

The easiest is to just spell the package name out:

    @Cow::ISA = qw(Animal);

Or allow it as an implicitly named package variable:

    package Cow;
    use vars qw(@ISA);
    @ISA = qw(Animal);

If you're bringing in the class from outside, via an object-oriented module, you change:

    package Cow;
    use Animal;
    use vars qw(@ISA);
    @ISA = qw(Animal);

into just:

    package Cow;
    use base qw(Animal);

And that's pretty darn compact.
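Putting the three spellings side by side under use strict, as a sketch: here Animal lives in the same file rather than in its own module, which use base tolerates as long as the Animal package has already been compiled when the use base line is reached. That single-file arrangement is my assumption for the sake of a self-contained example.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}

# 1. Spell the package name out.
@Cow::ISA = qw(Animal);
sub Cow::sound { "moooo" }

# 2. Declare @ISA as a package variable with "use vars".
{ package Horse;
  use vars qw(@ISA);
  @ISA = qw(Animal);
  sub sound { "neigh" }
}

# 3. Let "use base" fill in @ISA for us.
{ package Sheep;
  use base qw(Animal);
  sub sound { "baaaah" }
}

$_->speak for qw(Cow Horse Sheep);
```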

Overriding the methods

Let's add a mouse, which can barely be heard:

    # Animal package from before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n";
        print "[but you can barely hear it!]\n";
      }
    }

    Mouse->speak;

which results in:

    a Mouse goes squeak!
    [but you can barely hear it!]

Here, Mouse has its own speaking routine, so Mouse->speak doesn't immediately invoke Animal->speak. This is known as ``overriding''. In fact, we didn't even need to say that a Mouse was an Animal at all, since all of the methods needed for speak are completely defined within Mouse.

But we've now duplicated some of the code from Animal->speak, and this can once again be a maintenance headache. So, can we avoid that? Can we say somehow that a Mouse does everything any other Animal does, but add in the extra comment? Sure!

First, we can invoke the Animal::speak method directly:

    # Animal package from before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        Animal::speak($class);
        print "[but you can barely hear it!]\n";
      }
    }

Note that we have to include the $class parameter (almost surely the value of "Mouse") as the first parameter to Animal::speak, since we've stopped using the method arrow. Why did we stop? Well, if we invoke Animal->speak there, the first parameter to the method will be "Animal" not "Mouse", and when time comes for it to call for the sound, it won't have the right class to come back to this package.

Invoking Animal::speak directly is a mess, however. What if Animal::speak didn't exist before, and was being inherited from a class mentioned in @Animal::ISA? Because we are no longer using the method arrow, we get one and only one chance to hit the right subroutine.

Also note that the Animal classname is now hardwired into the subroutine selection. This is a mess if someone maintaining the code changes @ISA for Mouse and doesn't notice Animal there in speak. So, this is probably not the right way to go.
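To make the ``one and only one chance'' problem concrete, here's a sketch of that what-if. The LivingCreature class is hypothetical: I've moved speak up into it so that Animal merely inherits the method, and wrapped the call in an eval so we can look at the failure.

```perl
use strict;
use warnings;

# Hypothetical rearrangement: speak now lives in LivingCreature,
# and Animal merely inherits it.
{ package LivingCreature;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}
{ package Animal; our @ISA = qw(LivingCreature); }
{ package Mouse;
  our @ISA = qw(Animal);
  sub sound { "squeak" }
  sub speak {
    my $class = shift;
    Animal::speak($class);   # plain subroutine call: @ISA is never consulted
    print "[but you can barely hear it!]\n";
  }
}

my $ok    = eval { Mouse->speak; 1 };
my $error = $ok ? "" : $@;
print "direct call failed: $error" unless $ok;
```

Perl reports an undefined subroutine &Animal::speak, even though Animal->speak would have worked just fine.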

Starting the search from a different place

A better solution is to tell Perl to search from a higher place in the inheritance chain:

    # same Animal as before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        $class->Animal::speak;
        print "[but you can barely hear it!]\n";
      }
    }

Ahh. This works. Using this syntax, we start with Animal to find speak, and use all of Animal's inheritance chain if not found immediately. And yet the first parameter will be $class, so the found speak method will get Mouse as its first entry, and eventually work its way back to Mouse::sound for the details.

But this isn't the best solution. We still have to keep the @ISA and the initial search package coordinated. Worse, if Mouse had multiple entries in @ISA, we wouldn't necessarily know which one had actually defined speak. So, is there an even better way?
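As a quick check that the search really does continue up from Animal, here's a sketch in which speak has been moved into a hypothetical LivingCreature class that Animal inherits from. The class name and the rearrangement are mine, for illustration only.

```perl
use strict;
use warnings;

# Hypothetical rearrangement: Animal has no speak of its own,
# but inherits one from LivingCreature.
{ package LivingCreature;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}
{ package Animal; our @ISA = qw(LivingCreature); }
{ package Mouse;
  our @ISA = qw(Animal);
  sub sound { "squeak" }
  sub speak {
    my $class = shift;
    $class->Animal::speak;   # method lookup that starts at Animal
    print "[but you can barely hear it!]\n";
  }
}

Mouse->speak;
```

The lookup starts at Animal, climbs to LivingCreature::speak, and still hands it Mouse as the class, so the squeak comes out as expected.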

The SUPER way of doing things

By changing the Animal class to the SUPER class in that invocation, we get a search of all of our super classes automatically:

    # same Animal as before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        $class->SUPER::speak;
        print "[but you can barely hear it!]\n";
      }
    }

So, SUPER::speak means look in the current package's @ISA for speak, invoking the first one found.
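Here's a sketch of SUPER coping with more than one parent. The Nocturnal class and its active_hours method are hypothetical; the point is only that Nocturnal is listed first in @ISA but doesn't define speak, so the search moves on to Animal without Mouse having to name it.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    print "a $class goes ", $class->sound, "!\n";
  }
}
# Hypothetical second parent that knows nothing about speaking.
{ package Nocturnal;
  sub active_hours { "night" }
}
{ package Mouse;
  our @ISA = qw(Nocturnal Animal);
  sub sound { "squeak" }
  sub speak {
    my $class = shift;
    $class->SUPER::speak;    # searches @Mouse::ISA: Nocturnal, then Animal
    print "[but you can barely hear it!]\n";
  }
}

Mouse->speak;
```

Note that SUPER is relative to the package in which the calling subroutine was compiled (Mouse here), not to whatever happens to be in $class.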

In summary

So far, I've introduced a method arrow syntax:

  Class->method(@args);

or the equivalent:

  $a = "Class";
  $a->method(@args);

which constructs an argument list of:

  ("Class", @args)

and attempts to invoke

  Class::method("Class", @args);

However, if Class::method is not found, then @Class::ISA is examined (recursively) to locate a package that does indeed contain method, and that subroutine is invoked instead.

Using this simple syntax, we have class methods, (multiple) inheritance, overriding, and extending. Using just what we've seen so far, we've been able to factor out common code, and provide a nice way to reuse implementations with variations. This is at the core of what objects provide, but objects also provide instance data, which we haven't even begun to cover.

But I've run out of space for this time, so see Part Two next month. Until then, enjoy.

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
