Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Unix Review Column 54 (Sep 2004)

[suggested title: ``Strictly speaking about use strict'']

In many of my writings about Perl, I give the strong admonition to place use strict at the beginning of the program. I've often explained the line with a few short phrases, but I thought it would be interesting to focus on this one construct in detail for a change.

The use strict line is a pragma. The purpose of a pragma is to regionally or globally alter the way the language is translated for execution. For the strict pragma, we get three subfeatures enabled or disabled within a particular program scope. The scope extends to the end of the curly-brace-delimited block in which the pragma appears, or to the end of the file if otherwise outside all blocks. Inner pragma controls override outer controls, so we can get as specific as we need to process a particular chunk of code.

The use strict pragma has three aspects: vars, subs, and refs. Each aspect may be enabled or disabled individually by explicit name, but most often, all three are enabled at once with a simple use strict. For example, we can enable all three aspects initially, and disable just the vars aspect for a portion of the code, like so:

  use strict; # all enabled
  ...
  sub marine {
    no strict 'vars'; # disable vars
    ...
  }
  # all enabled again
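
As a minimal runnable sketch of that scoping (the variable $depth and the use of a string eval to capture the compile-time error are illustrative assumptions, not part of the original example), the following shows the vars aspect relaxed inside the subroutine and back in force at file scope:

```perl
use strict;
use warnings;

sub marine {
    no strict 'vars';      # vars checking is off until the closing brace
    $depth = 20;           # accepted as the package variable $main::depth
    return $depth;
}

print "marine returned ", marine(), "\n";

# Back at file scope all three aspects apply again. A string eval is
# compiled under the pragmas in force where it appears, so it lets us
# capture the compile-time error instead of aborting the whole script.
my $compiled = eval q{ $depth = 30; 1 };
my $error = $compiled ? '' : $@;
print "at file scope: $error";
```

The same assignment that compiles inside marine is rejected one scope out, which is the point of keeping the no strict region as small as possible.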

The vars aspect is probably the most useful of the three aspects, and is the one most likely to give trouble to a beginner. Scalar, array, and hash variables are mapped into package and lexical variables using one of five methods. The vars aspect disables one of these methods, leaving the remaining four enabled.

For example, the variable $bammbamm might be referring to a lexical variable named $bammbamm, introduced earlier in the same scope through the use of the my declaration, as in:

  my $bammbamm = 5;
  ...
  print $bammbamm; # lexical $bammbamm in scope

Or, it might be a package variable declared earlier by use vars in the same package, such as:

  package This::One;
  use vars qw($bammbamm);
  ...
  print $bammbamm; # same as $This::One::bammbamm
  ...
  package That::One;
  # $bammbamm no longer legal here

The variable name might also be declared through the our declaration, which associates a simple name with a package variable in the current package for the remainder of the scope. For example:

  package This::One;
  sub nominal {
    our $bammbamm; # $bammbamm is $This::One::bammbamm
    ...
    package That::One;
    print $bammbamm; # still prints $This::One::bammbamm
  }
  # $bammbamm is no longer permitted

Or, if the name contains a package delimiter (double colon), it's an explicit use of a package variable.

  package This::One;

  print $This::One::bammbamm; # always permitted

Finally, the variable $bammbamm may be just a package variable in the current package, if no prior declaration exists.

  package This::One;
  print $bammbamm; # $This::One::bammbamm;
  package That::One;
  print $bammbamm; # $That::One::bammbamm;

It is this particular method that is disabled by use strict, because it can lead to the most errors in larger programs. By default, any mention of any simple scalar, array, or hash name is simply accepted as a package variable in that package, even if the name is a typo!

By enabling use strict 'vars', the troublesome automatic acceptance of any variable name is prevented, forcing you to declare your variables through one of the other methods. This isn't all that important on a five-line program, but I have rarely seen any program stay at only five lines unless it was a one-off task.
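
To make the typo case concrete, here is a small sketch that compiles the same text twice, once without and once with the vars aspect. The snippet, the names $total and $totl, and the string eval wrapper are all made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical snippet containing a typo: $totl instead of $total.
my $snippet = q{
    my $total = 5;
    $totl = $total + 10;   # typo: meant $total
    $total;
};

# Without strict vars, the typo quietly creates $main::totl and
# $total never changes.
my $loose = eval "no strict 'vars'; no warnings; $snippet";
print "without strict vars: total is $loose\n";   # total is 5

# With strict vars, the same text does not compile at all.
my $strict = eval "use strict 'vars'; $snippet";
my $error = defined $strict ? '' : $@;
print "with strict vars: $error";
```

In the first run the program carries on with the wrong answer and says nothing; in the second, Perl names the offending variable before any of the code executes.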

The subs aspect of use strict disables the interpretation of ``barewords'' as text strings. By default, a Perl identifier (a sequence of letters, digits, and underscores, not starting with a digit unless it is completely numeric) that is not otherwise a built-in keyword or previously seen subroutine definition is treated as a quoted text string:

  @daynames = (sun, mon, tue, wed, thu, fri, sat);

However, this is considered to be a dangerous practice, because obscure bugs may result:

  @monthnames = (jan, feb, mar, apr, may, jun,
                 jul, aug, sep, oct, nov, dec);

Can you spot the bug? Yes, the 10th entry is not the string 'oct', but rather an invocation of the built-in oct() function, returning the numeric equivalent of the default $_ treated as an octal number. And if you wrote this program in April, you might not even notice it breaks for six months. I'm not saying that this has happened to anyone I know, because I believe I'm protected from self-incrimination.
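
The effect is easy to reproduce. In this sketch the value placed in $_ is an arbitrary choice for illustration, and strict subs and warnings are switched off only inside the do block so that the barewords compile:

```perl
use strict;
use warnings;

my @monthnames = do {
    no strict 'subs';      # allow barewords, as in the example above
    no warnings;           # silence the "Unquoted string" warnings
    local $_ = '17';       # arbitrary value for oct() to pick up
    (jan, feb, mar, apr, may, jun,
     jul, aug, sep, oct, nov, dec);
};

# '17' read as octal is 15, so the tenth entry is a number, not 'oct'.
print "entry 10 is $monthnames[9]\n";   # entry 10 is 15
```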

Although the problem arises mostly from collisions with built-in words, simply watching for built-ins is insufficient. Suppose we added a sun function earlier in the same scope:

  sub sun { ... }

Now our first dayname is also messed up, being a call to the subroutine instead of the three-character string. But it's not sufficient to simply scan in the source text for a same-named subroutine. The name can also be imported from other code by one of the earlier use directives!

So, the proper method out of this madness is to avoid the use of barewords in most circumstances. This list of day names can be created easily with qw() instead:

  my @daynames = qw(sun mon tue wed thu fri sat);

And now there's no possibility of conflict, because we're using a quoted string instead of a bareword. The nifty part is that use strict 'subs' (included as part of use strict) takes care of enforcing this automatically. Once enabled, barewords will be flagged while the program is being parsed, before execution even begins.
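
A short sketch of that enforcement, using a string eval (an illustrative device, not something the column itself uses) so the compile-time error can be captured and printed rather than stopping the script:

```perl
use strict;
use warnings;

# The bareword list is rejected while the eval'd text is being compiled.
my $compiled = eval q{ my @daynames = (sun, mon, tue); 1 };
my $error = $compiled ? '' : $@;
print "rejected: $error";

# The qw() form is an ordinary quoted list and compiles cleanly.
my @daynames = qw(sun mon tue wed thu fri sat);
print "accepted: @daynames\n";
```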

Note that barewords are still permitted in a few specific locations. For example, the key to a hash can always be specified as a bareword:

  my $age = $data{age}; # same as $data{"age"}

The left side of a ``fat arrow'' is also automatically quoted if it resembles a bareword:

  my %data = (age => 19); # same as ("age", 19)

These two automatic quotings make working with hashes with program-significant keys easier, presuming the keys you choose are all barewords.
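
Both quoting rules apply even when the word happens to name a built-in function, so the troublesome oct is safe in these two positions. A small sketch, with the hash %days_in invented for illustration:

```perl
use strict;
use warnings;

# The word to the left of => is quoted, even though oct is a built-in.
my %days_in = (sep => 30, oct => 31, nov => 30);

# A simple identifier inside the hash subscript is quoted as well.
my $october = $days_in{oct};

print join(',', sort keys %days_in), "\n";   # nov,oct,sep
print "$october\n";                          # 31
```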

Finally, a bareword that names a predeclared subroutine is treated as a call to that subroutine, even if the definition of the subroutine has not yet been seen:

  sub deeper; # declaration
  ...
  my $result = deeper;

I don't recommend this practice, since it is just as easy (and clearer) to follow the subroutine call with empty parens:

  my $result = deeper(); # no declaration needed

The final aspect of the use strict pragma is the disabling of soft references (or symbolic references). A normal reference (sometimes called a hard reference to distinguish it from a soft reference) comes from an explicit referencing operation:

  my $ref = \@foo; # now $ref is a reference to @foo

or from one of the anonymous reference constructors:

  my $ref2 = [3, 4, 5]; # array reference created

An autovivification will also create a hard reference:

  my $ref3; # variable is undef initially
  $ref3->[5] = 10; # $ref3 is now an array reference

Following this reference using a dereferencing operation gets us back to the original data:

  print $ref2->[2]; # prints 5, from the anon array

However, the dereferencing operation can also be performed against a simple scalar string:

  my $sref = "happy";
  $sref->[3] = "hello"; # symbolic reference

This dereferencing is performed at execution time. Perl looks up the value to be dereferenced, notes that it is not a hard reference, and then examines the package variable symbol table for a same-named variable. Because package variables spring into existence as needed, nearly any name in $sref will be considered legal, causing new variables to be created dynamically.

As if that wasn't already scary enough, the variable name does not need to be a standard Perl identifier. Any string will do:

  my $sref = "A [variable] {name} !normally! *illegal*";
  $$sref = 12;

We now have a scalar package variable in the current package with a very crazy name.

Because of the likelihood of an accidental symbolic dereference operation, the use strict 'refs' aspect is recommended for every program that uses references.
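
Unlike the other two aspects, this check fires at execution time, so it can be trapped with a block eval. The sketch below repeats the "happy" example under strict refs; the hash %list_for that follows is one common substitute for name-based lookup, offered here as an assumption rather than as the column's recommendation:

```perl
use strict;
use warnings;

my $sref = "happy";

# Under strict refs, dereferencing a plain string dies at run time.
my $ok = eval { $sref->[3] = "hello"; 1 };
my $error = $ok ? '' : $@;
print "strict refs: $error";

# A hash keyed by the name keeps the data out of the symbol table.
# (%list_for is a hypothetical name chosen for this sketch.)
my %list_for;
push @{ $list_for{$sref} }, "hello";   # autovivifies a hard reference
print "stored: $list_for{$sref}[0]\n";
```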

If all three of these restrictions are good, why are they not enabled by default? The answer is ``backward compatibility''. Perl version 4 (last updated over a decade ago) permitted casual variable naming (and didn't have any option for lexically declared variables), didn't have the convenient qw() for defining lists of short values, and used soft references for indirect subroutine invocation. Thus, adding use strict by default would have broken nearly every Perl version 4 program!

But Perl4 is now long dead. Be sure to use strict in your modern Perl5 programs, and you'll get a guaranteed reduction in development time, or double your money back! Until next time, enjoy!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
