Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Unix Review Column 32 (Jun 2000)

[suggested title: Getting it to look the way you want]

For the most part, Perl programmers tend to use the trusty standby print operator for output, or drift occasionally into the realm of formats to quickly lay out a look for a customized report. However, the often-overlooked printf operator provides a good deal of custom control to get those output strings to look exactly the way you want.

The printf operator takes a format string, and zero or more values. The format string drives the whole process. With a few exceptions, each percent % field in the format string matches up with one of the additional values, defining how the value will appear in the output. For example:

  printf "my string %s has %d characters.\n", $str, length($str);

Here, the %s field calls for a string value, provided by $str. Similarly, the %d field calls for a decimal integer, provided by the length($str) computation. The parameters are evaluated in a list context, so we could have also used the following code to accomplish the same output:

  @output = ($str, length($str));
  printf "my string %s has %d characters.\n", @output;

This gets interesting if we don't know the length of @output, because we need a %-field for each element of @output, but since we set it up ourselves here, that's not a problem.

Besides %s for string, and %d for decimal integer, another common format is %f for floating point:

  printf "he has \$%f in his account\n", 3.50;

Here, the value 3.5 is printed as the floating point number 3.500000. (The backslash in front of the dollar sign keeps Perl from interpolating the special variable $% into the double-quoted string.) But why all the extra zeroes? Well, the default precision for floating point output is six places after the decimal point. To narrow that down, we can add a precision control between the % and the f in the format:

  printf "he has \$%.2f in his account\n", 3.50;

And that'll generate 3.50 as we expect. The 2 here means two digits after the decimal point, and that makes cents, er, uh, sense. The value is rounded to fit, so 3.509 would show up as 3.51, whereas 3.502 shows up as 3.50. As an extreme, we can use %.0f to round to the nearest whole number, and no decimal point will be used.
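
To see the rounding at work, here's a small sketch using sprintf, which takes the same format strings as printf but returns the result instead of printing it. The variable names are mine, purely for illustration:

```perl
use strict;
use warnings;

# sprintf formats exactly like printf, but hands back the string.
my $default = sprintf "%f",   3.50;    # six places by default
my $up      = sprintf "%.2f", 3.509;   # rounded up to fit
my $down    = sprintf "%.2f", 3.502;   # rounded down to fit
my $whole   = sprintf "%.0f", 3.7;     # no decimal point at all

print "$default $up $down $whole\n";   # 3.500000 3.51 3.50 4
```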

Another common format is scientific notation, %e. This is handy when the number would be normally too large to represent in a few digits:

  printf "2 to the 100 power is approximately %e\n", 2 ** 100;

And this shows up as 1.267651e+30, again defaulting to 6 digits after the decimal point unless we use a precision control like %.10e.
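
Here's a quick sketch of the precision control on %e, using sprintf (printf's string-returning sibling) so the results can be kept in variables. The variable names are just for illustration, and the outputs in the comments are what I'd expect from a typical Unix perl; some platforms may format the exponent a little differently:

```perl
use strict;
use warnings;

my $big = 2 ** 100;

my $six = sprintf "%e",    $big;   # default: six digits after the point
my $ten = sprintf "%.10e", $big;   # ten digits after the point
my $two = sprintf "%.2e",  $big;   # just two

print "$six\n$ten\n$two\n";
# Expected on a typical Unix perl:
#   1.267651e+30
#   1.2676506002e+30
#   1.27e+30
```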

But %e is used rarely (that I've seen). Generally, when a number of unknown magnitude or precision is displayed, most programmers fall back to the %g ``general number'' format. In this case, the number is formatted using whichever of %d, %f, or %e gives ``better'' results. If it's a nice integer, we get a nice integer format. If it's a reasonably-sized floating point number, that format is used, and otherwise, we fall back to scientific notation.

  printf "Your number is %g\n", $number;

The precision can again be used, but in this case specifies the maximum number of significant digits, defaulting to 6 again. So, for %.15g, we get the best possible display for the 15 most significant digits.
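
As a rough illustration of how %g picks its style, here's a sketch that runs a few sample values (my own picks, not from any particular program) through sprintf, which formats like printf but returns the string:

```perl
use strict;
use warnings;

# A nice integer, a modest float, a biggish number, and a tiny one.
my @samples = (42, 3.14159265, 1234567, 0.0000123);
my @shown   = map { sprintf "%g", $_ } @samples;

print "$_\n" for @shown;
# 42
# 3.14159
# 1.23457e+06
# 1.23e-05

# The precision is a count of significant digits, not decimal places.
my $wide   = sprintf "%.15g", 1234567;      # 1234567
my $narrow = sprintf "%.3g",  3.14159265;   # 3.14
print "$wide $narrow\n";
```

Notice that 1234567 needs seven significant digits, so the default of six pushes it over into scientific notation, while %.15g has room to show it whole.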

For strings, we get a similar ``precision'' control. If we include a precision on a string, and the string is longer, it's automatically truncated:

  printf "I said %.5s!\n", "hello world";

which prints I said hello!, truncating the string.
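
The precision is only a maximum, so a shorter string passes through untouched. A tiny sketch, again leaning on sprintf to capture what printf would have printed (the variable names are illustrative):

```perl
use strict;
use warnings;

my $long  = sprintf "%.5s", "hello world";   # cut down to five characters
my $short = sprintf "%.5s", "hi";            # already short enough

print "[$long] [$short]\n";   # [hello] [hi]
```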

Another feature of printf is the field width padding. After the value for a particular field is determined, a specified minimum width can be respected, indicated by a decimal integer after the percent:

  printf "=%10s=\n", "hello";

Here, the five-character string is not a full ten characters, so five spaces are added on the left. This is a minimum width, not a maximum, so if the string were longer, it'd still be included in its entirety. We can combine the precision field with the width field to get a string that is space-padded up to a size, or truncated if it exceeds the size. Take the sample code:

  printf "=%5.5s=\n", substr("1234567890", 0, $_) for 0..10;

which displays a nice pattern of:

  =     =
  =    1=
  =   12=
  =  123=
  = 1234=
  =12345=
  =12345=
  =12345=
  =12345=
  =12345=
  =12345=

The space padding can appear on the right instead of the left by using a negative number for the minimum width:

  printf "=%-5.5s=\n", substr("1234567890", 0, $_) for 0..10;
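
To compare the two kinds of padding side by side, here's a small sketch with sprintf (same formats as printf, but it returns the string). The sample strings are made up for the occasion:

```perl
use strict;
use warnings;

my $right = sprintf "=%5.5s=",  "abc";        # padding on the left
my $left  = sprintf "=%-5.5s=", "abc";        # padding on the right
my $cut   = sprintf "=%-5.5s=", "abcdefgh";   # too long, so truncated

print "$right\n$left\n$cut\n";
# =  abc=
# =abc  =
# =abcde=
```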

Additionally, numbers can be zero padded rather than space padded, using a leading 0 in front of the width:

  printf "%02d:%02d:%02d %s", $h, $m, $s, $ampm;

If the number for $m is less than 10 (like 7), we get a leading 0 in the output (as in 07), very handy for time displays like this one.
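
Here's a sketch of that time display in action, starting from a count of seconds since midnight. That starting point and the variable names are assumptions of mine for the sake of the example, and sprintf stands in for printf so the result lands in a variable:

```perl
use strict;
use warnings;

my $seconds = 9 * 3600 + 7 * 60 + 5;   # 9:07:05 in the morning

my $h = int($seconds / 3600);
my $m = int(($seconds % 3600) / 60);
my $s = $seconds % 60;

my $clock = sprintf "%02d:%02d:%02d", $h, $m, $s;
print "$clock\n";    # 09:07:05

# Zero padding works for floating point too; the width counts
# every character, including the decimal point.
my $padded = sprintf "%06.2f", 3.5;
print "$padded\n";   # 003.50
```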

A literal % can be obtained by doubling it up, as in:

  printf "He scored %.0f%% of the goals", 100 * $him / $total;

Note that the often-attempted backslashing of % won't do. This isn't a string-interpolation escaping problem: it's a printf-interpretation problem.
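
A short sketch may make the point clearer. The goal counts are invented, and sprintf is used in place of printf to capture the result:

```perl
use strict;
use warnings;

my ($him, $total) = (3, 4);   # made-up numbers

my $line = sprintf "He scored %.0f%% of the goals", 100 * $him / $total;
print "$line\n";   # He scored 75% of the goals

# The backslash is used up by the double-quoted string itself,
# so printf never even sees it:
my $same = ("\%" eq "%") ? "same" : "different";
print "$same\n";   # same
```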

One of the less-frequently used formats is the ``character'' format:

  printf "the letter A is %c\n", 65;

Here, the value of 65 is treated as an ASCII code, and turned into the uppercase A. It's not as frequently used in Perl as in C, because Perl deals with strings as first-class datatypes, rarely exposing the numeric values of the individual characters to the programmer.
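
For the curious, here's a small sketch of %c alongside Perl's built-in ord, which goes the other way. As before, sprintf returns what printf would have printed, and the variable names are only illustrative:

```perl
use strict;
use warnings;

my $letter = sprintf "%c", 65;                          # code to character
my $code   = ord "A";                                   # character to code
my $word   = sprintf "%c%c%c%c", 80, 101, 114, 108;     # one %c per code

print "$letter $code $word\n";   # A 65 Perl
```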

And then there are the ``programmer-type'' formats... %x for hexadecimal, %o for octal, and new in Perl 5.6, %b for binary. For example, here's one way to look at the permission bits of a file:

        printf "%s is mode %o\n", $_, 07777 & (stat)[2] for @ARGV;

But looking at the output, the values jump all around, because the filenames are of different lengths. Ahh, time to use the minimum field width:

        printf "%30s is mode %04o\n", $_, 07777 & (stat)[2] for @ARGV;
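
Here's a sketch of the same number in the various bases, using the %x, %X, %o, and %b formats of sprintf (which formats just like printf, but returns the string). The values are arbitrary picks of mine, and %b needs Perl 5.6 or later:

```perl
use strict;
use warnings;

my $n = 255;

my $hex_lower = sprintf "%x", $n;   # ff
my $hex_upper = sprintf "%X", $n;   # FF
my $octal     = sprintf "%o", $n;   # 377
my $binary    = sprintf "%b", $n;   # 11111111

# Zero padding and a minimum width work here as well.
my $mode = sprintf "%04o", 0644;    # 0644
my $byte = sprintf "%08b", 5;       # 00000101

print "$hex_lower $hex_upper $octal $binary $mode $byte\n";
```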

So, that's the basics, but let's look at some practical code as well. Suppose I have a series of values in @numbers that I want to print in a vertical column, all with a format of %15g. You might think that I could simply do this:

        printf "%15g\n", @numbers; # bad

but this won't work: the format has only one %-field, so only the first value gets printed and the rest are ignored. There needs to be a %-field for each value used from the list (as we saw earlier). Well, a simple way to fix that is to use a loop:

        printf "%15g\n", $_ for @numbers;

But another way to do this is to replicate the format string. If we need to print 3 entries, we need a string like %15g\n%15g\n%15g\n, which we can get with "%15g\n" x 3. So, we need the number of elements in @numbers on the right of that x. Easy enough: just use the arrayname in a scalar context (which it is!):

        printf "%15g\n" x @numbers, @numbers;

Here, @numbers is used in both a scalar context and a list context in the same expression: same text, different meaning. Just like when you wind up with no wind for your kite.
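
If you'd like to convince yourself that the replicated format and the loop come out the same, here's a sketch that builds both with sprintf (printf's string-returning sibling) and compares them. The sample values are invented:

```perl
use strict;
use warnings;

my @numbers = (1, 2.5, 1e10);   # made-up data

# One %15g\n per element: the right side of x is a scalar context,
# so @numbers there yields its element count.
my $replicated = sprintf "%15g\n" x @numbers, @numbers;

# The loop version, formatting one value at a time.
my $looped = join "", map { sprintf "%15g\n", $_ } @numbers;

print $replicated;
print "identical\n" if $replicated eq $looped;
```

The replicated version makes a single call no matter how long the list is, while the loop is probably easier to read. Which one wins is mostly a matter of taste.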

Occasionally, you may need to have a variable width for a column. Let's say you needed that 15 from the previous example to be configurable:

        $width = 15;
        printf "%$widthg\n", $_ for @numbers; # bad

This won't work, because Perl is looking for a variable named $widthg, even though you intended that as $width followed by g. But you also can't put a space in there, because the printf format is picky and can't understand a space. One solution is to delimit the variable name:

        $width = 15;
        printf "%${width}g\n", $_ for @numbers;

Another is to use the * indirection, which takes the number from the list of values. Each * in a format field consumes one element from the values and uses it as the number that the * stands in for:

        $width = 15;
        printf "%*g\n", $width, $_ for @numbers;
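
Here's a sketch of the * at work, using sprintf to capture what printf would print. As far as I know from the sprintf documentation, the * can stand in for the precision as well as the width, with each * consuming one value in order. The numbers and variable names here are just for illustration:

```perl
use strict;
use warnings;

my $width = 6;

my $right = sprintf "%*d|",  $width, 42;      # width taken from the list
my $left  = sprintf "%-*d|", $width, 42;      # same, but left-justified
my $both  = sprintf "%*.*f", 8, 2, 3.14159;   # width 8, precision 2

print "$right\n$left\n$both\n";
#     42|
# 42    |
#     3.14
```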
        
And there you have it.  Many ways to print your numbers, strings, and
anything else you come up with in your evaluations.  Until next time,
enjoy!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
