Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Unix Review Column 20 (June 1998)

With many ways to manipulate strings, Perl is a good ``text wrangling'' language. Perl makes it easy to read in strings of arbitrary length, select and extract interesting data, and write the results to files, sockets, or other processes.

One interesting problem is getting text to contain arbitrary expressions. This is really handy when you have a template file (say, a report or an HTML page) that stays mostly constant, but should have some variable or freshly computed parts. Perl normally treats such expressions within a string as nothing more than additional characters to print, but in some circumstances, such as variable interpolation in a double-quoted string, part of the text is replaced by a computed value.

For example,

        $a = 3 + 4;
        print "I have $a eggs\n";

allows me to compute the expression of 3 + 4, and then insert the result into the string. However, putting the same expression into the string doesn't work:

        print "I have 3 + 4 eggs\n";

because Perl cannot tell whether this is the text of 3 + 4, or an expression to be calculated. A while back, I came up with a trick to get an expression evaluated inside a double-quoted string, and it became the easiest way to handle the problem of getting expressions within strings. It looks a little ugly, but so does the rest of Perl, so by comparison, it's not half bad.

The trick is to simply precede the expression with @{[ and follow it with ]}, like so:

        print "I have @{[ 3 + 4 ]} eggs\n";

If you execute this code, you'll see that it correctly prints 7 eggs! How is this working? Well, the outer @{ ... } triggers an array interpolation, requiring either an array name or an array reference inside the braces. The inner square brackets create an anonymous array, and return a reference to that array. This anonymous array has to be computed from the list of expressions within the brackets -- in this case, there's only one, so it's a single-element array.

Thus, the expression is computed, turned into an anonymous array, then interpolated by the @ trigger, and we're done!
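
To see what the construct does with more than one expression, here is a small sketch (the variable names are made up for illustration). It assumes only standard Perl behavior: interpolated array elements are joined with the value of $", which is a single space by default.

```perl
use strict;
use warnings;

my @prices = (3, 4, 5);   # hypothetical sample data

# A single expression: the anonymous array holds one element.
my $one = "I have @{[ 3 + 4 ]} eggs";

# Several values: the elements are joined with $" (a space by default).
my $many = "Doubled: @{[ map { $_ * 2 } @prices ]}";

# Changing $" locally changes the separator used during interpolation.
my $csv = do {
    local $" = ', ';
    "Doubled: @{[ map { $_ * 2 } @prices ]}";
};

print "$one\n$many\n$csv\n";
```

Because the brackets build an array, anything that returns a list (a map, a sort, a function call) can go inside them.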

We can even make use of this construct in larger documents (using here-strings):

        open SM, "|/usr/lib/sendmail -t" or die "Cannot fork sendmail: $!";
        print SM <<END;
        To: $destination
        From: @{[$source || "root"]}
        Subject: update at @{[scalar localtime]}

        The following people are logged
        on to @{[`hostname` =~ /(.*)/]}:

        @{
          my %foo = map /^(\S+)()/, `who`;
          [sort keys %foo];
        }
        END
        close SM;

There's a lot of meat here... let's go through it a step at a time. First, I'm opening a pipe to sendmail, to send a mail message. Next, I'm printing a double-quoted here-string to that pipe. The $destination variable is an ordinary scalar variable that I've set somewhere before this code.

The from line of the message uses the construct described above. If the $source variable is set, it's used -- otherwise, the constant root is returned.

The subject line of the message also uses the construct described above. The localtime operator in a scalar context returns a nice timestamp. Because the square-bracket anonymous array constructor evaluates its elements in a list context, I have to force scalar context with the scalar operator. The resulting expression is squished into the subject line with relative ease.
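
The difference the scalar operator makes can be sketched as follows. To keep the result repeatable, this sketch swaps in gmtime(0) for localtime (an assumption made purely for illustration); both operators return a list of numeric fields in list context and a formatted timestamp in scalar context.

```perl
use strict;
use warnings;

# gmtime(0) stands in for localtime so the result does not depend on
# the current time or the local time zone.

# List context: the brackets collect every field gmtime returns.
my $list_ctx = "Subject: update at @{[ gmtime(0) ]}";

# Forced scalar context: a single human-readable timestamp.
my $scalar_ctx = "Subject: update at @{[ scalar gmtime(0) ]}";

print "$list_ctx\n$scalar_ctx\n";
```

The first line comes out as a row of space-separated numbers, which is rarely what a subject line wants.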

Similarly, the current hostname is computed and inserted. Note that I'm taking the output of the backquoted hostname command, and matching it against a regular expression that extracts all the characters before the newline. That way, the newline is not extracted, and I can use it as text in the middle of the line.
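
Here is the same newline-stripping match in isolation. The string in $raw is a hypothetical stand-in for what the backquoted hostname command would return, so the sketch runs without calling any external program.

```perl
use strict;
use warnings;

my $raw = "myhost\n";   # hypothetical stand-in for the output of `hostname`

# In list context the match returns its captures, and . stops at the newline.
my $inline = "on to @{[ $raw =~ /(.*)/ ]}:";

# An alternative, if a separate statement is acceptable: chomp a copy first.
chomp(my $host = $raw);
my $chomped = "on to $host:";

print "$inline\n$chomped\n";
```

The match-based form has the advantage of fitting entirely inside the string, which is the point of the exercise.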

The final chunk of code within this string uses an extra trick. The @{...} construct is really any block of code, as long as the last expression evaluated in that block is an array reference of some kind. So, to get a unique list of users on the system, I can use the keys of a temporary hash as a set. The output of the who command is broken into lines, and matched line by line with the regular expression. The two pairs of parentheses in the pattern yield two elements for each original line: the username and an empty string. That alternating key/value list is the right shape of a result to create the hash. Finally, the keys of the hash are sorted, and turned into an anonymous array.
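
The block form can be tried without a live who command. In this sketch the @who array holds made-up lines in roughly the shape who prints (an assumption; real output varies by system), and the rest follows the code above.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the output of `who`.
my @who = (
    "fred     tty1   Jun  1 09:00\n",
    "barney   pts/0  Jun  1 09:05\n",
    "fred     pts/1  Jun  1 09:10\n",
);

my $report = <<END;
Users: @{
  # Each match yields (username, ""), so the list builds a hash
  # whose keys are the unique usernames.
  my %seen = map /^(\S+)()/, @who;
  [ sort keys %seen ];
}
END

print $report;
```

Note that fred appears twice in the input but only once in the result, since hash keys are unique.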

Another way of having a ``mostly constant, but sometimes changing'' text string is to perform a global substitution on the string. While we can't get arbitrary expressions, it works well when the data comes from a data structure, like a hash:

        %data = (
                TO => 'fred@random.place',
                PRIZE => 'a pearl necklace',
                VALUE => '1000',
        );
        $_ = <<'EOF';
        To: %TO%
        From: merlyn@stonehenge.com
        Subject: Your lucky day

        You are the winner of %PRIZE%,
        worth over $%VALUE%!  Congratulations.
        EOF
        s/%(\w+)%/$data{$1}/g;
        print;

For each of the words found between percent signs, the corresponding hash element is looked up by key, and replaced with its value. This is good for those form-letter type problems. If the data cannot be stored in a hash like this, we could go a step further and make the replacement text a full expression, instead of a simple double-quoted string, using the /e modifier on the substitution.

        $_ = <<'EOF';
        To: %TO%
        From: merlyn@stonehenge.com
        Subject: Your lucky day

        You are the winner of %PRIZE%,
        worth over $%VALUE%!  Congratulations.
        EOF
        s/%(\w+)%/&getvaluefor($1)/eg;
        print;
        sub getvaluefor {
                my $key = shift;
                ...
        }

Here, the subroutine &getvaluefor will be called repeatedly, once for each keyword found in the text. Whatever string is returned by the subroutine will be the value inserted into the final text. The subroutine can thus be arbitrarily complex, including having default values or cached computations.
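
As one possible shape for such a subroutine, here is a minimal sketch with a default value and a cache. The name lookup_value, the %cache hash, the $lookups counter, and the default text are all hypothetical; this is not a reconstruction of the elided body above.

```perl
use strict;
use warnings;

my %data = (
    TO    => 'fred@random.place',
    PRIZE => 'a pearl necklace',
);
my %cache;         # remembers values already computed
my $lookups = 0;   # counts how often the uncached path runs

# Hypothetical lookup routine: cached, with a default for unknown keys.
sub lookup_value {
    my $key = shift;
    return $cache{$key} if exists $cache{$key};
    $lookups++;
    my $value = exists $data{$key} ? $data{$key} : "[no value for $key]";
    return $cache{$key} = $value;
}

my $text = "To: %TO%\nYou won %PRIZE%. Yes, %PRIZE%, worth \$%VALUE%!\n";
$text =~ s/%(\w+)%/lookup_value($1)/eg;
print $text;
```

The second %PRIZE% is served from the cache, and the unknown %VALUE% gets the default text instead of an empty string.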

But we're still a long way from what I did earlier -- having the code to execute within the template. It's really not that far away however, if we use the ``double evaluation'' mode of the substitution operator. Let's look at this example:

        $_ = 'I have [ 3 + 4 ] eggs';
        s/\[(.*?)\]/$1/eegs;
        print;

This prints I have 7 eggs, but how? Well, eliminating what we know so far... the /s means that . can match a newline. And /g means that we are doing more than one substitution. And a single /e means that the right side is a Perl expression, not a double-quoted string. And in fact, we have $1 there, so that's good so far.

But the presence of the second /e means that the value of the expression on the right side should again be considered to be Perl code, and then evaluated for its string value! (This was initially considered to be a bug, but when it was noticed to be useful, retained as a feature.)

So it goes from $1 to " 3 + 4 " to 7, and the 7 gets inserted in place of the bracketed expression. We can have anything we want between the brackets, and it'll be evaluated as Perl code.
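
The two stages can be seen side by side by running the same substitution with one /e and then with two. The template text and variable names here are made up for illustration.

```perl
use strict;
use warnings;

my $template = 'I have [ 3 + 4 ] eggs and [ 2 * 6 ] rolls';

# One /e: the right side ($1) is evaluated once, giving the captured text.
(my $once = $template) =~ s/\[(.*?)\]/$1/egs;

# Two /e: that captured text is itself evaluated as Perl code.
(my $twice = $template) =~ s/\[(.*?)\]/$1/eegs;

print "$once\n$twice\n";
```

Since /ee runs whatever sits between the brackets as Perl code, this approach is only suitable for templates that come from a source you trust.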

So, there you have it... many ways of having ``mostly constant, some variable'' text in your program. Let me conclude with a piece of history here. For many years, I used to end my postings in comp.lang.perl.misc with some clever (often obscure) chunk of code that would print out ``Just another Perl hacker,''. When I discovered the ``double eval'' trick for substitution, I just had to use it in one of these ``JAPH'' postings. And here's the result:

  $Old_MacDonald = q#print #; $had_a_farm = (q-q:Just another Perl hacker,:-);
  s/^/q[Sing it, boys and girls...],$Old_MacDonald.$had_a_farm/eieio;

See if you can figure out how it works!

Special thanks to fellow Perl lead developer and trainer, Chip Salzenberg, for the idea for this month's column. Thanks Chip!

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.
