Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Linux Magazine Column 97 (Sep 2007)
[suggested title: ``Always wear your utility belt (part 2)'']
Last month, I introduced the Scalar::Util super hero of the
Scalar/List-Util dynamic duo, describing how a somewhat-overlooked
standard library can simplify some of your common tasks. In this
month's column, I'll examine List::Util for the help it can provide
to your list tasks. I'll also look at List::MoreUtils for some
additional common list operations, if you don't mind a quick CPAN
install. (And you'll need to install List::Util from the CPAN
anyway if you're running something prior to Perl 5.8.)
Like Scalar::Util, the List::Util module doesn't export
any subroutines by default. That means that you'll need to ask
for each of these routines explicitly with use.
First, let's look at (the appropriately titled) first. Let's say
you have a list of strings, and you want to find the first one that is
longer than ten characters. Simply pull out first, like this:
use List::Util qw(first);
my $big_enough = first { length > 10 } @the_list;
The first routine walks through the list much as grep or
map does, placing each item into $_. The block is then evaluated,
looking for a true or false value. If true, the corresponding value
of $_ is returned immediately. If every evaluation of the block
returns false, then first returns undef.
Note that this is similar to:
my ($big_enough) = grep { length $_ > 10 } @the_list;
However, the first routine avoids testing the remainder of the list
once we have found our item of choice. For short lists, we might not
care, but for long lists, this can save us some time if we expect a
true value somewhat early in the list.
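To make the difference concrete, here is a small sketch that counts how often each block runs. The sample data and the two counter variables are my own invention for illustration, not anything from List::Util itself:

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical sample data: the match sits in position two of 1002 items.
my @the_list = ('short', 'a considerably longer string', ('x') x 1000);

my ($first_calls, $grep_calls) = (0, 0);

# first stops as soon as the block returns true
my $via_first = first { $first_calls++; length($_) > 10 } @the_list;

# grep always visits every element, even after a match
my ($via_grep) = grep { $grep_calls++; length($_) > 10 } @the_list;

print "first: $first_calls block calls, grep: $grep_calls block calls\n";
# first: 2 block calls, grep: 1002 block calls
```

Both approaches find the same string; only the amount of work differs.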
We do lose a tiny bit of information with first as well. If undef
is a significant return value, we can't tell the undef as one of the list
members from the undef returned at the end of the list. For example,
if we wanted the ``first undef'' from a list:
my $first_undef = first { not defined $_ } @items;
we couldn't tell if this was returning a ``found'' undef, or a ``not
found'' signal (also undef). In the grep equivalent, we can see
whether there are zero or non-zero elements assigned:
if (my ($first_undef) = grep { not defined $_ } @items) {
    # really found an undef
} else {
    # no undef found
}
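Another way around the ambiguity, assuming a position is as useful to you as the value itself, is to run first over the index range rather than over the elements. An index is always defined, so undef can only mean ``not found''. The @items data below is made up for the sketch:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @items = (1, 'two', undef, 4);   # hypothetical sample data

# Search the index range rather than the elements themselves.
my $where = first { not defined $items[$_] } 0 .. $#items;

# Test with defined, not truth: index 0 is a perfectly good answer.
if (defined $where) {
    print "found an undef at index $where\n";
} else {
    print "no undef found\n";
}
# found an undef at index 2
```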
Admittedly, I can't recall where I've ever cared that much. But it's
an interesting thing to think about when designing return values from
functions. But enough on first. Let's move on.
The next easy utility to describe from List::Util is shuffle.
Yes, many programs need a randomly ordered list of values, and here
we have it as a simple word:
use List::Util qw(shuffle);
my @deck = shuffle
    # the inner parens keep Perl from guessing that the braces are an anonymous hash
    map { ("C$_", "D$_", "H$_", "S$_") }
    2..10, qw(J Q K A);
Now our deck of cards is shuffled, and rather fairly and quickly.
Like sorting, shuffling is one of those things that looks rather easy
to implement, but turns out to have tricky parts to get right. And in
the normal List::Util installation, this is implemented at the C
level (using XS), so it's quite fast.
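To show where the tricky part hides, here is a pure-Perl sketch of the well-known Fisher-Yates shuffle. This is not how List::Util implements shuffle; the fisher_yates_shuffle name is hypothetical, and the code is only meant to illustrate the kind of detail that is easy to get wrong:

```perl
use strict;
use warnings;

# Hypothetical pure-Perl helper, shown only to illustrate the algorithm.
sub fisher_yates_shuffle {
    my @items = @_;
    for (my $i = $#items; $i > 0; $i--) {
        # Pick from 0..$i inclusive; choosing from the whole array every
        # time is the classic mistake that biases the result.
        my $j = int rand($i + 1);
        @items[$i, $j] = @items[$j, $i];
    }
    return @items;
}

my @shuffled = fisher_yates_shuffle(1 .. 10);
print "@shuffled\n";   # the same ten numbers, in some random order
```

In real code, I would still reach for shuffle: it is already written, already tested, and faster.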
One of my favorite ``obscure but cool once you understand it'' functions
in list-processing languages is reduce, and although Perl doesn't
have it as a built-in, we can at least get to it with List::Util.
Similar to sort, reduce takes a block argument that references
$a and $b. This is best illustrated by example:
use List::Util qw(reduce);
my $total = reduce { $a + $b } 1, 2, 4, 8, 16;
For the first evaluation of the block, $a and $b take on
the first and second elements of the list: 1 and 2 in this case.
The block is evaluated (returning 3), and this value is placed
back into $a, and the next value is placed in $b (4).
Once again, the block is evaluated (7), and the result placed
in $a, and a new $b comes from the list. When there are no
more items in the list, the final result is returned. The effect
is as if we had written:
my $total = ((((1 + 2) + 4) + 8) + 16);
but scaled for however many elements are in the list. Nice!
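If you want to watch the values move through $a and $b, a sketch like the following records each evaluation of the block. The @steps array is just scaffolding for the demonstration. The last two lines show the edge cases: a one-element list comes back unchanged, and an empty list yields undef.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @steps;   # scaffolding: remembers what the block saw each time
my $total = reduce {
    push @steps, "$a + $b";
    $a + $b;
} 1, 2, 4, 8, 16;

print "$_\n" for @steps;   # 1 + 2, then 3 + 4, then 7 + 8, then 15 + 16
print "total: $total\n";   # total: 31

my $single = reduce { $a + $b } 42;   # block never runs; 42 comes straight back
my $empty  = reduce { $a + $b } ();   # nothing to combine, so undef
```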
We can use it to compute a factorial for $n:
my $factorial_n = reduce { $a * $b } 1..$n;
Or recognize a series of binary digits as a number:
my $number = reduce { 2 * $a + $b } 1, 1, 0, 0, 1; # 0b11001
We could even rewrite join in terms of reduce:
sub my_join {
    my $glue = shift;
    return reduce { $a . $glue . $b } @_;
}
By adding some smarts into the block, we can find the numeric maximum of a list of values:
my $numeric_max = reduce { $a > $b ? $a : $b } @inputs;
This works because we select the winner of any given pair of values, and if we keep carrying that winner forward, eventually the winningest winner comes out the end.
For a string maximum (``z'' preferred to ``a''), just change the type of the comparison:
my $string_max = reduce { $a gt $b ? $a : $b } @inputs;
And for minimums, we can change the order of the comparison, or swap
the selection of $a and $b.
For convenience, List::Util provides max, maxstr, min,
minstr, and sum directly.
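As a quick check that the shortcuts agree with the reduce versions, here is a small sketch using sample data of my own choosing:

```perl
use strict;
use warnings;
use List::Util qw(reduce max maxstr min minstr sum);

my @numbers = (3, 25, 7, 100, 42);          # hypothetical sample data
my @words   = qw(pear apple zucchini fig);

my $max_by_reduce = reduce { $a > $b ? $a : $b } @numbers;

print "max:    ", max(@numbers), " (reduce says $max_by_reduce)\n";   # 100 both ways
print "min:    ", min(@numbers), "\n";      # 3
print "sum:    ", sum(@numbers), "\n";      # 177
print "maxstr: ", maxstr(@words), "\n";     # zucchini
print "minstr: ", minstr(@words), "\n";     # apple
```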
I learned Smalltalk long before I learned Perl, and got quite fond
of the inject:into: method for collections. The reduce routine
maps rather nicely, if I think of Smalltalk's:
aCollection inject: firstValue into: [:a :b | "something with a and b"]
as Perl's:
reduce { "something with $a and $b" } $firstValue, @aCollection;
In other words, another way of looking at reduce is that it
transforms that first element into the final result by invoking the
block in a specific way on all of the remaining elements of the list.
So, you could put a list of elements inside an array ref with:
my $array_ref = reduce { push @$a, $b; $a } [], @some_list;
Or create a hash with:
my $hash_ref = reduce { $a->{$b} = 1; $a } {}, @some_list;
Note that on each iteration, $a is used, and also returned to
become the new $a or the final result. This is reminiscent of the
many uses of inject:into: in the Smalltalk images I've seen.
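The same seed-and-return pattern stretches a little further. As a sketch, here is a word counter in which the seed is an empty hash ref and each step bumps one count; the @words data and the $count_for name are mine, and I've written the seed as +{} only to leave no doubt that it is a hash constructor rather than a block:

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @words = qw(to be or not to be);   # hypothetical sample data

# The empty hash ref is the seed; each step updates it and hands it back.
my $count_for = reduce { $a->{$b}++; $a } +{}, @words;

print "$_: $count_for->{$_}\n" for sort keys %$count_for;
# be: 2
# not: 1
# or: 1
# to: 2
```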
That wraps up List::Util, but I've still got a few inches of room
here, so let's take a quick look at the CPAN module
List::MoreUtils. Although it isn't part of the core, it's
referenced in List::Util, because the module provides a few handy
shortcuts implemented (again) in C code for speed. As with List::Util,
all imports must be specifically requested.
The any routine returns a true value if any of the items
in the list meets the given criterion, using a $_ proxy similar to grep
or map:
use List::MoreUtils qw(any);
my $has_some_defined = any { defined $_ } @some_list;
This is done efficiently, returning a true value as soon as the block returns a true value, and iterating to the end of the list only if none of the elements meet the condition.
Similarly, all checks whether every element meets the condition,
returning false as soon as one of the elements fails,
rather than iterating through the entire list:
use List::MoreUtils qw(all);
my $has_no_undef = all { defined $_ } @some_list;
Note that you could easily define any in terms of all and
vice-versa, just by negating both the condition and the result value.
(These items are far more efficient than their same-named
``equivalents'' in Quantum::Superpositions.)
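To spell out that duality, here is a pure-Perl sketch. The my_any and my_all names are hypothetical stand-ins that take a code ref instead of a block; they are not the List::MoreUtils implementation, just enough to show the double negation at work:

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-ins taking a code ref instead of a block.
sub my_any {
    my $test = shift;
    for (@_) { return 1 if $test->() }   # stop at the first success
    return 0;
}

sub my_all {
    my $test = shift;
    # "all pass" is the same as "not any fail":
    # negate the condition, then negate the result.
    return !my_any(sub { !$test->() }, @_);
}

my @some_list = (1, undef, 3);   # hypothetical sample data
my $has_some_defined = my_any(sub { defined $_ }, @some_list);
my $has_no_undef     = my_all(sub { defined $_ }, @some_list);

printf "any: %s, all: %s\n",
    ($has_some_defined ? 'yes' : 'no'),
    ($has_no_undef     ? 'yes' : 'no');
# any: yes, all: no
```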
If you negate only the result values (or just the condition, depending
on how you look at it), you get two other routines defined by
List::MoreUtils, none and notall:
use List::MoreUtils qw(none notall);
my $has_no_defined = none { defined $_ } @some_list;
my $has_some_undef = notall { defined $_ } @some_list;
Like if vs unless or while vs until, having complementary
routines gives you the flexibility to spell out what you're actually
looking for, rather than requiring Perl (and the maintenance
programmer) to figure out what you mean with a bunch of not
operations.
If you're just counting true and false values, true and false
are at your service:
use List::MoreUtils qw(true false);
my $bigger_than_10_count = true { $_ > 10 } @some_list;
my $not_bigger_than_10_count = false { $_ > 10 } @some_list;
Again, these are complementary, so use the one that reads better for your task.
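If installing a CPAN module isn't an option, the same counts can be had from the built-in grep, which returns the number of matches when evaluated in scalar context. A minimal sketch, with sample data of my own:

```perl
use strict;
use warnings;

my @some_list = (3, 12, 7, 40, 10);   # hypothetical sample data

# grep in scalar context returns the number of matching elements
my $bigger_than_10_count     = grep { $_ > 10 } @some_list;
my $not_bigger_than_10_count = grep { !($_ > 10) } @some_list;

print "$bigger_than_10_count bigger, $not_bigger_than_10_count not\n";
# 2 bigger, 3 not
```

The named routines still read better, which is most of the point.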
The first_index and last_index routines return where an
item appears. For example, suppose I want to know which item
is the first item that is bigger than 10:
use List::MoreUtils qw(first_index);
my $where = first_index { $_ > 10 } 1, 2, 4, 8, 16, 32;
The result here is 4, indicating that 16 is the first item
greater than 10. The index value is 0-based. If no item matches,
-1 is returned, like Perl's built-in index function for
strings. last_index works like rindex, searching from the upper
end of the list rather than the lower end.
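For comparison, here is a rough approximation of both routines using only first from List::Util. The variable names are mine, and the last two assignments exist only to mimic the -1 convention described above:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @values = (1, 2, 4, 8, 16, 32);

# Walk the index range forward for the first match, backward for the last.
my $first_where = first { $values[$_] > 10 } 0 .. $#values;
my $last_where  = first { $values[$_] > 10 } reverse 0 .. $#values;
my $nowhere     = first { $values[$_] > 100 } 0 .. $#values;

# Mirror the -1 "not found" convention.
$_ = defined $_ ? $_ : -1 for $first_where, $last_where, $nowhere;

print "first: $first_where, last: $last_where, missing: $nowhere\n";
# first: 4, last: 5, missing: -1
```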
A more general version of this is indexes (not indices as you
might think), which returns all of the index values instead of just
the first or last:
use List::MoreUtils qw(indexes);
my @where = indexes { $_ > 10 } 1, 2, 4, 8, 16, 32;
The result is 4, 5, showing that elements 4 and 5 of the input
list match the condition.
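A core-only approximation of the same idea is to grep over the index range. This sketch also shows the usual reason for wanting index values in the first place: feeding them to an array slice.

```perl
use strict;
use warnings;

my @values = (1, 2, 4, 8, 16, 32);

# Keep every index whose element passes the test.
my @where = grep { $values[$_] > 10 } 0 .. $#values;

print "@where\n";            # 4 5
print "@values[@where]\n";   # 16 32
```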
The apply routine is like the built-in map, but it hands the block a
copy of each element in $_, so we can safely change it within the block,
and it returns the modified copies rather than the block's result:
use List::MoreUtils qw(apply);
my @no_leading_blanks = apply { s/^\s+// } @input;
If we tried to do this with map:
my @no_leading_blanks = map { s/^\s+// } @input;
then we'd see two problems. First, the result of a substitution is
not the new string, but the success value, so the outputs would simply
be a series of true and false values. Second, the $_ value is
aliased to the inputs, so @input would have been changed. Oops.
The equivalent to the apply with map would be something like:
my @output = map { local $_ = $_; [apply action here]; $_ } @input;
And yes, the many times I've written map blocks that look just like
that, I could have replaced them with apply.
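To see both problems in action without List::MoreUtils installed, here is a sketch using only map. The @scratch array is a throwaway copy so that the damage is visible without losing the original data, and the final line uses a lexical $copy, which is a variation on the local $_ idiom above rather than the same code:

```perl
use strict;
use warnings;

my @input = ('  alpha', 'beta', '   gamma');   # hypothetical sample data

# The naive version, run against a throwaway copy.
my @scratch = @input;
my @results = map { s/^\s+// } @scratch;
# @results now holds success flags (1, empty string, 1),
# and @scratch itself has been trimmed in place.

# The copy-then-modify idiom leaves its input alone and returns the strings.
my @no_leading_blanks = map { my $copy = $_; $copy =~ s/^\s+//; $copy } @input;

print join('|', @no_leading_blanks), "\n";   # alpha|beta|gamma
print join('|', @input), "\n";               # "  alpha|beta|   gamma", untouched
```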
And List::MoreUtils contains a few more routines as well, but I've
now run out of space. I hope you find this little trip into the
``utility belts'' of Perl fun and handy. Until next time, enjoy!

