Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Unix Review Column 57 (Mar 2005)
[suggested title: ``Understanding the Command line'']
In past columns, I've talked a lot about the Perl language, but
haven't ever said much about perl at the Unix shell command line.
So, let's fix that, by looking at some commonly used command-line
constructs for Perl.
Let's take the simplest invocation:
perl my-script
This invokes my-script, using the relative or absolute path to the
script as given, thus not using the PATH in any way. We can include
arguments to the script:
perl my-script arg1 arg2 arg3
which sets up @ARGV to be the three individual values of arg1, arg2,
and arg3, as if we had said:
@ARGV = qw(arg1 arg2 arg3);
If we want a space within one of the values, we need to use shell quoting rules:
perl my-script 'arg1a arg1b' arg2
This passes two arguments now, not three. We get the same result with:
perl my-script arg1a\ arg1b arg2
using a backslash to quote the space inside the first argument. If there are any shell wildcard (``glob'') characters, the shell expands them before calling our program:
perl my-script *.html
which might turn into (given three matching files):
@ARGV = qw(index.html problem.html results.html);
Note that Perl has no clue that a shell wildcard was involved here: it's as if we had typed the three names individually.
Perl doesn't interpret the @ARGV values in any particular way.
They could be keywords, filenames, or some combination of the two.
Traditionally, leading @ARGV elements that begin with a minus are
considered ``options'', which we can process with modules such as
Getopt::Std or Getopt::Long.
We can also have options to Perl itself by placing leading-minus
values to the left of the script name. For example, we can invoke
the debugger by adding -d:
perl -d my-script arg1 arg2 arg3
Now, the program is run under the normal Perl debugger. We can pick
an alternate debugger (or module using the debugging interface for other
analysis) with a colon argument following the -d:
perl -d:DProf my-script
This command selects the Devel::DProf module as the alternate
``debugger'', which profiles the Perl code as it runs.
Another common option (``switch'') is -c, which compiles
a Perl script without executing it:
perl -c my-script
You would do this to verify that the syntax of your script is good
before actually moving it into place for production, including
ensuring that all use'd modules are also available. Any modules
loaded at runtime (via require) or code constructed at runtime
(as with a string eval) wouldn't be checked, however. Also, all BEGIN
and CHECK blocks are executed, so ``compile only'' is merely a casual
definition.
You can enable warnings on the command line with -w:
perl -w my-script
although this is more frequently handled within the program as:
use warnings;
Sometimes, your program is small enough that it makes sense to include
it entirely on the command line. Simply throw an -e switch there
instead of the filename, and you're set:
perl -e 'print "Hello world!\n"'
Note that the quoting can get a bit weird. I typically use single
quotes to keep the single argument to -e together, and Perl's
double quotes within the argument for Perl quoting. Sometimes, alternate
quoting (via qq//) can come in handy:
perl -e 'print qq/Hello world!\n/'
Multiple -e arguments are concatenated, with only a newline character
between:
perl -e print -e 'qq/Hello!\n/'
By now, the number of options is a bit hard to remember. Luckily, Perl
has a built-in help message, available with -h:
perl -h
And for a few more switches that aren't about running programs, let's
look at the version information with -v in short form:
perl -v
and in long form with -V:
perl -V
The -V switch also gives us access to the various configuration
options that Perl was built with, and uses to compile binary extensions
and install local programs. For example, to get the C compiler used
to compile Perl:
perl -V:cc
and to get all the options related to where binaries are found or installed:
perl -V:'.*bin'
The regular expression pattern here is in quotes so that the shell doesn't try to expand it as a filename pattern. The output is in a form that can be evaluated by a Bourne-style shell easily:
eval `perl -V:'.*bin'`
echo $sitebin
No attempt is made to accommodate C-shell-style shells, of course.
Modules can be included from the command line with -M:
perl -MFile::Find -e 'find sub { print $File::Find::name, $/ }, "."'
The -MFile::Find is equivalent to including:
use File::Find;
in the resulting script. If you don't want the imports (such as
find in this case), use lowercase -m, or be specific with
a trailing = syntax:
perl -MFile::Find=find,finddepth -e '...'
which turns into:
use File::Find qw(find finddepth);
Note the automatic comma splitting. Nice.
For text processing from a series of one or more files, we can add -n,
which puts a wrapper around our program that looks like:
LINE:
while (<>) {
    ...     # rest of your program here
}
In other words, the @ARGV list is interpreted as a series of files
to be opened, and each line is placed in $_ until all the lines are
processed. To print each line with a sequential line number in front,
we can use the $. variable for the numbers:
perl -n -e 'print "$.: $_"' file1 file2 file3
We can bundle the switches that don't take arguments together with the following switch, as in:
perl -ne 'print ...'
Another way to approach this problem is the -p switch, which adds a
print at the end of the loop:
LINE:
while (<>) {
    ...     # your program here
    print;
}
So, we could just substitute the line number into the beginning of each line:
perl -pe 's/^/$.: /' file1 file2 file3
Going one step further, we could rewrite these modified lines back
into the original files with the ``inplace edit'' switch, -i:
perl -i.bak -pe 's/^/$.: /' file1 file2 file3
Now, file1 will be renamed file1.bak, and the new updated
contents written to a new file1. Similarly, file2 becomes
file2.bak, and file3 becomes file3.bak.
If you leave off the option to -i, the ``inplace edit without backup
file'' mode is enabled, which can save space, but gives you no way to go
back if you've toasted your files. Be very careful.
The line-looping modes (-n and -p) respect the current value of
$/ to read a ``line'', which defaults to \n. However, you can
specify alternate values with the -0 (that's a zero) switch. By
default, -0 sets the delimiter to the NUL byte, which can be handy
with GNU find's -print0 switch (which delimits the filenames with
NUL bytes):
find . -name '*.html' -print0 | perl -n -0 -e unlink
Any octal value can also follow -0, indicating the corresponding
ASCII character. For example, to delimit only on spaces, use -040.
If the value is -0777, then $/ is set to undef, slurping the
entire file as one ``line''. Thus, we can wrap the entire file with
a BEGIN/END marker as:
perl -0777 -pi.bak -e '$_ = "BEGIN\n$_\nEND\n"' file1 file2 file3
Here, the statement is executed three times, with $_ being the
entire contents of first file1, then file2 and file3.
Note that the following command mangles the lines, because the concatenation happens after the terminating newline:
perl -pe '$_ .= "END"' file1 file2 file3
But we can fix that with -l, which chomps each line as read, and then
restores the delimiter on a print:
perl -l -pe '$_ .= "END"' file1 file2 file3
Now $_ contains only the line without a newline, and the
concatenation happens in the right place, before the newline that gets
automatically added by the implicit print at the end of the
implicit loop.
Well, I hope you enjoyed this brief tour through the most common Perl
command-line options. You can read more at the perlrun manpage,
available either as man perlrun or perldoc perlrun at your
prompt. Until next time, enjoy!