Copyright Notice: This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.
This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
[suggested title: ``Computing Securely'']
Security is everybody's business. You may ask yourself, ``Why should I take security seriously? I don't have anything on my system that's worth exploiting.'' Well, that's exactly what the bad guys want you to believe.
Whether you think you have useful data or not, your box provides an identity tied to you, and can be used to mislead the people in pursuit of another exploit, or completely sever the tracks, leaving you holding the bag. And, your machine has resources, like CPU, disk, and network interface, that can be abused by the bad guy to stage larger attacks, like distributed denial-of-service attacks, or monitoring traffic from a new vantage point.
Given that security is everyone's business, let's look at the most common exploits, and what we need to do about them, focusing on the Perl aspects of those points:
The latest Perl releases nearly always include bug fixes and code enhancements that improve the security of your system. This was especially true in the great transition from Perl 5.003 to Perl 5.004. If you aren't running at least 5.004, now would be a very good time to upgrade and abandon that old version.
Also note that many of the CPAN libraries get updated frequently, and
some of those updates are also security-related. If you aren't using
the CPAN shell's r function on a regular basis, you may be leaving
yourself vulnerable to the latest attack.
If you don't know what your code is doing, how do you know your code isn't doing exploitable or harmful things? There's a lot of noise out there that says ``Perl is unmaintainable'', but I'm going to argue instead that ``too many people write unmaintainable Perl needlessly''. Well-written and documented Perl coded with sound engineering principles is actually quite easy to update and modify.
If you've inherited a codebase that is ugly, please make it a high-priority item to get your boss to let you rewrite your code.
Exploits generally happen because some part of the data was trusted to
be within a certain range or certain shape, and a bad guy did
something else instead. Don't trust your input data. Verify that it
contains acceptable characters of the right length or right shape.
Obviously, Perl's regular expressions help quite a bit here. Also
look at things like Regexp::Common in the CPAN
to help you with validation.
Additionally, Perl's taint mode can help you track when incoming data has somehow leaked all the way through to some output-affecting operation without being validated. But buggy (or worse, blind) validations can defeat these checks, so don't count on taint checks to do your work for you.
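As a minimal sketch of the kind of validation described above, assume a hypothetical form field that should hold a username of 1 to 16 word characters. The sub name and the limits are illustrative assumptions, not taken from any particular application. Capturing from a successful match like this is also the usual way to untaint a value when running under taint mode.

```perl
use strict;
use warnings;

# Hypothetical validator: returns the cleaned value, or undef if the
# input is not 1 to 16 word characters. The \A and \z anchors keep a
# trailing newline or extra junk from slipping through.
sub valid_username {
    my ($raw) = @_;
    return undef unless defined $raw;
    if ($raw =~ /\A(\w{1,16})\z/) {
        return $1;    # $1 is used only after a successful match
    }
    return undef;
}

for my $try ("merlyn", "root; rm -rf /", "x" x 40) {
    my $ok = valid_username($try);
    print defined $ok ? "accepted: $ok\n" : "rejected\n";
}
```

The point of the design is to describe what is allowed rather than to list what is forbidden, so anything unexpected is rejected by default.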
Another common exploit occurs when input data (either unchecked or
improperly checked) is used as part of a filename. When you do
something like this, be extra cautious. For example, when you allow
an input parameter to select a file within a directory, be sure the
input filename can't lead somewhere outside that directory.
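Here is one way that check might look, as a sketch only. The directory, the sub name, and the 64-character limit are all hypothetical. The idea is to accept only plain names, so that nothing containing a slash or starting with a dot can ever get through.

```perl
use strict;
use warnings;

# Hypothetical data directory; adjust for your own layout.
my $DATA_DIR = "/var/data/reports";

# Accept only plain names such as "summary.txt": letters, digits,
# underscore, hyphen, and dot, with no leading dot. A slash can
# never match, so the name cannot climb out of the directory.
sub safe_path {
    my ($name) = @_;
    return undef unless defined $name;
    return undef unless $name =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]{0,63})\z/;
    return "$DATA_DIR/$1";
}

for my $try ("summary.txt", "../../etc/passwd") {
    my $path = safe_path($try);
    print defined $path ? "would open $path\n" : "refused\n";
}
```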
One security mantra I recite is ``don't let code become data, or data become code''. When your executable code can be accessed like data, the bad guys might be able to determine algorithms or locations of secrets. But worse, if the bad guys can get data they control to become code, you've lost the battle. Be sure you completely separate where your code is located, and where your data is stored. For example, don't put data files in your web's CGI directory: that's just asking for trouble.
While Perl has no known buffer-overflow exploits, you might be using
Perl to call programs that might have problems. Be careful to limit
the size of the data you pass to those programs.
Similarly, Perl is fine with a string that contains a NUL (\0)
byte. But many of the system calls and child programs aren't. Again,
be very careful that you don't permit such a character to be created
(and it's trivial from CGI parameters for example) and then passed on
where it'll be misinterpreted.
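Both checks are easy to make before handing a value to anything outside of Perl. The following is a sketch under my own assumptions: the sub name is made up, and the 256-byte limit is arbitrary, so pick one that suits the program being called.

```perl
use strict;
use warnings;

# Hypothetical limit; choose one that fits the program being called.
my $MAX_LEN = 256;

# Returns true only if the value is defined, short enough, and free
# of NUL bytes, which C-based programs treat as end-of-string.
sub safe_for_child {
    my ($value) = @_;
    return 0 unless defined $value;
    return 0 if length($value) > $MAX_LEN;
    return 0 if $value =~ /\0/;
    return 1;
}

print safe_for_child("report.txt")        ? "ok\n" : "refused\n";
print safe_for_child("report.txt\0.html") ? "ok\n" : "refused\n";
```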
The famous Robert Morris ``Internet Worm'' exploited a mostly-enabled
``debug'' mode in
sendmail to leap from system to system. If you
have a debug mode or testing mode, be sure it gets turned off when
your system is in production. Don't simply exclaim ``but I might need
that if something breaks''. Fine, turn it on when something breaks,
but not before.
Certainly, the user wants to know when something breaks, but usually the user seeing the error message isn't the person who has to fix the code. They don't need to know the precise SQL that triggered the error, or the name of the database being accessed, and so on, because that stuff is all very useful information to a bad guy.
The most frequent violation I see of this principle is when
use CGI::Carp qw(fatalsToBrowser) is left enabled on production code.
This is wrong, very wrong. The random user at the other end of the
HTTP connection needs only to be told that ``something went wrong'' and
``we're looking into it'', and maybe a unique timestamp so they can
report the error. Everything else that would have printed should be
captured in an error log somewhere, not sent to the user for exploit.
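A sketch of that split might look like the following. The sub names and the reference format (epoch time plus process ID) are my own hypothetical choices, and I'm assuming that STDERR ends up in the server's error log, as it typically does for CGI programs.

```perl
use strict;
use warnings;

# Log the real error for the maintainer, and return a bland message
# for the user. The reference format is a hypothetical choice.
sub report_failure {
    my ($error) = @_;
    my $ref = time() . "-$$";        # timestamp plus process ID
    warn "[$ref] $error";            # goes to STDERR, not to the user
    return "Something went wrong, and we're looking into it. "
         . "Please mention reference $ref if you report this.";
}

# Stand-in for real work that fails with a revealing message.
sub risky_operation {
    die "connect to database 'payroll' failed: access denied\n";
}

my $ok = eval { risky_operation(); 1 };
print report_failure($@), "\n" unless $ok;
```

The user gets enough to file a useful report, and the details that would help a bad guy stay in the log.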
Most decent programmers seem to grasp the execution of their program as a single thread. But it takes extra discipline and thinking to understand all the places where a program can break (accidentally and deliberately) when multiple instances of the program are executing.
While this could be an entire article in itself, the two main points
here are to generate good temporary filenames, and to flock your
shared data to prevent incompatible simultaneous updates.
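To make both points concrete, here is a sketch that uses the core File::Temp module for the temporary file (my choice for this example) and an exclusive flock around a read-modify-write cycle. The counter file and the sub name are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# tempfile() picks an unpredictable name and opens the file safely.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);

# Hypothetical shared counter: lock, read, update, then unlock.
sub bump_counter {
    my ($file) = @_;
    open(my $fh, "+>>", $file) or die "open $file: $!";
    flock($fh, LOCK_EX)        or die "flock $file: $!";
    seek($fh, 0, 0);
    my $count = <$fh>;
    $count = 0 unless defined $count;
    chomp $count;
    $count++;
    truncate($fh, 0)           or die "truncate $file: $!";
    print {$fh} "$count\n";
    close($fh)                 or die "close $file: $!";   # releases the lock
    return $count;
}

print bump_counter($tmp_name), "\n" for 1 .. 3;
```

Without the lock, two instances could both read the same old value and one update would be lost.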
Especially in a web environment, you'll need to have session IDs to
track that subsequent page hits are all related. Don't use guessable
session IDs, because a bad guy might be able to hijack another user's
session if the session ID is guessed, possibly accessing previously
entered payment credentials or other secure information.
Apache::Session contains an example of a decent session ID
generator, but you might also consider
Math::TrulyRandom and other
similar modules to generate very very hard-to-guess numbers.
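If you'd rather see the idea than install a module, the following sketch reads 16 bytes from the kernel's random source and hex-encodes them. This is not how Apache::Session or Math::TrulyRandom do it; it's a stand-in technique that assumes a Unix-like system providing /dev/urandom, and the sub name is made up.

```perl
use strict;
use warnings;

# Read 16 bytes from the kernel's random source and hex-encode them.
# Assumes a Unix-like system that provides /dev/urandom.
sub new_session_id {
    open(my $rand, "<:raw", "/dev/urandom")
        or die "open /dev/urandom: $!";
    my $got = read($rand, my $bytes, 16);
    close($rand);
    die "short read from /dev/urandom"
        unless defined $got && $got == 16;
    return unpack("H*", $bytes);    # 32 hexadecimal characters
}

print new_session_id(), "\n";
```

Contrast this with an ID built from the time, the process ID, or a counter, all of which a bad guy can guess in a handful of tries.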
Don't put your passwords into your scripts! From time to time, someone will ask me, ``hey, can I have the code for that cool thing?'', and without thinking, I'll attach that to my reply mail. Until I made it a habit to put the passwords into a separate file, I revealed my access codes more than once. Putting the passwords into a separate file also permits the file to be lightly encrypted, although anyone with access to both the code and the file can obviously decrypt the data trivially.
Understand that HTTP basicauth security transmits passwords in the
clear. Tools are readily available to sniff the network traffic on a
segment and display these passwords. If you care about security, be
sure you are using SSL (https:// URLs) for anything that needs to
be authenticated securely. Even if you think you're on a secure wire
in-house, consider the user at the WiFi access point at the local
coffee shop or bookstore, which is trivial to sniff.
User-uploadable HTML can trigger ``Cross Site Scripting'' attacks, permitting one bad guy to steal the credentials of other innocent victims visiting the same site.
User-uploadable HTML can also execute arbitrary code if the page is server-side-include parsed.
If you have a web application like a message system, chat room, or
guestbook, escape the HTML (using HTML::Entities), or control the
permitted HTML very carefully.
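To show what escaping accomplishes, here is a hand-rolled sketch that handles only the five characters special to HTML text and attribute values. The sub name is hypothetical, and this is for illustration: in real code, a maintained module such as HTML::Entities is the better choice.

```perl
use strict;
use warnings;

# Replace the five characters that are special in HTML text and
# attribute values with their entity equivalents.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ENTITY{$1}/g;
    return $text;
}

print escape_html(q{<script>alert("gotcha")</script>}), "\n";
```

Once escaped, the guestbook entry shows up as harmless text in the visitor's browser instead of being run as a script.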
Have a healthy distrust for scripts from amateur sources. The early web days made heroes out of the early adopters of Perl, and unfortunately, their legacy is usually a collection of poorly written Perl scripts using old and unaudited techniques. Look at the code, and if you can't understand it, or if you find it looks bad, then don't use it! Ask around if you must.
The multi-argument version of system or exec permits a child
process to be launched without a shell being involved. This is good
because nearly every non-alphanumeric character means something
special to most shells, and can trigger command invocations that you
didn't intend. For example:
system "gzip $somefile";
might do a lot more than invoke
gzip on a file, if that filename
contained backquotes, vertical bars, newlines, semicolons, and so on.
So use this instead:
system "gzip", $somefile;
Now that a shell can't be involved, the invocation is much more secure.
And just which gzip was being invoked in that last example? Think
about your PATH, including where it was set, and how it might
change. Watch out in particular for a trailing . (current
directory) in your path, or worse, a leading .!
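One defensive habit, along the lines of what the perlsec manpage suggests, is to set the path yourself before launching anything. In this sketch the directory list is an assumption (use whatever is right for your system), and the checking sub is a hypothetical helper.

```perl
use strict;
use warnings;

# Set a known-good search path before launching anything, and drop
# environment variables that can change how a shell behaves.
# The directories listed here are an assumption.
$ENV{PATH} = "/bin:/usr/bin";
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical check: counts PATH entries that are empty or ".",
# both of which mean "current directory".
sub path_has_cwd {
    my ($path) = @_;
    my @bad = grep { $_ eq "" || $_ eq "." } split /:/, $path, -1;
    return scalar @bad;
}

print path_has_cwd($ENV{PATH}) ? "risky PATH\n" : "PATH looks sane\n";
```

Note that an empty entry, such as the one produced by a stray leading or trailing colon, is treated the same as a dot.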
The require operator (on which use is built) looks at @INC to
determine the list of scanned directories. This list is built-in, but
can be affected by arbitrary user code and the setting of environment
variables such as
PERL5LIB. If a bad guy can control the list,
they can replace
strict.pm with their own code, and that'd be a bad
thing. Don't let that be an exploit!
Far too often, I've seen a child process used when it wasn't necessary. For example:
chomp(my $now = `date`);
system "rm", $somefile;
These are bad from a performance perspective, but also from a security view, as they can both be done ``in house'':
my $now = localtime;
unlink $somefile;
thus avoiding an expensive launch of a program. Yes, but which program?
The PATH exploit described above makes both of the child-process versions rather vulnerable to mistakes.
You should also continue to master Perl if you want to ensure security. For example, if alarm bells don't immediately go off in your head when you see something like this:
$input =~ /(\w+)/;
my $keyword = $1;
then you need to keep studying. The problem with this code is that
when the match fails,
$1 is left over from a previous match. This
kind of code can be used as a security exploit, if the attacker can
access the source code or have an idea that this is happening. It's
code that ``looks right'' but definitely isn't.
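One way to avoid the stale-capture trap is to take the captures from the match's return value instead of from $1. This is a sketch with a made-up sub name. A match in list context returns the captured strings on success and an empty list on failure, so a leftover value can't leak through.

```perl
use strict;
use warnings;

# Returns the first word of the input, or undef when there is none.
# A list-context match returns the captures on success and an empty
# list on failure, so a stale $1 can never be picked up.
sub first_keyword {
    my ($input) = @_;
    my ($keyword) = $input =~ /(\w+)/;
    return $keyword;
}

"previous" =~ /(\w+)/;    # leaves $1 set to "previous"

my $kw = first_keyword("!!!");
print defined $kw ? "got $kw\n" : "no keyword\n";
```

Testing the match in an if and using $1 only inside the true branch works just as well. The rule is simply never to touch $1 without knowing the match succeeded.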
I've just barely scratched the surface here. Learning about security is an ongoing process, especially since for every countermeasure, someone is working right now on a new exploit. For Perl security, start with the perlsec manpage included in the standard distribution.
If you've got the inclination, spend at least a month or two reading the BUGTRAQ, CERT, and RISKS mailing lists. If you thought that a virus or two a month was a lot, you might be surprised at the dozens of daily exploits listed on these mailing lists.
And finally, search the web for keywords like ``security FAQ'' and ``CGI security FAQ''. You'll see a lot of common themes in these documents, but I generally find one or two new interesting things every time I look. We can't all be security experts, but we're all responsible for security. Until next time, enjoy!