Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Linux Magazine Column 14 (Jul 2000)
[Suggested title: Capturing those CGI errors as Email]
More and more web hosting services and ISPs are providing CGI space in addition to customer web pages, either as a free add-on, or an extra-cost service. And there are even a few free CGI servers out there on the net.
The problem with these services is that the (shared) web error log is often inaccessible, or at an unknown location. That's fine if your CGI program never commits an error, or if you are using the PSI::ESP module to determine the error text. But most of us will write ``blah blah or die blah'' in our CGI scripts, expecting to somehow be told what's wrong when it goes wrong.
Some have resorted to dumping the error message to the browser. In fact, during development, there's nothing wrong with adding:
    use CGI::Carp qw(fatalsToBrowser);
to your program, and doing the debugging right in your browser window. (For details, see the CGI::Carp documentation.)
But this is a huge security hole if left in production code. While surfing the world wild web, I often see error messages that reveal far too much information. I've seen program names, user IDs, languages used, pathnames to key files, and even the exact SQL query attempted dumped out in these errors. I have no right (or need) to know that, and a bad guy can use such valuable information to assist him in breaking into the system.
So, if we can't put it into the browser, and we can't get to the error log, where else can we put the errors? Why, in email of course!
All we need is a module (let's call it FatalsToEmail.pm) that we'll stick somewhere on the system (like /home/merlyn/lib), and pull in at the top of our CGI script, like so:
    use lib "/home/merlyn/lib";
    use FatalsToEmail qw(Address merlyn@stonehenge.com);
And then when the CGI script dies, the text of the error message gets sent to me, while the user is told that ``something went wrong''. Too cool? Yup, so read on.
The module source is in [listing one, below]. Line 1 sets up the package, important because we don't want any symbols to collide with the user of this module. Line 2 enables my favorite compiler restrictions, including requiring me to declare my variables, discouraging the use of symbolic references, and preventing barewords from being treated as quoted strings.
Lines 4 through 10 set up the four configuration variables for the module. The Address provides the email address to which the messages should go, here defaulting to webmaster at the mail host. Speaking of which, Mailhost sets up the mail delivery host. This doesn't have to be the final machine on which the mail ends up, but we'll need a friendly SMTP server somewhere that can handle mail from the script. The localhost default should be fine for most machines, except those that don't run a mailserver on the webserver.
The Cache and Seconds parameters interact to limit the amount of mail delivered. The default Cache value of undef gives the script the right to deliver a single separate piece of email for each fatal error. This is great for testing or for low-volume sites. But it'd be a potential ``denial of service'' attack for high-volume sites or malicious users.
So instead, we bunch up the rapidly appearing messages into a cache, guaranteed to be sent no more often than the indicated number of seconds. To get the bunching up, Cache must be set to a filename path that is writeable by the user ID executing the CGI program. A typical value might be something like /tmp/merlyn.weberrors.cache. (The actual caching strategy is defined below.) Several programs can share the same cache: the error messages within the cache are prefixed by the filename and line number from which they sprouted.
Lines 12 through 20 handle the configuration of the module. If the use line appears like:
    use FatalsToEmail qw(
      Address merlyn@stonehenge.com
      Cache /tmp/merlyn.weberrors.cache
    );
then we save Address and Cache to override the default values. The logic in line 15 ensures that someone can use cache or CACHE or even cAcHe for the identifier tag, and we'll still store it into the right hash slot.
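The key normalization and argument checking can be tried outside the module. What follows is a minimal sketch under stated assumptions, not the module itself: `normalize_key`, `configure`, and `%defaults` are names invented for illustration, and the cache path is a placeholder.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's default settings.
my %defaults = (
    Address  => "webmaster",
    Mailhost => "localhost",
    Cache    => undef,
    Seconds  => 60,
);

# Lowercase everything, then uppercase the first letter,
# so "cache", "CACHE", and "cAcHe" all become "Cache".
sub normalize_key {
    my $key = shift;
    return ucfirst lc $key;
}

# Walk a key/value list the way an import list would be walked.
sub configure {
    my %config = %defaults;
    while (@_) {
        my $key = normalize_key(shift);
        die "missing argument to $key\n" unless @_;
        die "unknown argument $key\n" unless exists $config{$key};
        $config{$key} = shift;
    }
    return %config;
}

my %config = configure(cAcHe => "/tmp/example.cache", SECONDS => 120);
print "$_ => ", (defined $config{$_} ? $config{$_} : "undef"), "\n"
    for sort keys %config;
```

Rejecting unknown keys early means a misspelled tag fails at compile time of the CGI script rather than silently leaving a default in place.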
Line 22 establishes the subroutine in this module as the die handler. From here on out, we're the ones that will get called on a fatal error.
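How a die handler receives the message can be seen in a few lines. This is a minimal sketch, assuming nothing beyond core Perl; the `@seen` array and the message text are made up for the demonstration, and the `eval` is there only so the script survives to print its results.

```perl
use strict;
use warnings;

my @seen;    # messages the handler has observed

# Install a handler; Perl calls it with the die message as its argument.
local $SIG{__DIE__} = sub {
    my $message = shift;
    push @seen, $message;
    # Dying again from inside the handler lets the abort continue,
    # with this new text replacing the original message.
    die "handled: $message";
};

# eval is used here only so the demonstration can keep running.
my $ok    = eval { die "disk full\n"; 1 };
my $error = $@;

print "handler saw: $seen[0]";
print "final error: $error";
```

Perl disables the handler while it is running, so the `die` inside it does not call the handler a second time.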
Lines 24 through 44 define this handler. The text message for the fatal error shows up in $message in line 25. Line 26 gets the current local time of day to label the message consistently. Line 27 extracts information about the filename and line number from which the error message was triggered.
Lines 29 through 31 prefix each line in the message with a unique identifier, consisting of the file name, line number, time of day, and process ID number. This is helpful to group error messages in cache-dumping email, as well as provide the necessary locators to fix the problem.
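The per-line prefixing is easy to try on its own. A minimal sketch, where the file name and line number are hypothetical stand-ins for what `caller` would report:

```perl
use strict;
use warnings;

# Hypothetical values standing in for what caller() would return.
my ($file, $line) = ("/home/merlyn/cgi/demo.cgi", 42);

# Scalar-context localtime gives a readable timestamp; $$ is the process ID.
my $prefix = localtime() . ":$$:$file:$line: ";

my $message = "first line\nsecond line\n";

# /m makes ^ match at the start of every line; /g repeats the substitution.
(my $tagged = $message) =~ s/^/$prefix/mg;
print $tagged;
```

Because every line carries the same prefix, a multi-line error stays recognizable as one unit even when several processes write into the same cache.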
Lines 33 to 39 dump a CGI response. Note that minimal information is provided.
If the CGI program has already sent an HTTP header, the header we print in line 34 will show up as content. There's nothing much I can do about that from this module, at least not in a CGI environment.
Then there's the cleanup. Line 41 triggers the email (or caching, if needed). And line 43 executes a die within the die handler. This step is needed so that Perl knows to finish aborting the program. The message shows up on STDERR, which will typically be the real web error log.
Lines 46 to 89 attempt to phone home with the error message. (Perhaps I should have called this subroutine ``e_t''?) Nearly everything is inside a large eval block so that any mistakes will still set up a graceful exit from this program. The message is captured and delivered in lines 86 to 88, including both the original message that wanted to be mailed, as well as the error that kept it from being mailed.
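The wrap-and-report pattern can be sketched in isolation. In this minimal example `deliver` is a hypothetical routine that always fails, so the error path is the one exercised; unlike the listing, the sketch chomps the inner error before embedding it.

```perl
use strict;
use warnings;

# Hypothetical delivery routine that always fails, to exercise the error path.
sub deliver { die "connection refused\n" }

sub send_or_report {
    my $message = shift;
    eval { deliver($message); 1 } and return 1;
    # Keep the original text and add the reason delivery failed.
    chomp(my $why = $@);
    die "$message(send_mail saw $why)\n";
}

my $result = eval { send_or_report("original error\n"); 1 };
my $report = $@;
print $report;
```

Whatever goes wrong during delivery, the original message is never lost: it travels along in the replacement error text.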
Lines 50 to 75 handle the cache, if needed, as determined by line 51. If we're caching, then line 52 attempts to open the cache file for both reading and writing. If that succeeds, it's time to operate.
Line 53 blocks the process until we're the only one using the file in this manner. We'll want to keep the time to a bare minimum from here until we close the filehandle, because we've just entered a zone where only one process at a time can be within.
Lines 55 to 62 handle the case where the cache is old enough for us to send. If the file modification time (``mtime'') is more than some number of seconds ago (determined by the Seconds configuration variable), then it's been a while since we wanted to send some email, and there may in fact be previous contents that we deferred. So lines 57 and 58 grab that. If there's more than 8K in the cache, we send only the first 8K and a warning. This keeps the mail message from becoming yet another denial of service attack: filling up our mail spool. At most, we'll get roughly 8K every 60 seconds (or whatever Seconds is configured as). The old cached messages are prepended to the current message in line 59.
Lines 60 and 61 remove the cached material, so that we have an empty file that has been modified just now, regardless of whether any prior content was in the file. This is important, because we want the next hit within the cache window to be deferred, repeatedly, until we have another idle period. And finally, line 62 closes the cache, also releasing the lock, since we now have the information we need.
Lines 64 to 67 handle the hits within the cache window, no more than Seconds number of seconds since the previous hit. In this case, we just seek to the end of the file, and dump the message there. We are guaranteed that the message will end in newline, but if I weren't sure, I'd add a \n here somewhere, to ensure that each error message has lines that start with the prefix identifier computed earlier. The return in line 68 skips over all the email handling code below, since we won't be sending any mail on this trip.
Lines 71 to 73 create an initial empty cache file if it doesn't exist. We'll treat a non-existing file as if it was an empty old file, which means it still needs to have a timestamp updated to ``right now''. Note the use of the ``append'' open operation: since we don't have a lock, we may be trying to create a file when there's already someone else in the meanwhile that has created the file, gotten a lock, and started writing in the file (it could happen!), so the best we can do is ask the kernel to ``make the file if it doesn't exist, or be ready to append to it if it does''. Which works here just as we needed it.
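The three cache cases described above (release an old cache, defer within the window, create a missing file) can be sketched as one function. This is a restatement under stated assumptions rather than the listing's code: `cache_or_release` and its parameters are invented names, it uses lexical filehandles and the `Fcntl` constants in place of the bare numbers, and the temporary directory exists only for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempdir);

# Returns the text to mail now, or undef if the message was deferred.
# $cache, $seconds, and $limit are illustrative parameters.
sub cache_or_release {
    my ($cache, $seconds, $message, $limit) = @_;
    $limit ||= 8192;

    if (open my $fh, "+<", $cache) {
        flock $fh, LOCK_EX or die "Cannot lock $cache: $!";
        if (time() - (stat $fh)[9] > $seconds) {
            # Idle long enough: collect deferred text and empty the file.
            my $buf = "";
            my $got = read $fh, $buf, $limit;
            $buf .= "\n...[truncated]...\n" if defined $got && $got >= $limit;
            seek $fh, 0, SEEK_SET;
            truncate $fh, 0;
            close $fh;
            return $buf . $message;
        }
        # Still inside the window: append and send nothing.
        seek $fh, 0, SEEK_END;
        print {$fh} $message;
        close $fh;
        return;
    }

    # No cache file yet: create it so its timestamp starts the window.
    open my $new, ">>", $cache or die "Cannot create $cache: $!";
    close $new;
    return $message;
}

my $dir   = tempdir(CLEANUP => 1);
my $cache = "$dir/errors.cache";

my $first  = cache_or_release($cache, 60, "error one\n");    # file created, mail now
my $second = cache_or_release($cache, 60, "error two\n");    # deferred into the cache
utime time() - 120, time() - 120, $cache;                    # pretend two minutes passed
my $third  = cache_or_release($cache, 60, "error three\n");  # cache released

print defined $_ ? $_ : "(deferred)\n" for $first, $second, $third;
```

The `utime` call only fakes the passage of time so that the release path can be exercised without waiting out the window.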
Now for the fun part. Lines 77 to 84 send the email message. First, we try to suck in the Net::SMTP module in line 77. This may not be possible, because the CPAN module may not be installed (it's part of the libnet bundle from Graham Barr, not part of the core installation). Because the require directive might fail, it's inside an eval. If the require succeeds, the value of 1 is returned, stopping our inner die operation; otherwise we'll abort. The die here is being caught by the outer eval. Wheeee.
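The guarded-require idiom works for any module. A minimal sketch, where `module_available` is an invented helper and the second module name is deliberately one that should not exist:

```perl
use strict;
use warnings;

# Try to load a module by name at run time; 1 on success, 0 otherwise.
sub module_available {
    my $module = shift;
    (my $path = "$module.pm") =~ s{::}{/}g;    # Net::SMTP -> Net/SMTP.pm
    # The trailing 1 makes the eval return true only if require did not die.
    return eval { require $path; 1 } ? 1 : 0;
}

my $have_core    = module_available("File::Spec");                   # ships with Perl
my $have_missing = module_available("No::Such::Module::Hopefully");  # invented name

print "File::Spec: $have_core, missing module: $have_missing\n";
```

The trailing `1` matters because the value `require` itself returns is not the thing being tested; only whether it died.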
Lines 78 and 79 set up a Net::SMTP connection object to the requested Mailhost. If there are any errors, I'm told the error will be in $@, not $!, so I include that here in the die message. Again, this die will be caught by the outer eval block.
Lines 80 and 81 tell the SMTP server what our sender name is, and what the recipient name is. The sender name will be used for error messages from the various mailers along the way, and the recipient name is the ultimate destination. Here, we're using the configured email address for both. This could get weird if the address is undeliverable: the final mail host will attempt to bounce the message back to the same address to which it is attempting delivery. Hmm. Not a good idea. But it's better than any alternative I could think of today.
Lines 82 and 83 provide a subject line and a body for the message. The subject line has nice bright shiny capital letters in it, including the name of the program that triggered the error for easy mail filtering by smart email readers. Note that most mail servers will also construct a Date and From and To header for us automatically, so I can lean on them to do the job.
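If a particular mail server turns out not to add those headers, one option is to spell them out in the text handed to the data step. This is an assumption layered on top of the listing, not something it does; `compose` is an invented helper and the address is a placeholder.

```perl
use strict;
use warnings;

# Compose header lines and body; a blank line separates the two.
sub compose {
    my ($address, $program, $body) = @_;
    my @headers = (
        "To: $address",
        "From: $address",
        "Subject: CGI FATAL ERROR in $program",
    );
    return join("\n", @headers) . "\n\n" . $body;
}

my $text = compose('webmaster@example.com', "/cgi-bin/demo.cgi", "something broke\n");
print $text;
```

The resulting string could take the place of the two arguments passed in line 82 of the listing.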
And finally, line 84 tells the mail server we're done for this round, and shuts down the connection.
That's it. Put FatalsToEmail.pm some place accessible to your CGI script, add the appropriate use lib line to point at the directory, and you can start getting your errors via a timely email message instead of having to scruff through the old shared web error log.
Until next time, enjoy!
Listings
    =1=     package FatalsToEmail;
    =2=     use strict;
    =3=
    =4=     my %config =
    =5=       (
    =6=        Address => "webmaster",     # email address
    =7=        Mailhost => "localhost",    # mail server
    =8=        Cache => undef,             # undef means don't use
    =9=        Seconds => 60,
    =10=      );
    =11=
    =12=    sub import {
    =13=      my $package = shift;
    =14=      while (@_) {
    =15=        my $key = ucfirst lc shift;
    =16=        die "missing argument to $key" unless @_;
    =17=        die "unknown argument $key" unless exists $config{$key};
    =18=        $config{$key} = shift;
    =19=      }
    =20=    }
    =21=
    =22=    $SIG{__DIE__} = \&trapper;
    =23=
    =24=    sub trapper {
    =25=      my $message = shift;
    =26=      my $time = localtime;
    =27=      my ($pack, $file, $line) = caller;
    =28=
    =29=      my $prefix = localtime;
    =30=      $prefix .= ":$$:$file:$line: ";
    =31=      $message =~ s/^/$prefix/mig;
    =32=
    =33=      print STDOUT <<END;
    =34=    Content-Type: text/html
    =35=
    =36=    <h1>Sorry!</h1>
    =37=    <p>An error has occurred; details have been logged.
    =38=    Please try your request again later.
    =39=    END
    =40=
    =41=      send_mail($message);
    =42=
    =43=      die "${prefix}died - email sent to $config{Address} via $config{Mailhost}\n";
    =44=    }
    =45=
    =46=    sub send_mail {
    =47=      my $message = shift;
    =48=
    =49=      eval {
    =50=        ## do I need to cache this?
    =51=        if (defined (my $cache = $config{Cache})) {
    =52=          if (open CACHE, "+<$cache") {
    =53=            flock CACHE, 2;
    =54=            ## it's mine, see if it's old enough
    =55=            if (time - (stat(CACHE))[9] > $config{Seconds}) {
    =56=              ## yes, suck any content, and zero the file
    =57=              my $buf;
    =58=              $buf .= "\n...[truncated]...\n" if read(CACHE, $buf, 8192) >= 8192;
    =59=              $message = $buf . $message;
    =60=              seek CACHE, 0, 0;
    =61=              truncate CACHE, 0;
    =62=              close CACHE;
    =63=            } else {
    =64=              ## no, so just drop the stuff at the end
    =65=              seek CACHE, 0, 2;
    =66=              print CACHE $message;
    =67=              close CACHE;
    =68=              return;
    =69=            }
    =70=          } else {
    =71=            ## it doesn't exist, so create an empty file for stamping, and email
    =72=            open CACHE, ">>$cache" or die "Cannot create $cache: $!";
    =73=            close CACHE;
    =74=          }
    =75=        }
    =76=
    =77=        eval { require Net::SMTP; 1 } or die "no Net::SMTP";
    =78=        my $mail = Net::SMTP->new($config{Mailhost})
    =79=          or die "Net::SMTP->new returned $@";
    =80=        $mail->mail($config{Address}) or die "from: $@";
    =81=        $mail->to($config{Address}) or die "to: $@";
    =82=        $mail->data("Subject: CGI FATAL ERROR in $0\n\n", $message)
    =83=          or die "data: $@";
    =84=        $mail->quit or die "quit: $@";
    =85=      };
    =86=      if ($@) {
    =87=        die "$message(send_mail saw $@)\n";
    =88=      }
    =89=    }