Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Web Techniques Column 54 (Oct 2000)
[suggested title: There can be only one ... more way to do it!]
I find it nice that, given my familiarity with Perl, I can solve those little ``emergency'' tasks without having to spend time flipping through a bunch of manuals. For example, I had a problem the other day that was causing horrible response time with my web server for www.stonehenge.com, and yet within a few minutes and a couple dozen lines of Perl code, I was able to get things back in order.
My webserver is on a nicely-configured Linux box co-located at an ISP with 24 by 7 reboot service (although the box is rarely rebooted, as you'll see). The box is actually shared with a dozen other e-commerce sites, and this is by design, because then when the box is down, it's not just me yelling at the admin, but a dozen others that are calling as well. Thus, we all have to play nice, because we're sharing the CPU and sharing the resources.
Well, one of the customers of this ISP is the regional sales office of a Very Large Company that has one of the largest market capitalizations in the world right now. (Why they don't run these applications on their corporate web site, I'm not sure, but I never asked. The usual answer is ``politics and local control'', so there's no point.) They apparently have some sort of free email newsletter that has subscribers counted in the mid-5-digits or so. A recent email newsletter (that went out over the weekend) basically said in effect ``we are terminating all mail subscriptions before the next issue unless you visit URL such-and-so and enter your renewal information''. In other words, if the subscriber didn't respond, they'd be dropped from the list.
Well, you can imagine the panic that this would generate on a Monday morning as thousands of people returned to work to discover that they might be removed from the mailing list. The given URL mapped to a CGI script, which was being invoked dozens of times simultaneously, so there were dozens of web-server processes (actually, both web-server and Perl process pairs). To make things worse, the first invocation of the Perl program gathered information about the subscriber, and then made a trip through DBI to a MySQL database, to present a confirmation form and opportunity to correct the subscription information. This form was then processed by a second invocation of the same CGI script, again reconnecting with MySQL, to update the information and finish the process.
I immediately began chatting with Doug, the author of the script and the manager of the box, to try to determine why the load average on a box that is typically under 0.5 had now gone to something like 15 or 20, making my site nearly useless. After determining as many of the facts as our IRC session would let us share, I quickly suggested that Doug move the script into an Apache::Registry area of his mod_perl-enabled server. At least this would prevent multiple compilations and forks, and probably could reuse the DBI handle as well. He was pretty adamant about not doing that, because he had written the code long before he knew about mod_perl, and thus had likely done things that were not very clean from Apache::Registry's perspective. Additionally, he felt investing the time to make it Apache::Registry-compliant would be wasted, since this script would eventually be moved to the client's machine which did not support mod_perl.
So, still watching the extremely high load average, I then suggested to him that he invoke ``the Highlander solution''. In the movie ``Highlander'', the catch-phrase is ``There can be only one!'', referring to the continual showdown amongst these immortal beings that would eventually kill all of each other off, leaving just one victor who would inherit ``the Prize''. Similarly, whenever exclusive access is needed to a resource, the word ``highlander'' is bandied about to mean an implementation solution or structure to control that resource.
Here, I was asking him to ensure through some locking mechanism that only one CGI invocation was being processed at a time. I had in mind a simple flock at the beginning of the script, opening up a sentinel file (no, not another television reference) and then requesting an exclusive file lock on that handle. The first script in would create the file, grab the exclusive lock, and then proceed on its merry way, releasing the lock at the end of the program when the handle was automatically (or explicitly) closed.
If a second script should be started while the first is active, it would open the same file, and then attempt to lock the filehandle. At this point, the operating system would block the second process, leaving it sitting around in a suspended state until the first process had completed. Third and subsequent invocations would likewise be blocked, but the operating system releases only one process at a time for the exclusive lock.
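The blocking approach described above might look like the following minimal sketch. The sentinel path and the lexical handle name are placeholders chosen for illustration; they are not taken from Doug's script.

```perl
use strict;
use warnings;
use Fcntl qw(LOCK_EX);
use File::Spec;

# Hypothetical sentinel path, for illustration only.
my $sentinel = File::Spec->catfile(File::Spec->tmpdir, "example.highlander");

# Append mode creates the file if it is missing and never truncates it.
open my $lock_fh, '>>', $sentinel
    or die "Cannot open $sentinel: $!";

# Blocking request: the call does not return until this process
# holds the exclusive lock.
flock($lock_fh, LOCK_EX)
    or die "Cannot lock $sentinel: $!";

# ... the expensive part of the script would run here ...

# Closing the handle releases the lock; the same thing happens
# automatically when the program exits.
close $lock_fh;
```

Every later arrival simply waits inside flock, which is exactly the queueing behavior that becomes a problem under heavy load.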
So, as I'm trying to describe this over IRC, it becomes clear that Doug is not up to the task, so I spend a few minutes whipping up the solution. It looked like 5 lines of Perl, until I thought about what to do when the system was very busy, such as right while I was trying to get this fixed.
Let's say there were 15 script invocations. The 15th invocation would be sitting in a queue behind 1 active and 13 pending other processes. If the delay were substantial, the web server aborts the CGI processing, causing some 5xx-series error indicating a server malfunction with no clue about why this is happening. I didn't consider that very friendly, so I kept moving forward with the next idea.
I changed strategy to perform a ``non-blocking'' exclusive file lock, in a retrying loop. A normal lock is ``blocking'', in that the operating system does not return from the operation until the type of lock requested is available. However, sometimes blocking isn't wanted, such as when having an alternate resource is satisfactory. Or in this case, when we want to simply see if we can get an exclusive lock, and if not, try to get it later.
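To make the difference concrete, here is a small sketch in which one process holds the lock while a child process asks for it in non-blocking mode. The helper name try_lock_once and the demo file name are invented for this example, and it assumes a Unix-like system where fork is available.

```perl
use strict;
use warnings;
use Fcntl qw(LOCK_EX LOCK_NB);
use File::Spec;

# Hypothetical helper: try once for an exclusive lock. Returns the locked
# handle on success, or undef right away if someone else holds the lock
# (a blocking flock would wait here instead).
sub try_lock_once {
    my ($path) = @_;
    open my $fh, '>>', $path or die "Cannot open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

# Demo file name is an assumption; $$ keeps it unique per run.
my $path = File::Spec->catfile(File::Spec->tmpdir, "highlander-demo.$$");

my $held = try_lock_once($path);    # nobody else is competing yet

# A child process competing for the same file is turned away immediately.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    exit(try_lock_once($path) ? 0 : 1);
}
waitpid $pid, 0;
my $child_refused = ($? >> 8) == 1;

print "child was ", ($child_refused ? "refused" : "granted"), " the lock\n";
unlink $path;
```

The child never waits: it gets its answer at once and is free to retry later or give up.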
So the loop I constructed tries to get an exclusive lock 10 times, sleeping one second between invocations. Each try is a ``this moment only'' deal. If the lock is available, we nab it and move on, knowing that we're now king of the hill. If not, we wait a while, or give up. Again, recall I was typing this in a hurry, trying to get something working. And this hastily written code is presented in [listing one, below].
Recall that this is just a snippet added to a larger script, so the normal #! line won't appear. Line 1 brings in the CGI.pm module, without any of the HTML-generating shortcuts. I left my normal :all parameter off the import list because I didn't want my change to collide with any of Doug's existing code. In hindsight, I could have switched to a temporary package like so:
    {
      package My::Highlander;
      use CGI qw/:all/;
      # rest of this snippet goes here
    }
And that way I could have avoided the use of the CGI::... construct later. Yeah, there's more than one way to do it alright.
Line 2 brings in two constants needed for the flock operator later. The Fcntl module (which I have not yet found an easy pronunciation for) defines many constants relating to file operations, and this is certainly appropriate here. I've been thoroughly chastised on public discussion areas for my use of literal numbers like 2 and 4 on flock in the past, so I want to make amends by doing it right.
Line 4 opens the sentinel file on the HIGHLANDER handle in append mode. The mode is mostly unimportant, except that we want to make sure the file is created if it doesn't exist. The filename needs to be in an area that is writable by the webserver userid, and /tmp is a safe bet. The CGI program was named renew.cgi, hence the name of the file relates to the name of the script. Death here will trigger a 500 error, but like I said, I was typing fast and furious to get this to work so I could get back to work.
Lines 8 through 21 form a loop, to be executed 10 times. Repetitions are controlled by the variable $count defined and initialized to 0 in line 7. Because $count is defined in a block started in line 6 (and ending in line 22), it cannot conflict with any other use of $count earlier or later in the program.
Line 9 attempts to obtain an exclusive lock on the file opened on the HIGHLANDER filehandle. The or'ing of the two values LOCK_EX and LOCK_NB (to get the number 6, but I'm cheating to know that) requests an exclusive lock, but in non-blocking mode. If the flock is successful, we get a true return value, and the last operator takes us out of the block started in line 8. If the flock fails, we drop through to line 10, which pauses the process for one second.
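As a small aside on the constants, the sketch below shows how the flags combine. It imports them through Fcntl's :flock tag rather than by name as the listing does, and it treats the numeric values as a platform detail rather than something to rely on.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN

# The flags are bit masks, so bitwise-or folds them into a single request:
# "exclusive" plus "do not wait".
my $mode = LOCK_EX | LOCK_NB;

# The numbers printed here can vary by platform; the names are portable.
printf "LOCK_EX=%d LOCK_NB=%d combined=%d\n", LOCK_EX, LOCK_NB, $mode;
```

Because | binds more tightly than the low-precedence and, the combined mode is computed first, flock is then called with it, and only then does the and decide whether to run last.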
Line 11 increments $count, and ensures that it is still below 10. If so, the redo operator pops back up to line 8, retrying the flock. If not, we've tried 10 times to flock (the first attempt plus nine redos; durn fencepost errors make that easy to miscount!), and it's time to report the error.
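For comparison, the same retry policy can be written with an explicit attempt counter, which makes the fencepost arithmetic easier to check. This is only a sketch: lock_with_retries and the demo file name are invented here, and it uses a lexical filehandle rather than the HIGHLANDER bareword from the listing.

```perl
use strict;
use warnings;
use Fcntl qw(LOCK_EX LOCK_NB);
use File::Spec;

# Hypothetical helper: make up to $attempts non-blocking tries for an
# exclusive lock, sleeping $delay seconds after each failed try.
# Returns the number of the attempt that succeeded, or 0 if none did.
sub lock_with_retries {
    my ($fh, $attempts, $delay) = @_;
    for my $try (1 .. $attempts) {
        return $try if flock($fh, LOCK_EX | LOCK_NB);
        sleep $delay;
    }
    return 0;
}

# Demo file name is an assumption; $$ keeps it unique per run.
my $path = File::Spec->catfile(File::Spec->tmpdir, "highlander-retry-demo.$$");
open my $fh, '>>', $path or die "Cannot open $path: $!";

my $won_on = lock_with_retries($fh, 10, 1);
print $won_on ? "locked on attempt $won_on\n" : "gave up\n";

close $fh;
unlink $path;
```

With nobody else holding the lock, the first attempt succeeds and no sleeping happens at all.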
Line 13 grabs the REMOTE_HOST environment variable, which will help us determine who indeed we are not serving this time. Since they have reverse DNS turned on under this server, we should be getting a nice domain name here of the host attempting to access this CGI (or at least the intermediate proxy).
However, under some circumstances, the reverse DNS fails or is not available. I couldn't remember if REMOTE_HOST contains a numeric dotted quad at that point (like 10.1.2.3) or whether it was undefined. So to be defensive in my programming (remember, I'm under the gun here), I simply used REMOTE_ADDR in line 14 if REMOTE_HOST was undefined. Probably in five more minutes of poking around, I could have determined that line 14 is probably unnecessary. But hey, it worked, and again, that was the important part.
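The fallback can also be pulled into a tiny helper, sketched below. The name client_name, the empty-string check, and the final 'unknown' default are additions for illustration; the listing itself only falls back from REMOTE_HOST to REMOTE_ADDR.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the best available label for the client.
# It takes the environment as a hash reference so that it can be
# exercised outside a web server.
sub client_name {
    my ($env) = @_;
    for my $key (qw(REMOTE_HOST REMOTE_ADDR)) {
        my $value = $env->{$key};
        return $value if defined $value && length $value;
    }
    return 'unknown';   # keeps the log line free of "uninitialized" warnings
}

print client_name({ REMOTE_HOST => 'proxy.example.com',
                    REMOTE_ADDR => '10.1.2.3' }), "\n";
print client_name({ REMOTE_ADDR => '10.1.2.3' }), "\n";
print client_name({}), "\n";
```

Inside a real CGI script the call would be client_name(\%ENV).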
Line 15 dumps an error message to the web server error log, presenting the program name (in $0), the current time of day (from localtime), and an indication that we failed due to a ``highlander abort''. I wanted the string to be distinct enough that we could easily detect how successful this highlander code was in deterring overloadings.
Lines 16 through 19 dump back the response for an abort. We print a CGI header with a status of 503, appropriately earmarked as ``service unavailable''. According to the specification, we can additionally send a ``retry after'' header along with this status response, from which compliant clients can determine how long to wait (measured in seconds) before the service is likely to be restored. Honestly, I don't know what the browsers on the market do with 503 errors, but I'm at least following the standard.
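For readers curious about what such a response looks like on its way to the web server, here is a rough sketch that assembles an equivalent one by hand, without CGI.pm. The helper name busy_response is made up, and the exact header capitalization that CGI.pm emits may differ from what is shown.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete "busy" CGI response.
# Header lines end in CRLF, and an empty line separates them from the body.
sub busy_response {
    my ($retry_seconds) = @_;
    return join "\r\n",
        "Status: 503 Service Unavailable",
        "Retry-After: $retry_seconds",
        "Content-Type: text/plain",
        "",
        "Our server is overloaded. Please try again in a few minutes.\n";
}

print busy_response(30);
```

The Status line is a CGI convention: the web server turns it into the real HTTP status line before passing the rest along to the client.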
Note that line 18 sends out a text/plain MIME type. Again, being lazy, I didn't want to write a full HTML page, so I took the quickest way out, letting me just type a line of text in line 19 without adding a lot of angley-brackety thingies.
Line 20 aborts the program, but with a nice exit status. Since we've ``handled'' the error, we don't want the web server to also go through its error trigger steps by exiting with a non-zero exit status.
And there it is. Whipped out in about 15 minutes, and installed immediately by Doug. But did it help?
It sure did. The load average shot down from the mid-20's to just around 2 or so, very tolerable. We both watched the error log, with tail -f to see how many people were getting turned away in relation to the customers being served, and found that 70% of them were getting through just fine, and because they weren't all trying to compete in parallel, they were actually getting done with minimal fuss. Perl saved the day!
So, the next time you have an expensive script burning up too much CPU, maybe you too need to utter in your best Sean Connery accent: ``There can be only one!'' Until next time, enjoy!
Listings
        =1=     use CGI;
        =2=     use Fcntl qw(LOCK_EX LOCK_NB);
        =3=     
        =4=     open HIGHLANDER, ">>/tmp/renew.cgi.highlander" or die "Cannot open highlander: $!";
        =5=     
        =6=     {
        =7=       my $count = 0;
        =8=       {
        =9=         flock HIGHLANDER, LOCK_EX | LOCK_NB and last;
        =10=        sleep 1;
        =11=        redo if ++$count < 10;
        =12=        ## couldn't get it after 10 seconds...
        =13=        my $host = $ENV{REMOTE_HOST};
        =14=        $host = $ENV{REMOTE_ADDR} unless defined $host;
        =15=        warn "$0 @ ".(localtime).": highlander abort for $host after 10 seconds\n";
        =16=        print CGI::header(-status => 503,
        =17=                          -retry_after => 30,
        =18=                          -type => 'text/plain'),
        =19=              "Our server is overloaded. Please try again in a few minutes.\n";
        =20=        exit 0;
        =21=      }
        =22=    }