Copyright Notice
This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Linux Magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Linux Magazine Column 17 (Oct 2000)
[Suggested title: Throttling your web server]
The webserver for www.stonehenge.com is a nicely configured Linux box (of course) located at a nice co-location facility and maintained by my ISP. I share the box with a dozen other e-commerce clients (mostly because I've been too lazy to move the server to a new solitary box), and that keeps me and everyone else on our toes about overloading the server, because we all have to share.
I bought a digital camera some large number of months ago, and started putting nearly every picture I took up on the site. I've got a nice mod_perl picture handler to show the thumbnails, provide the navigation, and even generate half-size images on the fly using PerlMagick.
However, as I put more and more pictures online, I started to notice some pretty creepy CPU loads from time to time. Worse than that, my ISP neighbors were also starting to complain. After investigation, I determined that I was getting hit by not-so-nice ``spiders'': web programs that recursively (and rapidly) fetch the contents of many pages given a few starting points. I believe most of these to be people on fast data connections (like my current cable modem that brings the equivalent of 2 T-1's into my house for $40 per month, yes!) innocently asking their web browser to download a whole area.
So, rather than pull my pictures offline, I decided to implement throttling. I didn't care so much about transfer bandwidth as I did about CPU, so I chose to track recent CPU activity for each visitor. Of course, HTTP has no concept of a ``session'', so I took a very easy shortcut: tracking by IP address. Yes, I know, I've ranted in discussion forums a lot about how an IP address is not a user. But for the purpose of throttling, it seemed the most expedient choice.
Once I put my throttler in place, no IP address is allowed to suck
more than 7% of my CPU over a period of 15 seconds. Once the CPU
threshold is reached, any additional request is met with a 503
error (service unavailable), which according to RFC2616 (the HTTP/1.1
specification) also allows me to give a ``retry after'' value of 15
seconds to advise the program that this was a temporary condition.
The throttler consists of two related mod_perl handlers: an ``access'' handler to note whether or not the IP address is currently permitted, and a ``log'' handler to track the CPU used by the transfer. Additionally, there's an external program triggered by cron to clean up the status files needed by the handlers.
So, let's take a look at the handlers in [LISTING ONE, below].
Line 1 puts the module into Stonehenge::Throttle. I use Stonehenge as a private prefix for all my local mod_perl goodies, to keep it separate from any CPAN-installed modules. Because mod_perl shares the namespace across all modules, it's very important to have a workable naming allocation to keep things from colliding.
Line 2 selects the critically important compiler restrictions. Designing code for mod_perl handlers requires careful attention to details, and the use strict restrictions are a good start to that.
Line 4 reminds me that this module needs to be installed as a PerlAccessHandler by giving the appropriate syntax. I have it selected at the top-level configuration file of my site, but if I had wanted it only for the pictures directory, I could have put the access handler inside a Directory or Files restriction, or even an .htaccess file in a subdirectory.
Lines 6 through 9 define some configuration constants. Line 6 is a directory that must be writable by the web userid (in my case, nobody). This directory will hold the historical information about CPU usage.
Line 8 defines the seconds in which we compute CPU history. If we make this too large, the throttling will be slow to react. If we make it too small, it'll be a knee-jerk reaction. I've tweaked this number up and down from time to time, but the current number is 15 as shown here. Line 9 defines how much CPU a particular IP address is allowed to consume, in percent, over the period of time given by $WINDOW. I found the 7 percent solution to be appropriate.
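To make those two numbers concrete, here is a small standalone sketch of the budget they imply. The variable names mirror the listing, but the snippet is only an illustration and is not part of the module.

```perl
use strict;
use warnings;

my $WINDOW              = 15;   # seconds of interest, as in the listing
my $DECLINE_CPU_PERCENT = 7;    # percent of CPU allowed within that window

# CPU use is recorded in hundredths of a second, so the budget in those
# units is simply seconds times percent.
my $budget_hundredths = $WINDOW * $DECLINE_CPU_PERCENT;
my $budget_seconds    = $budget_hundredths / 100;

print "budget: $budget_hundredths hundredths ($budget_seconds CPU seconds) per $WINDOW seconds\n";
```

In other words, a single host gets a little over one CPU second in any 15-second stretch before the block kicks in.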
Lines 11 and 12 define a version string, which can be queried using the mod_perl maintenance tools, as well as being in the right format should I ever get around to submitting this to CPAN. The string comes from an RCS keyword, so I just check the file out and in and get the right version number automatically.
Lines 14 through 16 pull in some standard constants and modules from the mod_perl interface.
Line 18 begins the handler called on each requested transfer. Line 19 is commented out, but when enabled, uses my Stonehenge::Reload module to automatically reload this module whenever it changes. Since I'm pretty happy with the stability of this module, I've commented the line out. (Stonehenge::Reload hasn't been published, even though I've now referred to it in a few of my other published works. Perhaps someday soon I should talk about it, I suppose.)
Line 21 fetches the incoming request. This will be an Apache::Request object, as defined by the mod_perl interface.
Line 22 ignores any request that is not the initial request from the client. This keeps internal subrequests (like a lookup to get the MIME type for a directory index) from accidentally triggering the throttler. Line 23 grabs a log object for later use.
Lines 25 to 28 get the hostname of the remote client, and perform some slight massaging. If the hostname belongs to my ISP, it means I'm performing some request directly, and I sure don't want to be throttling myself. Also, I decided that all Google fetches should be charged to the same host, even though they appear to be coming from different hosts. Yes, I throttle even Google if it gets too sucky on my pages.
Lines 30 through 33 set up a few variables that will be needed for both this handler, and the ``log'' handler that will be set up later. We'll note the filename of the CPU history file, the flagfile indicating the host is currently blocked, and the current CPU usage for both this process and its children.
Lines 35 through 90 ``push'' a log handler. This technique allows one handler phase to create a handler for another phase ``on the fly''. More importantly, it allows me to share the values of some of the variables into the later phase.
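The variable sharing is ordinary Perl closure behavior, and it can be seen without a web server. The following is only a sketch: @pushed_log_handlers and access_phase are hypothetical stand-ins for what the server does with pushed handlers, and the path is made up and never opened.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the server's list of handlers pushed for the log phase.
my @pushed_log_handlers;

sub access_phase {
    my ($host) = @_;
    my $historyfile = "/tmp/throttle-demo/$host-times";   # made-up path, never opened

    # The anonymous sub closes over $host and $historyfile, so the later
    # phase sees exactly the values computed here.
    push @pushed_log_handlers, sub {
        return "log phase for $host would append to $historyfile";
    };
    return;
}

access_phase("192.0.2.7");                            # earlier phase runs first
my @messages = map { $_->() } @pushed_log_handlers;   # later phase runs the closures
print "$_\n" for @messages;
```

Each call to access_phase creates fresh lexicals, so closures pushed for different requests never see each other's values.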
Line 40 subtracts the previous value of the output of the times operator (saved earlier in line 32) from its current value. Lines 41 to 43 compute the sum total of CPU used, and round it off to the nearest hundredth of a second. Line 44 posts a notice in the error log, which I used for debugging, but have commented out now.
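The delta-and-round step can be tried on its own. This sketch reuses the idiom from the listing; the busy loop is only an assumption added here so that there is some CPU time to measure.

```perl
use strict;
use warnings;

my @delta_times = times;   # (user, system, child user, child system)

# Burn a little CPU so there is something to measure.
my $x = 0;
$x += sqrt($_) for 1 .. 200_000;

# Same idiom as the listing: each new value minus the matching earlier value.
@delta_times = map { $_ - shift @delta_times } times;

my $cpu = 0;
$cpu += $_ for @delta_times;

# Adding 0.005 before scaling and truncating rounds to the nearest hundredth.
my $cpu_hundred = int 100 * ($cpu + 0.005);

print "used $cpu_hundred/100 CPU seconds\n";
```

The exact figure depends on the machine, so no output is shown here.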
Lines 45 to 48 add this CPU usage as an eight-byte value to the end of a history file. The first four bytes define the timestamp second at which the observation is being taken, and the last four bytes are the CPU seconds in units of hundredths of a second. The advantage of this format is that it's very easy to go back from that to a value (no decimal conversion) and an append will always be atomic, so there's no need to flock the file!
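As a rough illustration of that record format, the sketch below appends two records to a scratch file and reads the first one back. The timestamps and CPU values are invented for the example, and File::Temp supplies the scratch file instead of a real history directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $historyfile) = tempfile(UNLINK => 1);   # scratch file for the demo
close $tmp;

# Append two made-up observations: [timestamp, CPU in hundredths of a second].
for my $record ([1_000_000_000, 12], [1_000_000_005, 7]) {
    open my $out, '>>', $historyfile or die "Cannot append to $historyfile: $!";
    binmode $out;
    syswrite $out, pack("LL", @$record);   # two 32-bit unsigned values, 8 bytes
    close $out;
}

my $size = -s $historyfile;   # two records of eight bytes each

open my $in, '<', $historyfile or die "Cannot read $historyfile: $!";
binmode $in;
read $in, my $buf, 8;
my ($time, $cpu) = unpack "LL", $buf;
close $in;

print "file is $size bytes; first record: time=$time cpu=$cpu\n";
```

Because every record has the same length, a reader can step through the file eight bytes at a time without any parsing.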
The rest of the log handler determines whether future requests should be blocked or not. First, line 50 defines the beginning of the window of interest. If there's already a current blockfile, lines 52 through 59 note that and exit the log handler, so we don't even have to think very hard.
Lines 62 to 70 walk the history file, grabbing each eight-byte string as a separate entry, converting it back to the timestamp and CPU used. For all the entries that occur within the window, we'll figure a total CPU. Older entries are ignored.
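Here is a minimal sketch of that walk, using an in-memory filehandle in place of a real history file so the result is repeatable. The fixed value of $now and the three records are assumptions for the example only.

```perl
use strict;
use warnings;

my $WINDOW = 15;
my $now    = 1_000_000_100;   # pretend "current time" so the result is repeatable

# Three made-up records: one too old, two inside the window.
my $history = join '', map { pack "LL", @$_ }
    [$now - 40, 50], [$now - 10, 30], [$now - 2, 45];

my $startwindow = $now - $WINDOW;
my $totalcpu    = 0;          # scaled by 100, as in the listing

open my $fh, '<', \$history or die "Cannot open in-memory file: $!";
binmode $fh;
while ((read $fh, my $buf, 8) > 0) {
    my ($time, $cpu) = unpack "LL", $buf;
    next if $time < $startwindow;   # older than the window: ignore
    $totalcpu += $cpu;
}
close $fh;

print "total CPU in window: $totalcpu/100 seconds\n";
```

Only the last two records count, giving 75 hundredths, which is under the 105 allowed by a 15-second window at 7 percent.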
Lines 72 to 76 determine if the CPU is below the throttling percentage, and if so, remove any blockfile that may be present, thus letting future transactions proceed unthrottled (until the CPU is overused again).
But if we make it to line 78, we've got an IP address out there that has exceeded our threshold. Lines 79 to 81 grab the load average for logging purposes only. Line 83 likewise grabs the user agent for the log. (I've used this to determine if I should categorically deny bad user agents based on name rather than action.) And line 86, well, 86's them from the establishment by creating an empty blockfile. (The presence or absence of the blockfile is all that matters to the access handler.)
So, that's it for the log handler. Back in the access handler starting in line 94, we look for the blockfile that the log handler manages. If it's there, and new enough, we're blocking. Line 97 adds a clue for the client that we do indeed want them to come back, but just not right away. Line 98 triggers the 503 error and aborts any further access within this transfer.
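The freshness test on the blockfile can be sketched outside the server as well. In this example is_blocking is a hypothetical helper, not a function from the module, and utime is used to fake a recent and an old modification time on a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $WINDOW = 15;

# Hypothetical helper: true while the blockfile's mtime is inside the window.
sub is_blocking {
    my ($blockfile, $now) = @_;
    my @stat = stat($blockfile) or return 0;   # no file means no block
    return $stat[9] > $now - $WINDOW ? 1 : 0;
}

my ($tmp, $blockfile) = tempfile(UNLINK => 1);
close $tmp;
my $now = time;

utime $now, $now, $blockfile or die "Cannot set times: $!";
my $fresh = is_blocking($blockfile, $now);              # just created: blocking

utime $now - 60, $now - 60, $blockfile or die "Cannot set times: $!";
my $stale = is_blocking($blockfile, $now);              # a minute old: ignored

my $missing = is_blocking("$blockfile.absent", $now);   # never created

print "fresh=$fresh stale=$stale missing=$missing\n";
```

A stale blockfile therefore stops blocking by itself once the window has passed, even before anything removes it.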
And that's the mod_perl side of things. But now we have these neat little CPU history files being created in $HISTORYDIR, and there's nothing in either handler to clean them up. And I can't add anything there, because the only time the file should be removed is when there's nothing happening, but the only time I'm in a handler is when something is happening!
So, there's a little program invoked from cron on a regular basis, using a crontab entry similar to:
3-59/10 * * * * /home/merlyn/lib/Apache/throttle-cleaner
which invokes the program I present in [LISTING TWO, below] every 10 minutes on minutes that end on 3 (3, 13, 23, etc). I try to invoke my cron stuff on unlikely minutes to avoid crowding with all those lusers that use precise multiples of 5 or 15. Bleh.
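For anyone unsure how the minute field 3-59/10 expands, this tiny sketch walks the range by hand. It only mimics the range-with-step rule and does not parse crontab syntax.

```perl
use strict;
use warnings;

# Expand the minute field "3-59/10": start at 3, step by 10, stop at 59.
my @minutes;
for (my $m = 3; $m <= 59; $m += 10) {
    push @minutes, $m;
}

print "@minutes\n";   # prints "3 13 23 33 43 53"
```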
Because this is a standalone program, we've got the ``sh-bang'' line, with warnings turned on in line 1. Line 2 is the normal compiler restrictions.
Line 6 defines the same directory as the $Stonehenge::Throttle::HISTORYDIR, so if I change one, I need to change the other. It won't help to delete files that aren't in the same place. Line 7 similarly needs to be at least twice as large as the throttling window.
Lines 9 through 17 skip through the directory, looking for any file that has not been accessed in at least $SECS. For blocking files, this means that we've not seen a transaction since the blocking started. (Good, they went away permanently.) For history files, it means that we've not seen a transaction recently. In either case, the information is no longer of use, so we can destroy the file (in line 16).
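To try the cleanup rule safely, the sketch below runs the same access-time test against a throwaway directory. The file names use made-up hosts, and File::Temp and utime are used here only to stage one stale and one fresh file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $SECS = 360;
my $dir  = tempdir(CLEANUP => 1);   # throwaway directory for the demo
my $now  = time;

# Stage one stale and one fresh file, named after made-up hosts.
for my $spec (["old.example.com-times", $now - 1000],
              ["new.example.com-times", $now]) {
    my ($name, $stamp) = @$spec;
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
    utime $stamp, $stamp, "$dir/$name" or die "Cannot set times on $name: $!";
}

# Same test as the cleaner: remove plain files not accessed within $SECS.
my $when = $now - $SECS;
opendir my $dh, $dir or die "Cannot opendir $dir: $!";
for my $name (readdir $dh) {
    my $path = "$dir/$name";
    next unless -f $path;
    next if (stat($path))[8] > $when;   # accessed recently: keep it
    unlink $path or warn "Cannot unlink $path: $!";
}
closedir $dh;

opendir $dh, $dir or die "Cannot opendir $dir: $!";
my @left = sort grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

print "remaining: @left\n";
```

Only the recently accessed file survives the pass.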
And there you have it: a mechanism to keep people from making your ISP-neighbors mad at you. As a testimony to its value, I recently got ``slashdotted'' by having my pictures archive for ``YAPC 19100'' mentioned on www.slashdot.org. My hits per hour went to 20 times their normal pace for about 36 hours after the mention, and yet the load average never got above 1 or 2 during the entire ordeal. So, I've now survived a slashdot attack.
Another success story comes from one of my clients: a Very Large on-line toys and games e-tailer. They told me that they had seen an earlier version of my throttler mentioned on the mod_perl mailing list, and had put it in place (with some modifications) during the past Christmas buying rush. And amazingly enough, it caught many attempts by people accidentally or deliberately attempting to download their entire online catalog for offline browsing: something that would be both useless and prohibitively expensive. Without the throttle, they might have lost literally millions of dollars. They did in fact buy me dinner for that. Thank you.
I'm interested to hear how this kind of code saved your bacon, so if you adapt it, let me know. Until next time, enjoy!
Listings
=0=     ################ LISTING ONE ################
=1=     package Stonehenge::Throttle;
=2=     use strict;
=3=
=4=     ## usage: PerlAccessHandler Stonehenge::Throttle
=5=
=6=     my $HISTORYDIR = "/home/merlyn/lib/Apache/Throttle";
=7=
=8=     my $WINDOW = 15;                # seconds of interest
=9=     my $DECLINE_CPU_PERCENT = 7;    # CPU percent in window before we 503 error
=10=
=11=    use vars qw($VERSION);
=12=    $VERSION = (qw$Revision$ )[-1];
=13=
=14=    use Apache::Constants qw(OK DECLINED);
=15=    use Apache::File;
=16=    use Apache::Log;
=17=
=18=    sub handler {
=19=      ## use Stonehenge::Reload; goto &handler if Stonehenge::Reload->reload_me;
=20=
=21=      my $r = shift;                # closure var
=22=      return DECLINED unless $r->is_initial_req;
=23=      my $log = $r->server->log;    # closure var
=24=
=25=      my $host = $r->get_remote_host; # closure var
=26=      return DECLINED if $host =~ /\.(holdit|stonehenge)\.com$/;
=27=      return DECLINED if $host =~ /\.metronomicon\.com$/; # poor purl
=28=      $host = "googlebot.com" if $host =~ /\.googlebot\.com$/;
=29=
=30=      my $historyfile = "$HISTORYDIR/$host-times"; # closure var
=31=      my $blockfile = "$HISTORYDIR/$host-blocked"; # closure var
=32=      my @delta_times = times;      # closure var
=33=      my $fh = Apache::File->new;   # closure var
=34=
=35=      $r->push_handlers
=36=        (PerlLogHandler =>
=37=         sub {
=38=
=39=           ## record this CPU usage
=40=           @delta_times = map { $_ - shift @delta_times } times;
=41=           my $cpu_hundred = 0;
=42=           $cpu_hundred += $_ for @delta_times;
=43=           $cpu_hundred = int 100*($cpu_hundred + 0.005);
=44=           ## $log->notice("throttle: $host got $cpu_hundred/100 in this slot"); # DEBUG
=45=           open $fh, ">>$historyfile" or return DECLINED;
=46=           my $time = time;
=47=           syswrite $fh, pack "LL", $time, $cpu_hundred;
=48=           close $fh;
=49=
=50=           my $startwindow = $time - $WINDOW;
=51=
=52=           if (my @stat = stat($blockfile)) {
=53=             if ($stat[9] > $startwindow) {
=54=               ## $log->notice("throttle: $blockfile is already blocking"); # DEBUG
=55=               return OK;           # nothing further to see... move along
=56=             } else {
=57=               ## $log->notice("throttle: $blockfile is old, ignoring"); # DEBUG
=58=             }
=59=           }
=60=
=61=           # figure out if we should be blocking
=62=           my $totalcpu = 0;        # scaled by 100
=63=
=64=           open $fh, $historyfile or return DECLINED;
=65=           while ((read $fh, my $buf, 8) > 0) {
=66=             my ($time, $cpu) = unpack "LL", $buf;
=67=             next if $time < $startwindow;
=68=             $totalcpu += $cpu;
=69=           }
=70=           close $fh;
=71=
=72=           if ($totalcpu < $WINDOW * $DECLINE_CPU_PERCENT) {
=73=             ## $log->notice("throttle: $host got $totalcpu/100 CPU in $WINDOW secs"); # DEBUG
=74=             unlink $blockfile;
=75=             return OK;
=76=           }
=77=
=78=           ## about to be nasty... let's see how bad it is:
=79=           open $fh, "/proc/loadavg";
=80=           chomp(my $loadavg = <$fh>);
=81=           close $fh;
=82=
=83=           my $useragent = $r->header_in('User-Agent') || "unknown";
=84=
=85=           $log->notice("throttle: $host got $totalcpu/100 CPU in $WINDOW secs, enabling block [loadavg $loadavg, agent $useragent]");
=86=           open $fh, ">$blockfile";
=87=           close $fh;
=88=
=89=           return OK;
=90=         });
=91=
=92=      ## back in the access handler:
=93=
=94=      if (my @stat = stat($blockfile)) {
=95=        if ($stat[9] > time - $WINDOW) {
=96=          $log->warn("throttle access: $blockfile is blocking");
=97=          $r->header_out("Retry-After", $WINDOW);
=98=          return 503;               # Service Unavailable
=99=        } else {
=100=         ## $log->notice("throttle access: $blockfile is old, ignoring"); # DEBUG
=101=         return DECLINED;
=102=       }
=103=     }
=104=
=105=     return DECLINED;
=106=   }
=107=   1;

=0=     ################ LISTING TWO ################
=1=     #!/usr/bin/perl -w
=2=     use strict;
=3=
=4=     # $Id$
=5=
=6=     my $DIR = "/home/merlyn/lib/Apache/Throttle";
=7=     my $SECS = 360;                 # more than Stonehenge::Throttle $WINDOW
=8=
=9=     chdir $DIR or die "Cannot chdir $DIR: $!";
=10=    opendir DOT, "." or die "Cannot opendir .: $!";
=11=    my $when = time - $SECS;
=12=    while (my $name = readdir DOT) {
=13=      next unless -f $name;
=14=      next if (stat($name))[8] > $when;
=15=      ## warn "unlinking $name\n";
=16=      unlink $name;
=17=    }