Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.This text has appeared in an edited form in Perl Journal magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
![]() |
Download this listing! | ![]() |
![]() |
![]() |
Perl Journal Column 01 (May 2003)
[suggested title: 'Watching log files']
One of the most boring tasks I perform as a system administrator is watching a log file. It's like watching a pot of stew boil. Or ice melt.
But watching a log file is also a great way to get a quick view of the
activity on the system. For example, watching an Apache access log is
great to see how ``bursty'' the traffic can be. But unless you're
staring at the tail -f
on the log file constantly, or you can pick
out the timestamps of each logged line and do the math in your head
quickly, you can't really tell which lines were ``together'' in a
particular burst.
So, one day recently I got the idea for a ``web tail'' program. A browser would fetch data from a custom web server, which would be watching the log file, noting the arrival time of each line. The display would show the last, say, 16 lines of log, but colorized in such a way that I could see the age of each line at a glance. That way, lines that arrived at similar times would have a similar color coding.
And then I stumbled across two pieces of the POE
(Perl Object
Environment) package that would make writing such a program easier.
POE is a great collection of tools to manage events and ``threaded''
Perl code, good for servers, clients and event-based glue. POE is
described in more detail at http://poe.perl.org
.
And the result is in [listing one, below].
Line 1 is the standard hash-bang line for Perl on my system. Obviously, this will need to be changed depending on where Perl is installed.
Line 2 turns on the standard restrictions: symbolic references are
disabled, implicit package variables are disabled (forcing most
variables to be declared with my
), and ``bare words'' are disabled.
This is always a good thing for programs exceeding a dozen lines or
so.
Lines 5 through 8 are the tweakable parts of this script. The
$FILENAME
gives the file to be followed. In this case, it's the
Apache access log for my web-server front-end reverse proxy. As I was
testing this script, I found that to be a nice source of bursty data.
The $TAILING
variable controls how many lines of the logfile are
visible. Too little, and we lose context. Too much, and we scroll
too often.
The $URGENT
and $OLD
variables both define the number of
seconds. Data that has arrived within $URGENT
seconds is always
pure white on the output. As it passes that threshold, it gets light
green through dark green, maximizing the darkest pure green at
$URGENT
plus $OLD
seconds. In this case, I'm letting it fade to
dark green in about a minute. Again, the values will depend on the
data being watched. I get about 20,000 hits a day on the web server,
so these particular settings showed lots of variations between
everything being all white, and all green. At a glance, I could
quickly tell how busy the recent traffic was being.
Line 11 defines the rolling data buffer, holding the most recent
$TAILING
events. Items are pushed onto the end, and shifted off
the front as needed. Each element of the array is a two-element array
ref, holding the Unix epoch-based integer time value when the data was
added, and the line itself (without a newline, apparently).
Line 12 pulls in the HTML-generating shortcuts from the core CGI
module. Although we actually aren't in any kind of CGI mode here, I
find the HTML shortcuts to be easier than typing a lot of angle
brackets myself.
Line 13 brings in the POE
pieces. Arguments passed to use POE
act like I had said:
use POE; use POE::Component::Server::HTTP; use POE::Wheel::FollowTail;
That's a convenient convention with POE
's non-standard import list.
Lines 15 through 54 create the web-server side of the POE process,
using the POE::Component::Server::HTTP
module (often referenced as
PoCo::HTTPD
in brief). I lifted this example almost directly
from the POE Cookbook, available in the Wiki at http://poe.perl.org
.
Line 17 defines the port number of the web server. By default, the web server listens on all IP addresses at this port number. I picked 42042 as an easy number to remember. You'd definitely want to check to ensure that such a port was not already in use by another application.
Lines 18 to 53 define the only ContentHandler
for this server. Any
request to any URL below slash (therefore, any request) gets
directed to this handler. The two parameters to the subroutine are
the request and response objects. The handler is responsible for
examining the request object, then updating the response object as
needed, and then returning the appropriate HTTP status code. Because
we are returning the same contents regardless of the request, no
checking of the request is performed.
Line 22 sets the response to the HTTP OK
value of 200. Line 23
sets the content type of the response to text/html
, indicating a
typical HTML response.
Lines 24 to 50 generate the content for the response, using the
CGI.pm
's HTML shortcuts. The join
in line 25 turns the many
pieces into one string. I wasn't sure if I needed to do that by
looking at the manpage for HTTP::Message
, but I knew it couldn't
hurt.
Lines 26 to 30 generate the HTML head and title section, including the
very important ``meta-refresh''. This refresh will cause most modern
browsers to reload the data repeatedly every five seconds. The
#bottom
anchor will ensure that most browsers also scroll to the
bottom anchor (defined in line 49). The background of the window is
set to black, which seemed to work out best during testing.
Lines 31 to 48 generate the bulk of the output: a single table. Each
row of the table is one of the $TAILING
lines, and consists of two
table cells. The left cell is the timestamp (in the localtime of the
web server) in a nice cyan lettering, while the right cell is the log
file line.
The log line is in Courier font, selected explicitly. During testing,
I tried putting it inside a pre
element, but that kept the line
from wrapping, making it difficult to read very wide log lines.
The lines of the table come from a map
, which processes the
elements of @data
(line 48), invoking the block defined in lines 32
to 47 for each element. Line 32 extracts the timestamp and the log
line itself, by dereferencing the $_
(the current element of
@data
) as an arrayref.
Lines 33 to 42 figure out what color a line that was seen at $stamp
time should be. First, the age of the line is computed in line 33.
If the age is below $URGENT
seconds (line 35), then the color is
forced to white
. Otherwise, the age is scaled by subtracting
$URGENT
(line 38) and maximized to be no more than $OLD
seconds
(line 39).
Line 40 computes a $c
value that will be 0 for the oldest lines and
255 for the very newest. Line 41 uses this value twice to create a
color that scales from #ffffff
(white) down to #00ff00
(pure
green) linearly. Thus, lines that arrived at identical times have
identical values, and lines that arrived at differing times generally
can be rapidly visually distinguished.
Once we have a color, we can construct the table row, starting in line
43. Lines 43 through 45 create the timestamp. The valign
parameter is set to top
so that the timestamp always appears at the
start of the log line, even if the log line wraps. The font
element is used to assign the color cyan
to the timestamp. Line 45
uses a sprintf
to create a zero-padded time value. The
localtime
operator is called in a list context for a list slice,
and the hours, minutes, and seconds values are selected in the proper
order for the sprintf
. Nice trick.
Lines 46 and 47 create the table cell for the log line. The line is
HTML-entitized using escapeHTML
, ensuring that less-thans don't
ruin my day. I can just imagine someone deliberately visiting a URL
of <BLINK>
just to mess up the rest of my display. But on a
more serious note, we don't want to open ourselves up to a
cross-site-scripting attack either. Line 46 uses another font
element to define the proper color and font face for the text.
Line 49 adds an anchor to the bottom of the display, which will be made visible by most browsers because of the address of the meta-refresh. This ensures that we are looking at the most recent data (at the bottom), even if the data exceeds the window size.
Finally, the closing body and html tags are created with line 50.
Once the content of the response has been established, we return from the subroutine with an OK code again, thus passing back the proper HTTP status to the browser.
That wraps up the web-server side of the process. As a browser
connects to port 42042, this code gets activated, the current @data
gets filtered and formatted, and the response is returned to the
browser. The meta-refresh causes the browser to refetch the same URL
five seconds later, with the updated response sent to the browser
again.
But how does @data
change? Well, within the same process, we also
have a POE::Session
object that is running a POE wheel. This
session is created in lines 57 to 75. A POE wheel defines a
correlated set of states and events for a higher-level function. This
particular wheel knows how to ``tail'' a file, generating events when
new data is available.
Line 59 declares that our session will have ``inline states'', one of the most common ways to create a session. We could also have ``object'' states or ``package'' states, but this seems to work the easiest for this example.
When the session is first started, the _start
state is called.
We'll define this start handler as an inline coderef starting in line
61. The start handler creates a POE::Wheel::FollowTail
wheel, and
stores it into the session's ``heap'' area, ensuring that it stays alive
as long as the session is alive. Because we never destroy this
session, this essentially means ``forever''. The heap key of wheel
is arbitrary: we could have used mongoose
with no consequences.
The wheel takes a few parameters, defined in lines 65 through 67. The
Filename
parameter is the file to be tailed. Here, we get the name
directly from the configuration parameter $FILENAME
defined at the
top of the program. Because we've provided an explicit filename, this
wheel is smart enough to close and re-open the file when the file
finally changes, such as when we roll the web logs (like I do
nightly). That's a lot better than my tail -f
sessions, which seem
to freeze around 2am, until I remember that the logs have been rolled!
The InputEvent
defines a callback event that will be triggered
whenever new data is available from the file. It appears to be called
once per line, although that wasn't clear in the documentation.
Finally, SeekBack
defines how many bytes of the file are initially
processed, by seeking back from the end of the file. Note that this
is a byte count and not a line-number boundary, so we might end up in
the middle of the line for the first entry.
As new lines arrive (or initially while processing the seekback tail
of the file), the event handler got_line
is called, defined
starting in line 70. Line 71 simply takes the line from $_[ARG0]
,
prepends the current timestamp, and pushes the resulting arrayref onto
the end of @data
. Line 72 ensures that the array never exceeds
$TAILING
elements by trimming elements from the beginning of the
array when necessary.
Finally, since POE is an event loop manager, we need to put the POE kernel ``in charge'' of the program. This is done in line 77. In this case, the loop never exits, because we always have a web server session and a followtail session, so the program never returns from this call.
So there you have it... a proof-of-concept program to inspire you to
write more interesting and cool things with POE
, especially now
that you've seen how easy it is to create a tiny web server. Until
next time, enjoy!
Listings
=1= #!/usr/bin/perl =2= use strict; =3= =4= ### configuration items ### =5= my $FILENAME = "/web/stonehenge-proxy/var/log/access_log"; =6= my $TAILING = 16; =7= my $URGENT = 5; # this many seconds stays white =8= my $OLD = 55; # this many seconds after urgent is dark =9= ### end configuration items ### =10= =11= my @data; =12= use CGI qw(:html); =13= use POE qw(Component::Server::HTTP Wheel::FollowTail); =14= =15= POE::Component::Server::HTTP->new =16= ( =17= Port => 42042, =18= ContentHandler => =19= {"/" => sub { =20= my ($request, $response) = @_; =21= =22= $response->code(RC_OK); =23= $response->content_type("text/html"); =24= $response->content =25= (join "", =26= start_html( =27= -title => "web tail on $FILENAME", =28= -bgcolor => 'black', =29= -head => ["<meta http-equiv=refresh content='5;URL=#bottom'>"], =30= ), =31= table (map { =32= my ($stamp, $message) = @$_; =33= my $age = time - $stamp; =34= my $color; =35= if ($age < $URGENT) { =36= $color = 'white'; =37= } else { =38= $age -= $URGENT; =39= $age = $OLD if $age > $OLD; =40= my $c = 255 * (1 - $age / $OLD); =41= $color = sprintf '#%02x%02x%02x', $c, 255, $c; =42= } =43= Tr(td({valign => 'top'}, =44= font({color => 'cyan'}, =45= sprintf "%02d:%02d:%02d", (localtime $stamp)[2,1,0])), =46= td(font({color => $color, face => 'courier'}, =47= escapeHTML($message)))); =48= } @data), =49= a({name => "bottom"}), =50= end_html, =51= ); =52= return RC_OK; =53= }}, =54= ); =55= =56= =57= POE::Session->create =58= ( =59= inline_states => =60= { =61= _start => sub { =62= $_[HEAP]->{wheel} = =63= POE::Wheel::FollowTail->new =64= ( =65= Filename => $FILENAME, =66= InputEvent => 'got_line', =67= SeekBack => 8192, =68= ); =69= }, =70= got_line => sub { =71= push @data, [time, $_[ARG0]]; =72= splice @data, 0, -$TAILING if @data > $TAILING; =73= }, =74= }, =75= ); =76= =77= $poe_kernel->run();