Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in SysAdmin/PerformanceComputing/UnixReview magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.


Unix Review Column 51 (Mar 2004)

[suggested title: Monitoring Net Traffic with OpenBSD's Packet Filter]

The server for stonehenge.com lives somewhere in Texas, in a place I've never seen. I rent a box from Sprocket Data Systems, and they provide my remote eyes and ears, and hook me up to their networks and power grid. I'm limited to a certain bandwidth each month for the rate I pay, and to offset the costs, I also sublease the box to geekcruises.com and redcat.com.

Because the bandwidth costs me actual dollars for usage and over-usage, I needed a way to monitor how much was being used, and by whom. This would have been easy to solve if I controlled the upstream router for the box, but I don't. However, as I was setting up tighter security on my OpenBSD machine, I noticed that the Packet Filter (pf) firewall software could report statistics on named rules. By labeling the rules that pass traffic, I could query the pf subsystem frequently and collect traffic data. Problem solved!

In the filtering section of my /etc/pf.conf file, the last dozen rules look like:

    pass quick on lo0 keep state
    pass in log label "other-inbound" keep state
    pass out log label "other-outbound" keep state
    pass in to <geekcruises> label "geekcruises-inbound" keep state
    pass out from <geekcruises> label "geekcruises-outbound" keep state
    pass in to <redcat> label "redcat-inbound" keep state
    pass out from <redcat> label "redcat-outbound" keep state
    pass in to <webstonehenge> label "webstonehenge-inbound" keep state
    pass out from <webstonehenge> label "webstonehenge-outbound" keep state
    pass in to <stonehenge> label "stonehenge-inbound" keep state
    pass out from <stonehenge> label "stonehenge-outbound" keep state

These rules use tables defined earlier in the file to identify the CIDR blocks and addresses of interest. For example, at the moment, the webstonehenge table is defined as:

    table <webstonehenge> { 209.223.236.163 }

The result of having the labels on these rules is that every time a conversation is started for my www.stonehenge.com address, all packets are ``charged to'' that particular rule. Because of keep state, all reply packets are also charged to that rule.

We can dump the counters by running pfctl -zvsl frequently, which results in output something like this:

    other-inbound 421 0 0
    other-outbound 421 0 0
    geekcruises-inbound 421 2430 754775
    geekcruises-outbound 421 0 0
    redcat-inbound 421 0 0
    redcat-outbound 421 0 0
    webstonehenge-inbound 421 1081 470209
    webstonehenge-outbound 421 0 0
    stonehenge-inbound 421 810 132237
    stonehenge-outbound 421 619 69223

The first of the three numbers is the number of times the rule has been evaluated, the second is the number of packets, and the third is the number of bytes (our most important number). The -z flag on the command ``resets'' the counters, so the next execution reports only the traffic seen since this one.

I wrote a program that runs from root's crontab every five minutes to execute this command, parse the data (trivial for Perl), and, using DBI, insert it into a database for later querying. I won't show that program here, because it's pretty short, although a rough sketch appears below. To keep things dumbly easy for me, I chose DBD::SQLite as my ``database''. This CPAN module contains an entire transaction-enabled, mostly SQL92-compliant database engine that stores everything in a single file. It's very nice when you don't want to go the full distance for a complex application, and it's also very speedy (much faster than MySQL for the same application).
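
For the curious, here's a minimal sketch of what such a collector might look like. This is a simplified reconstruction, not my production program: it assumes the four-column pfctl output shown above and the database path and stats table described in a moment.

    #!/usr/bin/perl -w
    # collector sketch -- run from root's crontab, e.g.:
    #   */5 * * * * /path/to/this/script
    use strict;
    use DBI;

    my $dbh = DBI->connect("dbi:SQLite:dbname=/var/log/pfstats.sqlite",
                           "", "", {RaiseError => 1, PrintError => 0});
    my $ins = $dbh->prepare
      ("INSERT INTO stats (stamp, type, packets, bytes) VALUES (?, ?, ?, ?)");

    my $stamp = time;                    # one timestamp for the whole batch
    for (`/sbin/pfctl -zvsl`) {          # dump and reset the label counters
      my ($label, $evals, $packets, $bytes) = split;
      $ins->execute($stamp, $label, $packets, $bytes);
    }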

I used the SQLite database schema of:

    CREATE TABLE stats (
      stamp INT,
      type TEXT,
      packets INT,
      bytes INT,
      PRIMARY KEY (stamp, type)
    );

Because SQLite doesn't have a native date type, I'm storing the timestamp as a Unix epoch value (obtained with time in my Perl program).

Once I had a few days' worth of data, I wanted to see what it looked like. I could have used an off-the-shelf solution like MRTG, but after reviewing the complexity of the existing applications in this area, I decided to write something much simpler and more appropriate to my needs. The result is in [Listing one, below].

Lines 5 through 36 define how the graph looks. First, in lines 7 through 9, I set the height and width in pixels, and the number of seconds of history to examine. (Multiplying the number of days by 86400 keeps me from doing heavy math in my head.)

Lines 11 and 12 define the input database (SQLite keeps this as a single file) and the output graphic location, conveniently located inside my webserver's space so I can get to it with a browser.

Lines 14 and 15 define the number of bars and the number of labels, both roughly proportional to the width of the graph. (With a width of 1000 pixels, that's 100 bars and 50 labels.)

Lines 16 to 34 control what gets displayed, and how. I don't want a separate output for every rule category (especially later, when I start breaking out email versus ssh versus other traffic), so I merge the types using the rules defined in MAPPING. That subroutine expects one of the category names as input and returns the desired roll-up name, as the sample calls below illustrate. The list in @PLOT should contain those roll-up names in the order wanted, and the colors in @COLORS determine how each is drawn.
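
For example, here's what MAPPING returns for a few of the labels. Note that the redcat labels fall through to OTHER, since redcat isn't in @PLOT:

    print MAPPING("webstonehenge-inbound");   # webstonehenge
    print MAPPING("stonehenge-outbound");     # stonehenge-outbound
    print MAPPING("redcat-inbound");          # OTHER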

Once we've got a plan, it's time to implement it. First, we connect to the database in lines 40 to 42. Then, in line 44, we hand the MAPPING subroutine directly to SQLite as an SQL function. This is very cool, because I can extend SQL's set of functions and aggregates with Perl definitions.
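
To see the registered function in action, a quick test like this (not in the listing, but using only standard DBI calls) would work once line 44 has run:

    # SQLite calls back into our Perl MAPPING subroutine
    my ($rollup) = $dbh->selectrow_array("SELECT mapping('redcat-outbound')");
    # $rollup is now "OTHER"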

Lines 46 to 48 compute the range of time values to be covered, including the step between output column values.

Line 50 sets up the result array that will be passed to GD::Graph, including the labels for the timestamps as the first row. The time_to_label subroutine is defined later.

Line 51 computes a mapping from the mapped roll-up names to their appropriate row number in the graph.

Lines 53 to 59 prepare and execute an SQL query to summarize our data for output. Because SQLite doesn't have a floor function, I cheated by subtracting 0.5 and using the round function: a slot value of 3.7, for example, becomes round(3.2), which is 3, exactly what floor(3.7) would give.

Lines 60 to 66 grab the result, which is a series of rows with a graph column number (which I called $row because I was sideways), a rolled-up type, and the average number of bytes transferred during that time slot. Because the byte samples were taken during 5-minute intervals, I had to divide that number by 300 to get average bytes-per-second. But I also care a lot more about gigabytes per month than bytes per second, so I scaled the number appropriately. The results are inserted into @graphdata, autovivified as necessary.
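
To make that scaling concrete, take the webstonehenge-inbound line from the pfctl output above, which shows 470209 bytes in one 5-minute sample:

    my $bytes_per_sec = 470_209 / 300;                     # about 1567 bytes/sec
    my $gb_per_month  = $bytes_per_sec * 86400 * 31 / 1e9; # about 4.2 GB/month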

Lines 68 to 72 define the time_to_label subroutine. The input is a Unix epoch value, and the output is a string of ``month/day-hour''.

Finally, the graphing part, beginning in line 74. Line 75 creates the graph object. Lines 76 to 85 define the specifications for the graph. Line 86 passes the roll-up names as the labels, and line 87 actually does all the hard work and creates a GD object.

Lines 89 to 94 write the GD object out as a PNG file, taking care not to let a partially-written file be visible on the web page: the image is written to a temporary name and then renamed into place, and the rename is atomic on the same filesystem.

And the result of that is a pretty picture that shows my traffic, organized by customer, and graphed over time. [Editor: do we want an actual picture here? I can provide that.] In the few weeks that I've been gathering data, it's been useful to see exactly how my box is being accessed. And because bandwidth is a precious resource, I've got one more tool now to manage it. Until next time, enjoy!

Listings

        =1=     #!/usr/bin/perl -w
        =2=     use strict;
        =3=     $|++;
        =4=     
        =5=     ## start config
        =6=     
        =7=     my $HEIGHT = 500;
        =8=     my $WIDTH = 1000;
        =9=     my $TIMEFRAME = 86400 * 3;
        =10=    
        =11=    my $DATABASE = "/var/log/pfstats.sqlite";
        =12=    my $OUTPUT = "/web/htdocs/web-blue-stats2.png";
        =13=    
        =14=    my $COUNT = int($WIDTH/10);     # controls desired barsize
        =15=    my $LABELS = int($WIDTH/20);    # function of fontsize
        =16=    my @PLOT = qw(OTHER
        =17=                  stonehenge-inbound
        =18=                  stonehenge-outbound
        =19=                  webstonehenge
        =20=                  geekcruises
        =21=                 );
        =22=    sub MAPPING {
        =23=      local $_ = shift;
        =24=      s/^(webstonehenge|geekcruises)(-.*)?/$1/
        =25=        or s/^((stonehenge)-(in|out)bound)(-.*)?/$1/
        =26=          or s/.*/OTHER/;
        =27=      return $_;
        =28=    };
        =29=    my @COLORS = qw(pink
        =30=                    lorange
        =31=                    orange
        =32=                    lgreen
        =33=                    lblue
        =34=                   );
        =35=    
        =36=    ## end config
        =37=    
        =38=    require DBI;
        =39=    
        =40=    my $dbh = DBI->connect("dbi:SQLite:dbname=$DATABASE","","",
        =41=                           {RaiseError => 1, PrintError => 0})
        =42=      or die $DBI::errstr;
        =43=    
        =44=    $dbh->func("mapping", 1, \&MAPPING, "create_function");
        =45=    
        =46=    my $high = time;
        =47=    my $low = $high - $TIMEFRAME;
        =48=    my $gap = ($high-$low)/$COUNT;
        =49=    
        =50=    my @graphdata = [map time_to_label($low + $_ * $gap), 0..($COUNT-1)];
        =51=    my %graph_col_map = map {$PLOT[$_] => $_ + 1} 0..$#PLOT;
        =52=    
        =53=    my $sth = $dbh->prepare
        =54=      (q{SELECT round((stamp - ?) / ? - 0.5), mapping(type), avg(bytes)
        =55=         FROM stats
        =56=         WHERE stamp BETWEEN ? AND ?
        =57=         GROUP BY 1, 2}
        =58=      );
        =59=    $sth->execute($low, $gap, $low, $high);
        =60=    while (my ($row, $type, $bytes) = $sth->fetchrow_array) {
        =61=      my $graph_col_map = $graph_col_map{$type};
        =62=      $graph_col_map = 10 unless defined $graph_col_map;
        =63=      ## (bytes) / (300 seconds per sample) = Bytes per Second
        =64=      ## so multiply that by seconds per month divided by 1G
        =65=      $graphdata[$graph_col_map][$row] = ($bytes / 300) * (86400 * 31 / 1e9);
        =66=    }
        =67=    
        =68=    sub time_to_label {
        =69=      my $time = shift;
        =70=      my @values = localtime($time);
        =71=      sprintf "%02d/%02d-%02d", $values[4]+1, $values[3], $values[2];
        =72=    }
        =73=    
        =74=    require GD::Graph::bars;
        =75=    my $g = GD::Graph::bars->new($WIDTH, $HEIGHT) or die;
        =76=    $g->set(
        =77=            x_label => 'month/day-hour (pacific time)',
        =78=            x_label_skip => $COUNT/$LABELS,
        =79=            x_labels_vertical => 1,
        =80=            y_label => 'average gigabytes per month',
        =81=            y_long_ticks => 1,
        =82=            title => 'blue.stonehenge.com traffic by customer',
        =83=            cumulate => 1,
        =84=            dclrs => \@COLORS,
        =85=           );
        =86=    $g->set_legend(@PLOT);
        =87=    my $gd = $g->plot(\@graphdata) or die;
        =88=    
        =89=    open OUT, ">$OUTPUT.tmp" or die "$OUTPUT.tmp: $!";
        =90=    print OUT $gd->png;
        =91=    close OUT;
        =92=    
        =93=    rename "$OUTPUT.tmp", $OUTPUT or die "rename: $!";
        =94=    exit 0;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.