Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Web Techniques Column 61 (May 2001)
[suggested title: Basic Cookie Management]
Ahh, cookies. One of my pet peeves is the amount of bad cookie code I see out there, including the reaction that a website gives me when I choose not to permit cookies (usually because I'm feeling rebellious).
Cookies are one of many ways to turn the stateless HTTP into a stateful session-based series of transactions. (Some of the others include using some sort of authentication, or mangling the URLs, or including hidden data in forms.) But cookies get my ire because many web programmers presume that ``one user is one browser'', because that's the basic model of the cookie itself as well. That's demonstrably untrue. I myself have three different browsers open at the moment, and I have been known to go into an ``internet cafe'' from time to time to use the browsers supplied there. While I personally move from one browser to another, my cookies don't follow me!
The wrong way to use cookies, therefore, is to have a login form, and on successful login, send out a cookie that lasts until year 2003 to that browser. That's bad. I can't log in on another browser, and if I forget to log out of a browser at an ``internet cafe'', the next user who stumbles across the same website is (gasp!) already logged in as me!
Another wrong way to use cookies is to send out a bunch of data in a cookie, like the entire contents of the shopping cart. I say wrong because most people who do this seem to trust the data as it's being returned on the next hit, and nothing stops me from changing the price of that $300 item I just bought to $1 instead, if it's all coming from the cookie.
Still another wrong way to use cookies is to send dozens of cookies, like one for each graphic. Goodness knows, I've been to some sites and had to accept a baker's dozen of cookies before I even see the entire page.
And yet another wrong way is to let the cookie's expiration time serve as the security policy for timing out an active user. A browser does not have to respect any expiration times. Do not count on that.
And even worse, sometimes I've seen servers go into infinite loops checking for cookies to be set, and redirecting if the cookie is not set, never telling the user why things are awry.
Can you tell I've seen a lot of bad cookie code? Do you now understand why the hairs generally stand on the back of my neck when someone mentions ``I need cookies for this application''? Well, then read on.
There is a reasonably safe way to use cookies. Use cookies only to brand a particular browser, and only for the duration of a browser session. The cookie should be a single small cookie with a short but unguessable value (such as the MD5 hash of some cryptographically strong material). Then, this particular ``branded'' browser will be sending back this cookie only while it is currently open.
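As a rough illustration of that branding step (a sketch, not the code from the listing), the following draws a token from operating-system randomness and builds a cookie header with no expiration attribute, so the browser is expected to discard it on exit. The sub names `new_browser_id` and `session_cookie_header` are invented for this example, and reading /dev/urandom assumes a Unix-like system. Digest::MD5 is a core module.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Return a 32-character hex token derived from operating-system randomness.
# Assumption: a Unix-like system that provides /dev/urandom.
sub new_browser_id {
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $got = read $fh, my $bytes, 16;
    close $fh;
    die "short read from /dev/urandom" unless defined $got and $got == 16;
    return md5_hex($bytes);
}

# Build a Set-Cookie header line with no Expires or Max-Age attribute,
# which makes it a session cookie that goes away when the browser closes.
sub session_cookie_header {
    my ($name, $value) = @_;
    return "Set-Cookie: $name=$value; Path=/";
}

print session_cookie_header(browser => new_browser_id()), "\n";
```

The cookie carries nothing but the token: everything else stays on the server, keyed by that value.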
Next, take the brand-mark, and use it to key into a database to look up a particular user for that branded browser. The database should have a timestamp of recent activity, and be distrusted after the timeout period.
Finally, use the verified user value to key into another database for session information, like a shopping cart or personal preferences. Don't use the browser-brand value for anything other than a one-step mapping to a user, because otherwise the user cannot migrate her session over to a new browser without restarting some of the transaction, and that's annoying. (In fact, you should probably permit the same user to log in on multiple browsers simultaneously.)
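The two-step mapping can be sketched with a pair of in-memory hashes standing in for the databases. Everything here is hypothetical scaffolding for illustration: the names `log_in`, `current_user`, and `cart_for`, the one-hour timeout, and the explicit `$now` argument (passed in so the timing is easy to follow).

```perl
use strict;
use warnings;

my $TIMEOUT = 3600;       # seconds of permitted inactivity (assumed value)

my %user_for_brand;       # brand => { user => ..., seen => epoch seconds }
my %session_for_user;     # user  => { cart => [ ... ] }

# Record that this branded browser is logged in as $user.
sub log_in {
    my ($brand, $user, $now) = @_;
    $user_for_brand{$brand} = { user => $user, seen => $now };
}

# One-step mapping from brand to user; stale entries are distrusted and dropped.
sub current_user {
    my ($brand, $now) = @_;
    my $entry = $user_for_brand{$brand} or return undef;
    if ($now - $entry->{seen} > $TIMEOUT) {
        delete $user_for_brand{$brand};
        return undef;
    }
    $entry->{seen} = $now;    # activity refreshes the timestamp
    return $entry->{user};
}

# Session data hangs off the user, never off the brand.
sub cart_for {
    my ($user) = @_;
    return $session_for_user{$user}{cart} ||= [];
}

log_in('brand-A', 'fred', 1000);
push @{ cart_for(current_user('brand-A', 1010)) }, 'widget';

log_in('brand-B', 'fred', 1020);    # same user, second browser
print join(',', @{ cart_for(current_user('brand-B', 1030)) }), "\n";
```

Because the cart is keyed by user, the second browser sees the item added from the first, and the last line prints `widget`.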
Sounds hard? Naah. It's just a few dozen lines of Perl code. How do I know? I hacked it out just recently. And I present this sample reference implementation of this strategy in [Listing one, below]. Please keep in mind that this is not a complete application: just the part that handles the ``what user is logged in to this browser?'' question.
Lines 1 through 3 start nearly every program I write, turning on taint
mode (good for CGI programs), warnings (good for catching stupid
mistakes), compiler restrictions (good for catching more stupid
mistakes) and disabling buffering on STDOUT (good for CGI programs).
Line 5 pulls in the venerable CGI.pm module, including all the
function shortcuts.
Lines 7 to 29 handle the ``branding'' of a particular browser with a
unique cookie. Keep in mind that this has to be done before we've
sent anything to standard output, because we may need to issue
a new Set-Cookie header, or perhaps a redirect to ourselves as
a cookie test.
Line 8 fetches the browser cookie, if any. If present, $browser
is now a unique string (actually, an MD5 signature of some unique
data). However, if it's absent, we've got some work to do to make
this browser our own.
Lines 9 and 10 recognize the common case, after this program has been
invoked once: namely, that we've got a good browser ID. The
_cookiecheck parameter is described later, but we must make sure
it's out of the mix for later code.
If we had no cookie in line 9, then we have two possibilities: either
the cookie had never been sent, or the browser refused to send it
back. In either case, we first prepare a potential new cookie using
lines 12 through 15. The MD5 module (found in the CPAN) allows us
to create a 32-character hex string from a given arbitrary data item.
In this case, we're using the time of day, a random number, the
process ID, and the stringified hashref of a newly created throwaway
hash, simply as icky glue.
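For readers without the older MD5 module installed, the same mixing can be sketched with the core Digest::MD5 module. That substitution is mine, not the listing's, and the sub name `guessable_but_unique_id` is made up for the example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Mix the same ingredients the listing uses: the time of day, the
# stringified reference to a throwaway anonymous hash, a random number,
# and the process ID. The result is unique enough to brand a browser,
# but it is not built from cryptographically strong material.
sub guessable_but_unique_id {
    my $glue = time() . +{} . rand() . $$;
    return md5_hex(md5_hex($glue));
}

print guessable_but_unique_id(), "\n";
```

Hashing twice mirrors the nested call in line 15 of the listing. Either way the output is 32 lowercase hex characters.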
This is not as secure as using cryptographically strong items: there
are modules in the CPAN to make it harder to guess. However, this
code was lifted directly from Apache::Session, a well-known
chunk of code to handle session management, so I feel confident
knowing I can at least blame someone else.
Line 17 distinguishes whether this is a first invocation rather than
an invocation where we've had at least one chance to set a cookie (and
was therefore refused). If _cookiecheck is defined, we've had at
least one try to get it right, so we dump out an HTML page (lines 19
to 23) stating our demands. We also try setting a cookie one more
time; maybe the user will get tired of saying ``reject this cookie'', or
maybe they just didn't like that particular hex string (who knows?).
The form submission in line 22 will cause us to come back to the same
page, but with _cookiecheck possibly still set. (If not, then
we'll get two hits to get back to here again, just as when we
started.)
If this is the first visit, then _cookiecheck will not be set, so
we set it in line 25, and do an external redirect to ourselves to
verify the cookie is indeed present.
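Stripped of the CGI plumbing, the cookie check comes down to a three-way decision. This sketch restates it as a pure function. The name `cookie_check_action` and the action labels are hypothetical, chosen only to make the branches visible.

```perl
use strict;
use warnings;

# Decide what to do with a request, given the value of the "browser" cookie
# and the "_cookiecheck" parameter (either may be undef).
sub cookie_check_action {
    my ($browser, $cookiecheck) = @_;
    return 'proceed'  if defined $browser;      # cookie came back: branded
    return 'explain'  if defined $cookiecheck;  # already tried once: say why
    return 'redirect';                          # first visit: set and test
}

for my $case ([ 'abc123', undef ], [ undef, 1 ], [ undef, undef ]) {
    print cookie_check_action(@$case), "\n";
}
```

The loop prints `proceed`, `explain`, and `redirect` in that order. Because the `explain` branch emits a page instead of another redirect, a browser that refuses cookies cannot be bounced forever.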
By the time we hit line 30, we've now branded the browser with a unique
cookie identification, and that's in $browser.
The next step is to determine if this browser is ``logged in'' or not.
We'll keep track of that with a lightweight database, made possible
with the File::Cache module from the CPAN. (Late-breaking news:
the author of this module has started to generalize the caching
structure into a separate Cache::Cache module, so by the time you
read this, things might work differently, so beware.)
Line 34 ``opens'' the cache by creating a cache object in $cache.
We'll set the cache items to expire within an hour, meaning that no
user can be logged in for longer than one hour of inactivity. You
might permit this to be longer or shorter (longer for low-risk items,
shorter for high-risk items), but one hour is a good starting point.
Lines 41 to 44 handle a small housekeeping chore for the cache. If a
user doesn't come back but hasn't logged out, her cached user ID still
exists as a file in the database directory (until the next time it is
fetched). But it most likely won't be fetched, since that cookie will
also expire when the browser is closed, so we've got a dead file
sitting around. Every 4 hours, the _purge_ entry will expire, so
we'll let the first lucky user who happens to invoke this program
right after that go through the cleanup process. This should be very
lightweight; if you're concerned about doing this at CGI time, you
could instead pull this out to a separate cron job (but be sure the
job runs as the web user, not as you).
Line 46 pulls out the user associated with this browser, if any. If
there's an entry in the cache, but it's older than an hour, the entry
is deleted, and we get back undef, the same as if the entry doesn't
exist. So if there is a defined value here, it's current, and the
user is logged in as $user. Otherwise, there's no user associated
with the browser uniquely identified with $browser.
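To make the expiry and purge behavior concrete, here is a small in-memory stand-in for the cache. `TinyCache` is a hypothetical class written for this illustration and is not the File::Cache API: in particular, it takes the current time as an explicit argument so the timeline is easy to trace.

```perl
use strict;
use warnings;

package TinyCache;    # hypothetical in-memory stand-in, not File::Cache

sub new {
    my ($class, %args) = @_;
    return bless { ttl => $args{expires_in}, data => {} }, $class;
}

# Store a value, with an optional per-item lifetime overriding the default.
sub set {
    my ($self, $key, $value, $now, $ttl) = @_;
    $ttl = $self->{ttl} unless defined $ttl;
    $self->{data}{$key} = { value => $value, dies_at => $now + $ttl };
}

# Fetch a value; an expired entry is deleted and reported as undef,
# the same as if it had never been stored.
sub get {
    my ($self, $key, $now) = @_;
    my $entry = $self->{data}{$key} or return undef;
    if ($now >= $entry->{dies_at}) {
        delete $self->{data}{$key};
        return undef;
    }
    return $entry->{value};
}

# Sweep out every expired entry, including ones nobody will fetch again.
sub purge {
    my ($self, $now) = @_;
    my $data = $self->{data};
    delete @{$data}{ grep { $now >= $data->{$_}{dies_at} } keys %$data };
}

sub count { scalar keys %{ $_[0]{data} } }

package main;

my $cache = TinyCache->new(expires_in => 3600);
$cache->set('brand-A', 'fred',   0);
$cache->set('brand-B', 'barney', 0);

# Housekeeping guarded by a longer-lived sentinel entry.
sub housekeeping {
    my ($now) = @_;
    return if $cache->get('_purge_', $now);
    $cache->purge($now);
    $cache->set('_purge_', 1, $now, 3600 * 4);
}

housekeeping(10);               # sentinel absent: purge runs, nothing is stale
print $cache->count, "\n";      # two users plus the sentinel
housekeeping(4 * 3600 + 10);    # sentinel expired: purge runs again
print $cache->count, "\n";      # only the fresh sentinel remains
```

The two abandoned entries are never fetched again, so only the purge removes them. That is the dead-file situation the housekeeping step exists for.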
Lines 50 to 66 handle the transitions between logged in and logged
out. If the user is logged in and has requested a logout, lines 52 to
55 handle that. The parameter requesting logout is deleted (for
sticky forms), and the user is removed from the cache database.
$user is also undefined to reflect this for the rest of the
program.
Lines 57 to 65 handle logging in. First, the requested username and
password are read. Next the username is checked for well-formedness
(which I've arbitrarily defined here as ``looks like a Perl
identifier''), and then we verify the correct password for this user by
calling verify. I've defined a simple version for this down at the
bottom of the program in lines 98 to 101 that simply returns true if
the username is a substring of the password. Please don't use this in
real life: this is just a demo. If the password's good, $user
gets set; otherwise, we reject the attempt.
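One possible direction for a less toy-like verify, keeping the same two-argument interface, is to compare a salted digest against a stored value. This is only a sketch under loud assumptions: the `%USERS` table, its salt, and its password are invented, and a single salted SHA-256 is used here purely because Digest::SHA is a core module. A real application would more likely use a purpose-built password-hashing scheme and a proper user store.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical user table: username => [ salt, sha256_hex(salt . password) ].
my %USERS = (
    fred => [ 'x7Qp', sha256_hex('x7Qp' . 'yabba-dabba') ],
);

# True only for a well-formed, known username with a matching password.
sub verify {
    my ($user, $password) = @_;
    return 0 unless defined $user and defined $password;
    return 0 unless $user =~ /\A\w+\z/;    # same well-formedness test
    my $record = $USERS{$user} or return 0;
    my ($salt, $digest) = @$record;
    return sha256_hex($salt . $password) eq $digest ? 1 : 0;
}

print verify('fred', 'yabba-dabba') ? "ok\n" : "rejected\n";
print verify('fred', 'fred')        ? "ok\n" : "rejected\n";
```

The first call prints `ok` and the second prints `rejected`. Note that the second attempt would have passed the substring test used by the demo version.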
Lines 68 to 83 handle the actions useful within the current state.
For logged in users, we'll do a couple of things. Each time a
logged-in user returns to the page, we update the cache time in line
70, to permit her to stay logged in for another hour from now. Line
72 displays a simple ``log out'' form button, which reinvokes this same
program including a _logout parameter. Recall that this parameter
was being tested up in line 51.
For logged out users, lines 74 to 82 display that status and present a simple login form with a submit button, using a table for layout. Please don't fault my lack of HTML design skills: I'm illustrating structure here, not my graphics aptitude which I admit is sorely lacking.
The code from line 85 downward would be where your real application
goes, using the code above as a framework. The rest of the
application could count on $user to be the name of an authenticated
user logged into the browser of choice, and active within the past
hour. As a sample do-nothing application, I thought I'd leave in my
testing code that I used while developing this program to see what the
current cookies and parameters contained.
Lines 87 to 94 execute a loop twice: once with $title set to
Cookies and $f set to the coderef for the cookie function
(provided by CGI.pm), and a second time with Params and the
param function instead. I had originally written this as two
separate displays, but then writhed a bit at the similarity of the
code, which I then factored out and parameterized. Thank goodness for
coderefs.
Line 90 prints the second-level header for the title, then follows it
with a table containing the cookie or parameter keys in the first
column, followed by their value in the second column. Because both
cookies and parameters can be multivalued, I've added code to join
multiple values by commas (line 92). Also, since both the keys and
values can contain HTML-significant markup (less-thans, greater-thans,
ampersands, and so on), I pass the data through escapeHTML
(provided by CGI.pm) before display.
Note that line 93 invokes the function (either cookie or param)
with no arguments to get a list of all things of that type, while the
end of line 92 invokes that same function passing it one item of that
type to get its value. It's very nice that they have that same
interface.
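The shared calling convention is what makes the coderef trick work, and it can be tried without CGI.pm at all. In this sketch, `fake_cookie`, `fake_param`, `escape_html`, and `dump_items` are hypothetical stand-ins that only imitate the interface described above: no argument returns the names, one argument returns the values for that name. The output is plain text rather than an HTML table.

```perl
use strict;
use warnings;

my %COOKIES = (browser => ['0123abcd']);
my %PARAMS  = (color => ['red', 'blue'], note => ['a < b']);

# Stand-ins that mimic the shared calling convention: no argument returns
# the list of names, one argument returns the values for that name.
sub fake_cookie { @_ ? @{ $COOKIES{ $_[0] } || [] } : sort keys %COOKIES }
sub fake_param  { @_ ? @{ $PARAMS{ $_[0] }  || [] } : sort keys %PARAMS }

# Minimal escaping of HTML-significant characters.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Render one "name: values" line per item for whichever function is passed.
sub dump_items {
    my ($title, $f) = @_;
    my $out = "$title\n";
    for my $name ($f->()) {
        my $values = join ", ", $f->($name);
        $out .= "  " . escape_html($name) . ": " . escape_html($values) . "\n";
    }
    return $out;
}

print dump_items(@$_) for [ Cookies => \&fake_cookie ], [ Params => \&fake_param ];
```

Adding a third data source would mean adding one more pair to the final list, with no change to `dump_items`.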
Lines 98 to 101 were described earlier, but this is also a part of the program you'd definitely want to rewrite for a real application.
So, in summary, cookies can be reasonable for session management, as long as the logged in state is clear, a logout button is clearly visible, the cookie expires when the browser is closed, and the session expires after an inactivity timeout value (typically an hour) is reached. Have fun handing out cookies, and don't forget the milk. Until next time, enjoy!
Listings
=1=     #!/usr/bin/perl -Tw
=2=     use strict;
=3=     $|++;
=4=
=5=     use CGI qw(:all);
=6=
=7=     ## cookie check
=8=     my $browser = cookie("browser");
=9=     if (defined $browser) {         # got a good browser
=10=      Delete("_cookiecheck");       # don't let this leak further
=11=    } else {                        # no cookie? set one
=12=      require MD5;
=13=      my $cookie = cookie
=14=        (-name => 'browser',
=15=         -value => MD5->hexhash(MD5->hexhash(time.{}.rand().$$)));
=16=
=17=      if (defined param("_cookiecheck")) { # already tried!
=18=        print +(header(-cookie => $cookie),
=19=                start_html("Missing cookies"),
=20=                h1("Missing cookies"),
=21=                p("This site requires a cookie to be set. Please permit this."),
=22=                startform, submit("OK"), endform,
=23=                end_html);
=24=      } else {
=25=        param("_cookiecheck", 1);   # prevent infinite loop
=26=        print redirect (-cookie => $cookie, -uri => self_url());
=27=      }
=28=      exit 0;
=29=    }
=30=
=31=    ## At this point, $browser is now the unique ID of the browser
=32=
=33=    require File::Cache;
=34=    my $cache = File::Cache->new({namespace => 'cookiemaker',
=35=                                  username => 'nobody',
=36=                                  filemode => 0666,
=37=                                  expires_in => 3600, # one hour
=38=                                 });
=39=
=40=    ## first, some housekeeping
=41=    unless ($cache->get(" _purge_ ")) {
=42=      $cache->purge;                # remove expired objects
=43=      $cache->set(" _purge_ ", 1, 3600 * 4); # purge every four hours
=44=    }
=45=
=46=    my $user = $cache->get($browser); ## either the logged-in user, or undef
=47=
=48=    print header,start_html('session demonstration'),h1('session demonstration');
=49=
=50=    ## handle requested transitions (login or logout)
=51=    if (defined $user and defined param("_logout")) {
=52=      Delete("_logout");
=53=      $cache->remove($browser);
=54=      print p("You are no longer logged in as $user.");
=55=      undef $user;
=56=    } elsif (not defined $user and defined (my $try_user = param("_user"))) {
=57=      Delete("_user");
=58=      my $try_password = param("_password");
=59=      Delete("_password");
=60=      if ($try_user =~ /\A\w+\z/ and verify($try_user, $try_password)) {
=61=        $user = $try_user;
=62=        print p("Welcome back, $user.");
=63=      } else {
=64=        print p("I'm sorry, that's not right.");
=65=      }
=66=    }
=67=
=68=    ## handle current state (possibly after transition)
=69=    if (defined $user) {
=70=      $cache->set($browser,$user);  # update cache on each hit
=71=      print p("You are logged in as $user.");
=72=      print startform, hidden("_logout", 1), submit("Log out"), endform;
=73=    } else {
=74=      print p("You are not logged in.");
=75=      print
=76=        startform,
=77=          table({-border => 1, -cellspacing => 0, -cellpadding => 2},
=78=                Tr(th("username:"),
=79=                   td(textfield("_user")),
=80=                   td({-rowspan => 2}, submit("login"))),
=81=                Tr(th("password:"), td(password_field("_password")))),
=82=          endform;
=83=    }
=84=
=85=    ## rest of page would go here, paying attention to $user
=86=
=87=    for ([Cookies => \&cookie], [Params => \&param]) {
=88=      my ($title, $f) = @$_;
=89=
=90=      print h2($title), table
=91=        ({-border => 0, -cellspacing => 0, -cellpadding => 2},
=92=         map (Tr(th(escapeHTML($_)), td(escapeHTML(join ", ", $f->($_)))),
=93=              $f->()));
=94=    }
=95=
=96=    ## sample verification
=97=
=98=    sub verify {
=99=      my($user, $password) = @_;
=100=     return index($password, $user) > -1; # require password to contain user
=101=   }