Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
![]() |
Download this listing! | ![]() |
![]() |
![]() |
Web Techniques Column 50 (Jun 2000)
[suggested title: Self-registering password protection, part 2]
Last month, I introduced a mod_perl
authentication and
authorization module that permitted an extended format
htpasswd-like file. The purpose of the extensions were to
associate an email address with each user, and give a list of ``keys''
that the user had in his permission. Then the .htaccess
files
would refer to a series of ``locks'' needed for access to a particular
file or directory, and access was granted only when the user had the
keys for all the locks.
But the cool part of the module was that we redirected the failed authentication or authorization to a CGI URL. The intent of this URL was to give the user a chance to create their own username/password pair, provided that their email address was an address we already recognized. For example, if the authorization file is:
merlyn@stonehenge.com admin,stoners,perl merlyn xyF9kYWtJIFZ6 larry@wall.org perl lwall lwUHddn0dCD1I fred@flintstone.comm stoners barney@rubble.nett stoners
then Larry Wall and I are both already permitted (with the
corresponding username and encrypted password), but the user
fred@flintstone.comm
will be redirected to this CGI script (because
they don't have an active login). Now it's time to present the CGI
handler to conclude the picture.
But before I do that, I have to confess. I blew it. Line 46 of last month's listing looks like:
my $error_uri = Apache::URI->parse($r, $FAIL);
when really it needs to be something like:
my $error_uri = Apache::URI->parse($r); $error_uri->path($FAIL);
So much for believing the documentation! (Just kidding, Doug.)
Without this change, this month's CGI script doesn't know where it
lives in the URL-space, and doesn't give a correct self-referencing
URL as the form action. As you might guess, I sent last month's
column off to bed before thoroughly testing (or even beginning to
write) the CGI script. Oh well, my $bad
.
The other thing I didn't realize until I finished this month's code is that it's h-u-u-g-e by comparison to many of my other CGI programs, weighing in at 224 lines. Because of that, I've only got room here to hit the highlights, and won't do my usual line-by-line attack.
Line 1 turns on taint mode, and because I use a child process later, I
set the PATH
in line 4 to keep taint mode from failing.
Line 6 is a reminder that this script won't work if I move it to
/perl
(my Apache::Registry
area) because I use nested
subroutines and file-scoped my
variables -- a deadly combo with
that module. I could have written this to be a mod_perl
module,
but I expect it to be invoked so infrequently that I wouldn't have
saved any CPU time (at least not compared to the human time it would
take to work that out). (Find out more about that in the mod_perl
guide linked from perl.apache.org
.)
Lines 10 through 12 are the configuration sections. $DIR
must be
the same as the module's value for the Unix path to the directory
where the password files are stored. $ADMIN_EMAIL
is the address
shown on the web page for problems or questions, and mail will also be
sent from this address for the verification of the email address.
Lines 22 to 36 define the top-level algorithm. Inside an eval
block
(in case we die
somewhere), I first connect to the database
specified by the realm
parameter. Then, I see if I'm trying to
verify a URL that looks like:
http://www.stonehenge.com/cgi/restricttolist?realm=abc&verify=abc123456
that I've sent to the user in email. If not, perhaps I'm trying to
process form data. Either of those routines can return a message. If
the message begins with FORM:
, I send a new form to the user.
Otherwise, it's just a status message and I show it.
If there were any near-fatal errors in that section, I show the
``internal error'' message in lines 32 to 35, and also log the real
error message to the web server log. The warn
dumps to STDERR
,
which hopefully is still attached to the log.
Lines 47 to 64 dump the form, if needed. The incoming message in line 48 is shown at the top. The email address, requested username, and password fields are displayed in a nice table for layout, along with some descriptive comments. The realm name is also included as a sticky hidden field.
Lines 66 to 91 handle the form response, including kicking out the
form values for various reasons. If the given email address is not
one of the ones we're expecting, or already registered, then we bail
early. Otherwise, we're looking at a potential user to add, and we'll
verify the requested username (not necessarily correlated with the
email address) and password. The username has to be a simple
alpha-numeric, but the password can be any arbitrary gobbledygook, as
long as it's long enough. If we make it through all the hoops, we
jump into the add_user
routine.
Lines 93 to 102 take an email address, a requested user name, and a
clear-text password and add it to the data file. To ensure that we're
talking to the right user, we'll create a
very-difficult-to-guess-or-hack hex string using the Digest::MD5
module (from the CPAN). $hexhash
ends up as a 32-character hex
string based on the process ID, the time of day, a constant string,
and a 32-bit integer based on the current processes running on the
system. Probably not enough for some three-letter government agency
to consider secure, but close enough for this application.
After that, we'll update the database password field to be the crypted
password (using the first two random hex characters as a salt)
followed by a dash and the random 32-character hex string. The
combination will prevent the mod_perl
module from successfully
authenticating to this string, but nevertheless let us record what we
need.
Next, the hex string is used to create a URL to be mailed to the user,
handled by the subroutine in lines 104 to 135. Here, I'm using the
basic Net::SMTP
module (out of the libnet
distribution in the
CPAN) to attach to the localhost
mailer for delivery. I'm
presenting the link in two different ways in the mail because some
mail readers interpret the URL only when enclosed in the angle
brackets, while others simply display that as text.
The mail contains a URL of the form:
http://www.stonehenge.Xcom/cgi/restricttolist?realm=abc&verify=abc123456
where the text after verify=
is the 32-character hex string
generated by the random function. Once a user gets the mail and
follows this link, they end up back at this script in the
handle_confirm_request
subroutine of lines 137 to 147. This
confirms that it's indeed the user that we believe it is, and we can
go ahead and enable the password. The unique hexstring serves as the
key to locate the appropriate email address in the database. In line
142, I strip out the hex string from the encrypted password, leaving
just the 13-character output of crypt
from earlier. This then gets
rewritten back into the database, and we're done. Any false moves and
we get the message in line 146, and nothing changes.
Lines 149 to 159 get and cache the realm value. This is the ``basic auth'' realm, and is also the basename of the file holding the database, once we've stripped out any non alphanumeric characters. Because this originally comes from a form, we must untaint the value before using it as an output filename, hence the taint steps.
Lines 161 to 224 form a collection of routines to get, update, query, and store the database. The shared private data for this section is defined in lines 162 and 163. Lines 165 to 188 open the database file exclusively, and load up the internal hash information for quick access.
Lines 168 to 175 are the trickiest part of this. Because we want the
database file to be usable by the webserver at all times, we must use
a ``write a temp file then atomically rename'' operation to update the
file to contain new information. However, this method doesn't work
well with flocking, because an exclusive flock may indeed be acquired
on a file that has now been deleted as the result of the previous
holder now renaming a new file over the top! So, it's messy. We open
the filehandle, and execute the exclusive flock. If we get it, we
then also verify that the filehandle we hold is the same file as the
full pathname we can stat
, by comparing the device and inode
numbers of the two items. If they're not the same, we release the
handle, which is now probably a file that is completely unlinked, and
start over. Eventually, we'll hold an exclusive flock on a filehandle
that is also the named file, and we can move on. Another approach
would be to do a non-blocking flock, and if unattained, simply sleep
for a moment and retry. On second thought, that sounds simpler.
Lines 177 to 188 parse the data. The majority of the database is held
in (and recreated from) the %info_for
hash, but we also flag all
the email addresses and usernames in additional hashes for quick
lookup. The keys of the %info_for
are lowercased to allow
case-insensitive email address lookup, although the data of the first
field preserves the case, in case the case matters.
Lines 196 to 215 write out the data. We're presuming that we'll do
this (and get called) only once per invocation, because any update is
also an immediately flush and release. We'll write a temporary file
in the same directory as the database, but with .tmp
appended to
the name, and then give that file the same permission bits. Then, the
new updated data gets written (in sorted email address order, so that
may not be the same as the original order). And finally, the
rename
moves the file into place as the new active database, ready
to be authenticated and authorized on the next webserver hit.
The last two subroutines access some of the database data in an
orderly manner, needed because the data is not visible outside the
BEGIN
block.
Whew! That's a long one. But it also does a lot of things. But the basic strategy is simple:
-
If a user that needs access comes to the protected area, and cannot provide a valid username/password, they get redirected to this script, with the proper realm passed as a parameter.
-
On the first invocation (or if an error occurs), the form is displayed, and the user fills in their email address, the username they want, and the password they want.
-
If the email address is in the database, the username/password is added (but not activated), and a random string for verification gets generated in email to the user
-
The user waits for the email, and follows the included URL
-
The script, upon verifying that the user got the email via the random string, activates the selected username and password.
-
The user can then go back to the protected area with an active username and password, and we have unique logging and access control.
So, install this script at the URL defined by $FAIL
in last month's
column, and you'll have a fully working system (presuming you did all
the things from last month as well). And next time I write a two-part
column, I'll be sure to test both parts before submitting the first
one, I promise! Until next time, enjoy.
Listings
=1= #!/usr/bin/perl -Tw =2= use strict; =3= $|++; =4= $ENV{PATH} = "/bin:/usr/bin"; =5= =6= ## do not run this under Apache::Registry (nested subroutines abound) =7= =8= ## config =9= =10= my $DIR = "/home/merlyn/Web/RestrictToList"; =11= my $ADMIN_EMAIL = 'webmaster@stonehenge.comm'; =12= my $ADMIN_HUMAN = 'Randal L. Schwartz'; =13= =14= ## end config =15= =16= use CGI qw(:all); =17= use IO::File; =18= =19= print =20= header, start_html("Web access request"), h1("Web access request"); =21= =22= eval { # death if unexpected things happen =23= connect_to_database(); =24= my $message = handle_confirm_request() || handle_form_response(); =25= if ($message =~ /^FORM:\s+(.*)/s) { =26= show_form($1); =27= } else { =28= print p($message); =29= } =30= }; =31= if ($@) { =32= warn $@; # to web error log =33= print =34= h1("An internal error has occurred"), =35= p("Please contact me at once, and describe what you did."); =36= } =37= =38= print =39= h1("If you have any questions..."), =40= p("Please contact me at", a({href => "mailto:$ADMIN_EMAIL"}, $ADMIN_EMAIL)), =41= end_html; =42= =43= exit 0; =44= =45= ## only subroutines from here down =46= =47= sub show_form { =48= my $message = shift; =49= =50= print =51= hr, p($message), start_form, =52= table({cellspacing => 0, cellpadding => 2, border => 1}, =53= Tr(th("Your email address"), =54= td(textfield(-name => 'email', -size => 50)), =55= td("just the USER\@HOSTNAME.COM part, please")), =56= Tr(th("Your preferred username"), =57= td(textfield(-name => 'username', -size => 50)), =58= td("letters and digits only, at least 5 characters")), =59= Tr(th("Your preferred password"), =60= td(password_field(-name => 'password', -size => 50)), =61= td("at least 5 characters")), =62= ), =63= submit, hidden('realm'), end_form, hr; =64= } =65= =66= sub handle_form_response { =67= my $form_email = param('email') =68= or return "FORM: Please give your email address."; =69= my $info = info_for_email($form_email) =70= or return "FORM: I don't recognize that email address, try again."; =71= if (my $user = $info->{user}) { =72= if (my $password = $info->{password}) { =73= if ($password =~ /-/) { # needs to be verified =74= return "You need to check your email for further instructions."; =75= } =76= } =77= return "You are already registered!"; =78= } =79= ## OK, it's a new user with a good email - let's make a login =80= my $form_user = param('username') =81= or return "FORM: Please give an access username for $form_email."; =82= $form_user =~ /^\w{5,20}$/ =83= or return "FORM: Please use 5 to 20 letters or digits in the username."; =84= seen_user($form_user) =85= and return "FORM: Sorry, that username is taken. Try another."; =86= my $form_password = param('password') =87= or return "FORM: Please give a password for $form_user."; =88= length($form_password) >= 5 =89= or return "FORM: Please use at least 5 characters in the password."; =90= return add_user($form_email, $form_user, $form_password); =91= } =92= =93= sub add_user { =94= my ($email, $user, $password) = @_; =95= =96= require Digest::MD5; =97= my $hexhash = Digest::MD5::md5_hex($$, time, "sekret kode", =98= unpack "%L*", `ps axww`); =99= update_info_for_email($email, $user, =100= crypt($password, $hexhash) . "-$hexhash"); =101= return send_email($email, $hexhash); # DEBUG =102= } =103= =104= sub send_email { =105= my ($email, $hexhash) = @_; =106= =107= my $confirm_url = url()."?realm=".get_realm()."&verify=$hexhash"; =108= =109= require Net::SMTP; =110= my $mail = Net::SMTP->new('localhost') or die "Cannot open mail: $@/$!"; =111= $mail->mail($ADMIN_EMAIL); =112= $mail->to($email); =113= $mail->data(<<END); =114= To: $email =115= From: "$ADMIN_HUMAN" <$ADMIN_EMAIL> =116= Subject: confirming your web access request (PLEASE READ) =117= =118= Please visit the following URL: =119= <URL:$confirm_url> =120= =121= If the link above is not a clickable link, visit this location: =122= =123= $confirm_url =124= =125= Please copy the link into your web browser's "location" or "URL" =126= field carefully. =127= =128= If you have any questions, please reply to this email. =129= =130= Thank you, =131= $ADMIN_HUMAN <$ADMIN_EMAIL> =132= END =133= =134= return "Look for email sent to ".tt($email)." for further instructions."; =135= } =136= =137= sub handle_confirm_request { =138= my $form_verify = param('verify') or return; =139= if (my $email_from_hexhash = seen_hexhash($form_verify)) { =140= my ($email, $keys, $user, $encrypted_password) = =141= @{info_for_email($email_from_hexhash)}{qw(email keys user pw)}; =142= $encrypted_password =~ s/-.*//; =143= update_info_for_email($email, $user, $encrypted_password); =144= return "Your access has been verified!"; =145= } =146= return "Please be sure to copy the URL very carefully from the email."; =147= } =148= =149= BEGIN { # realm holder =150= my $realm; =151= =152= sub get_realm { =153= return $realm if $realm; =154= $realm = param('realm') or die "Missing Realm"; =155= $realm =~ tr/A-Za-z0-9//cd; =156= $realm =~ /(\w+)/ or die "empty realm"; =157= return $realm = $1; # untaint =158= } =159= } =160= =161= BEGIN { # database things =162= my ($filename, $db_handle); =163= my (@data, %info_for, %seen_user, %seen_hexhash); =164= =165= sub connect_to_database { =166= $filename = "$DIR/".get_realm(); =167= { =168= my $h = IO::File->new("< $filename") or die "cannot open $filename: $!"; =169= flock($h, 2); =170= ## because this file might be renamed underneath us: =171= my @handle_stat = stat $h; =172= my @file_stat = stat $filename; =173= redo if $handle_stat[0] != $file_stat[0]; =174= redo if $handle_stat[1] != $file_stat[1]; =175= $db_handle = $h; # fall out =176= } =177= for (@data = <$db_handle>) { =178= my ($email, $keys, $user, $pw) = split; =179= for ($info_for{lc $email} ||= {}) { =180= $_->{email} = $email; =181= $_->{keys} = $keys; =182= $_->{user} = $user if $user; =183= $_->{pw} = $pw if $pw; =184= } =185= $seen_user{$user}++ if $user; =186= $seen_hexhash{$1} = $email if $pw and $pw =~ /-(.*)/; =187= } =188= } =189= =190= sub info_for_email { =191= my $email = shift; =192= =193= $info_for{lc $email}; =194= } =195= =196= sub update_info_for_email { =197= my ($email, $user, $encrypted_password) = @_; =198= =199= for ($info_for{lc $email}) { =200= defined $_ or die "shouldn't happen - undef lc $email info"; =201= $_->{user} = $user; =202= $_->{pw} = $encrypted_password; =203= } =204= my $tmp = "$filename.tmp"; =205= { =206= my $h = IO::File->new("> $tmp") or die "Cannot create $tmp: $!"; =207= chmod +(stat($db_handle))[2], $tmp or warn "Cannot chmod $tmp: $!"; =208= for my $email (sort keys %info_for) { =209= print $h join(" ", map { =210= exists $info_for{$email}{$_} ? $info_for{$email}{$_} : () =211= } qw(email keys user pw)), "\n"; =212= } =213= } =214= rename $tmp, $filename or warn "Cannot rename $tmp to $filename: $!"; =215= } =216= =217= sub seen_user { =218= $seen_user{+shift} ? 1 : 0; =219= } =220= =221= sub seen_hexhash { =222= $seen_hexhash{+shift}; # returns email key =223= } =224= }