Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Web Techniques Column 49 (May 2000)

[suggested title: Self-registering password protection, part 1]

For the most part, the web is about sharing. Sharing what you have with as many people as possible, all-comers accepted. But sometimes, you have stuff that you want to share with a smaller community of people. ``No problem'', I might say. My Apache web server has the ability to use ``basic authentication'', compatible with all the popular browsers, that allows me to restrict access to those who know the username and the password.

Well, ``yes problem'', I then say. I can give all members of my group the same username and password. But then when a member leaves, I have to update all the remaining members on a new password. And I can't tell if the web area is being used by everyone, or just a few, since the username is the only identifier I have. OK, then the alternative is to give each user their own password. Ugh. I have a tough enough time coming up with interesting unguessable memorable passwords for my own access areas, and now I have to come up with 10 or 100 others? Blech.

Well, then, let's let the users pick their own usernames and passwords. After all, it works for most of the sites out there. But how does it work for those sites out there? Let's take a look.

First, although I'm using the ``basic authentication'' protocol -- the kind that pops up the little box in the browser looking for a username and password -- I won't be using the traditional ``htpasswd'' files on the server side. I'm going to invent my own database that relates four items: (1) an email address, (2) a set of ``keys'' that the user owns, (3) the basic-auth username, and (4) the encrypted basic-auth password. Note that the normal ``htpasswd'' files contain only the last two items. A sample file might look like this:

  merlyn@stonehenge.com admin,stoners,perl merlyn xyF9kYWtJIFZ6
  larry@wall.org perl lwall lwUHddn0dCD1I
  fred@flintstone.comm stoners fredf p7xf.fFgemuWM
  barney@rubble.nett stoners rubble 7zCcbFTfzvWJ6

The ``keys'' fit the ``locks'' that protect a particular area, described below. Here, both Larry and I can access anything protected with perl as the lock, while Fred, Barney and I can get into the stoners section, and I alone have the admin key.
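To make the file format concrete, here's a minimal standalone sketch (not part of the handler) that parses the sample lines above and reports who holds which key. The names parse_entry and holds_key, and the hash field names, are my own inventions for illustration.

```perl
use strict;
use warnings;

# Parse one database line into a record. Field names are illustrative.
sub parse_entry {
    my ($line) = @_;
    my ($email, $keys, $user, $pw) = split ' ', $line;
    return {
        email => $email,
        keys  => [ split /\W+/, defined $keys ? $keys : '' ],
        user  => $user,   # undef until the user registers
        pw    => $pw,     # crypt()ed password, undef until registered
    };
}

# Does this record hold the named key?
sub holds_key {
    my ($entry, $key) = @_;
    return scalar grep { $_ eq $key } @{ $entry->{keys} };
}

my @entries = map { parse_entry($_) } (
    'merlyn@stonehenge.com admin,stoners,perl merlyn xyF9kYWtJIFZ6',
    'larry@wall.org perl lwall lwUHddn0dCD1I',
    'fred@flintstone.comm stoners fredf p7xf.fFgemuWM',
    'barney@rubble.nett stoners rubble 7zCcbFTfzvWJ6',
);

for my $key (qw(perl stoners admin)) {
    my @holders = map { $_->{user} } grep { holds_key($_, $key) } @entries;
    print "$key: @holders\n";
}
```

A line holding only an email address and a key list parses the same way, leaving user and pw undefined.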

Next, I'll extend my Apache server, using mod_perl, to look for and parse these files. I do this with a Perl ``authentication'' and ``authorization'' handler, which replaces or extends Apache's built-in mechanisms to do such. For each section to protect, I'd create an .htaccess file (or perhaps a Location directive in the master configuration file) that looks something like this:

  PerlAuthenHandler Stonehenge::RestrictToList
  AuthName "Friends of Randal Schwartz"
  AuthType Basic
  require stoners
  <Files admin.*>
  require admin,stoners
  </Files>

Now, when this area is visited for the first time by Fred, he'll see a pop-up box asking for his ``Friends of Randal Schwartz'' username and password, which hopefully he'll enter as fredf and whatever encrypts to that string above. Fred still can't access the files that begin with admin though, since he won't have the required admin key.

So far, it doesn't look like anything more than what the web server offers, but here's where it gets interesting.

Suppose instead of giving Fred's username and password, I just left that blank, so the line has only the email address and the list of keys to give that email address. In fact, I'd place a list of all the email addresses for people that I'll let have the stoners key.

Of course, the authentication could not succeed, so we'd normally bounce to that ``authentication denied'' page. But we're going to trap that. We'll have a CGI script waiting there that prompts for an email address, and a requested username and password. If the email address is found, and the username is not already taken, we'll modify Fred's entry from:

  fred@flintstone.comm stoners

to:

  fred@flintstone.comm stoners fredf p7xf.fFgemuWM-abc123456

Note the extra information after ``-''. This information will be mailed to the email address at the front of the line, in the form of a URL to visit, something like:

  Please visit
  http://www.stonehenge.com/cgi/restricttolist?realm=abc&verify=abc123456
  to complete your registration.

And then Fred will hopefully do that, and the same CGI script will edit the database to remove the (matching) part after the dash, and we're done! Each user gets to pick their own username and password, and we have verified that only list members can successfully access the protected area (unless they can intercept mail, but then all bets are off on mail-based authentication anyway).
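The real CGI script is next month's topic, so what follows is only my guess at the two line edits just described, not the script itself. The function names add_pending and confirm are hypothetical, and the sketch works on a single line of text rather than on the database file.

```perl
use strict;
use warnings;

# Record a requested username and crypt()ed password on an unregistered
# line, tagging the password with "-TOKEN" until the address is verified.
# Returns the new line, or undef if the entry is already registered.
sub add_pending {
    my ($line, $user, $crypted_pw, $token) = @_;
    my @fields = split ' ', $line;
    return undef unless @fields == 2;      # email and keys only
    return join ' ', @fields, $user, "$crypted_pw-$token";
}

# Strip a matching "-TOKEN" suffix once the emailed URL is visited.
# Returns the confirmed line, or undef if the token does not match.
sub confirm {
    my ($line, $token) = @_;
    my ($email, $keys, $user, $pw) = split ' ', $line;
    return undef unless defined $pw and $pw =~ s/-\Q$token\E\z//;
    return join ' ', $email, $keys, $user, $pw;
}

my $pending = add_pending('fred@flintstone.comm stoners',
                          'fredf', 'p7xf.fFgemuWM', 'abc123456');
print "$pending\n";
print confirm($pending, 'abc123456'), "\n";
```

This prints the pending line and then the confirmed line from the example above. The suffix is easy to strip because a traditional crypt string uses only letters, digits, ``.'' and ``/'', never a dash.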

Now, the CGI script to do the database editing is a large one, so I'm putting it off until next month, but let's get the mod_perl part out of the way first, as shown [listing one, below].

Line 1 is a reminder that we enable this form of authentication and authorization with a PerlAuthenHandler directive in some configuration file, such as an .htaccess file. Although we specify only the authentication hook, there's a component here that also triggers during the authorization phase, pushed beginning in line 56, described later.

Line 3 puts the rest of the file into a package of my choosing, typical for mod_perl handlers. Because the package namespace is global, I choose a private sub-namespace (beginning with Stonehenge::) for all my privately created handlers. If this module were useful enough as-is for others, I'd request some entry in the Apache:: namespace instead from Doug and crew.

Line 4 enables use strict, particularly important for mod_perl handlers to ensure minimal chance of bad programming practices.

Lines 6 and 7 set up a $VERSION string, automatically updated as I perform RCS checkins with this file.

Lines 11 and 12 provide the ``user configuration'' section. While I intend my programs to merely be ``proof of concept'' snippets, I like to try to isolate anything that's most obviously going to need adjustment right up front. Line 11 is the Unix path to the directory containing the realm databases, as described earlier. Line 12 defines the ``authentication failed'' handler as a URL. More on that later.

Lines 16 through 18 pull in the mod_perl modules that we'll need for this handler. More on those as they get used.

Line 20 begins the handler subroutine definition. It's simplest to call this handler, since that's the name mod_perl looks for when the directive names only a package. Line 21 is my ``reload if changed'' handler. If you're borrowing this for your own use, just delete the line and remember to restart your server when you change the file.

Lines 23 and 24 set up two commonly needed values. $r is the traditional variable for the ``request object'', used by a handler to access the Apache API. And $log will be used to send messages to the error log as needed.

Lines 26 to 30 interact with the API that handles ``basic authentication''. The call to get_basic_auth_pw returns either an error code, or OK (which we pretend we don't know is just a 0), along with the password used. For non-OK results, we return that immediately. If there's a valid ``basic auth'' password, that ends up in $sent_pw.

Lines 31 and 32 get the other pieces of the ``basic auth'' triad: the user name and the protection realm.
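The handler never sees the raw header, because Apache decodes it before handing over the username and password. Purely as background, here's a sketch of what a basic-auth credential looks like on the wire: the word Basic followed by the base64 of username:password. The function name decode_basic_auth and the password yabbadabba are made up for the example.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Split an "Authorization: Basic ..." header value into user and password.
# Returns an empty list if the value is not a Basic credential.
sub decode_basic_auth {
    my ($header) = @_;
    my ($b64) = $header =~ /\ABasic\s+(\S+)\z/i or return;
    my ($user, $pw) = split /:/, decode_base64($b64), 2;
    return ($user, $pw);
}

# Build the header a browser would send, then take it apart again.
my $header = 'Basic ' . encode_base64('fredf:yabbadabba', '');
my ($user, $pw) = decode_basic_auth($header);
print "$user / $pw\n";
```

Since base64 is an encoding and not encryption, the password crosses the network essentially in the clear on every request.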

Lines 34 to 43 open up the database for the selected protection realm. First, we construct the name of the file in $name by taking the realm name, ripping out anything that's not alphanumerics, and then prepending the directory name. While the protection realm comes from a configuration file (typically an .htaccess file), we must be cautious that we don't create a filename that might access information outside the directory, thus the need for ripping out anything that is not alphanumeric.
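Here's the filename construction pulled out into a standalone sketch so you can try it on hostile input. The function name realm_to_path is mine; the tr operation and the directory are the ones from the listing.

```perl
use strict;
use warnings;

my $DIR = "/home/merlyn/Web/RestrictToList";   # as in the listing

# Map a realm name to a database path, keeping only ASCII alphanumerics.
sub realm_to_path {
    my ($realm) = @_;
    (my $name = $realm) =~ tr/A-Za-z0-9//cd;
    return "$DIR/$name";
}

print realm_to_path("Friends of Randal Schwartz"), "\n";
print realm_to_path("../../etc/passwd"), "\n";
```

The first call yields a file named FriendsofRandalSchwartz in that directory, and the second yields etcpasswd in the same place, with the dots and slashes gone. One consequence of stripping this aggressively is that two realm names differing only in spacing or punctuation would share a database.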

Lines 38 to 42 open up the selected file. If the file does not exist or cannot be opened, we'll do a few standard things. The note_basic_auth_failure call in line 39 ensures that no-one else trusts the user name provided in line 31. Line 40 gives the webmaster a bit of a clue about what went wrong this particular time, and line 41 returns that nasty ``error 500'' to the user. We're justified in this for this particular case, because we've got a mismatch between a realm name provided in a configuration file, and a file that should have been provided for that realm. It would not be nice to return such an error for incorrect user input instead.

So, by the time we hit line 45, we've got a realm name, a user name, a password, and we've got the database open successfully. Next, lines 45 through 50 handle what to do if there's an authorization or authentication failure. We'll set up a custom response from a constructed URL, based on the $FAIL configuration parameter.

Line 46 sets up an Apache::URI object. Lines 47 and 48 create a realm parameter equal to the realm we've verified, and line 49 uses the Apache API to set this as the backstop in case we have to bail from an authentication or authorization error. We do this in a block, so that the temporary variable created in line 46 has a quick chance to go back to the bit bucket.
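The query string built in lines 47 and 48 is worth a closer look: it hex-escapes every byte of the realm name, not just the unsafe ones. Here's that expression as a standalone sketch, along with a decoder of my own to show it round-trips. The names realm_query and decode_realm are hypothetical.

```perl
use strict;
use warnings;

# Encode every byte of the realm as %XX, as the listing does.
sub realm_query {
    my ($realm) = @_;
    return join "", "realm=", map "%$_", unpack("H*", $realm) =~ /(..)/g;
}

# Reverse the encoding (roughly what the receiving script must do).
sub decode_realm {
    my ($query) = @_;
    (my $realm = $query) =~ s/\Arealm=//;
    $realm =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $realm;
}

my $query = realm_query("abc");
print "$query\n";
print decode_realm(realm_query("Friends of Randal Schwartz")), "\n";
```

For the realm abc, this gives realm=%61%62%63. Escaping everything is wasteful but simple, and it means spaces and punctuation in a realm name need no special handling.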

Line 52 starts the meat of the handler. It's time to walk through the database to see if this particular user and password first can be found, and then can be permitted into this area. As each line is read into $_, we split it in line 53 to extract the email address (not used here), the keys permitted, the username, and the password. If a line does not yet have a username or password, it is quickly skipped in line 54, as are lines that don't match the presented username.

Line 55 encrypts the presented password to see if it's the same as the encrypted password in the database. If that's so, we have an authenticated user (they have a valid user and password for this realm), and it's time to go on to the next phase (literally). Lines 56 through 73 set up the handler for the authorization phase (described in a moment). Line 74 returns OK indicating the authentication phase is complete.

If the username matches, but the password is a mismatch, we'll bail in lines 76 to 78, noting why for the error log. If the username is not found at all in the database, lines 81 to 83 handle that.
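The lookup and password check can be tried outside the web server with a sketch like the following. The function name check_login, the password wilma, and the salt p7 are all made up for the example, and I'm assuming a system whose crypt accepts traditional two-character salts.

```perl
use strict;
use warnings;

# Find $sent_user among database lines and check $sent_pw against the
# stored crypt() string. Returns "ok", "bad password", or "unknown user".
sub check_login {
    my ($lines, $sent_user, $sent_pw) = @_;
    for my $line (@$lines) {
        my ($email, $keys, $user, $pw) = split ' ', $line;
        next unless $user and $user eq $sent_user;
        my $try = crypt($sent_pw, $pw);     # stored string supplies the salt
        return (defined $try and $try eq $pw) ? "ok" : "bad password";
    }
    return "unknown user";
}

# Build a throwaway database; the salt and the password are invented.
my $stored = crypt("wilma", "p7");
my @db = (
    'barney@rubble.nett stoners',                  # not yet registered
    "fred\@flintstone.comm stoners fredf $stored",
);
print check_login(\@db, "fredf",  "wilma"), "\n";
print check_login(\@db, "fredf",  "betty"), "\n";
print check_login(\@db, "rubble", "betty"), "\n";
```

Passing the stored string as the second argument to crypt is the usual idiom: crypt reads the salt from the front of it, so the comparison works without storing the salt separately. Barney's unregistered line has no username, so rubble is reported as unknown.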

If the authentication phase returns OK, Apache proceeds to the authorization phase, and the pushed handler from lines 57 to 73 kicks in, most likely before any other authorization handler gets a chance at it. Because this is a closure, we have access to the lexical variables established at the time the closure was created, including the keys extracted from the database. The keys are turned into a hash in line 58 for quick comparison.

Lines 60 through 69 walk through the ``requirements'' that the configuration file (typically .htaccess) provides for this area. The text following require on each requirement is extracted into $op in line 62.

Line 63 gives an automatic OK to authorization if valid-user is given, to emulate the built-in basic authorization.

If the requirement is not simply valid-user, then we have one or more ``locks'' for which the user must hold all the ``keys''. The locks are determined in line 64. Lines 65 through 67 verify that the user has all the required keys. If a user fails to hold a particular key, we'll bump onto the next potential requirement instead.

If we make it to line 68, we are staring at an authenticated user who is indeed authorized to be in this particular area. And we say so. On the other hand, if we drop out of the entry loop, we didn't get a valid combination of keys and locks (or unlocked doors), and we say so in lines 70 through 72.
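The key-and-lock logic is easy to test on its own. Here's a sketch that takes the key list from a database line and the strings that follow require, and returns true or false instead of Apache status codes. The function name authorized is my own; the loop mirrors lines 58 through 69.

```perl
use strict;
use warnings;

# $keys is the comma-separated key list from the database line;
# @requirements are the strings following "require" in the configuration.
sub authorized {
    my ($keys, @requirements) = @_;
    my %keys = map { $_ => 1 } split /\W+/, $keys;
  ENTRY:
    for my $op (@requirements) {
        return 1 if $op eq 'valid-user';
        for my $lock (split /\W+/, $op) {
            next ENTRY unless $keys{$lock};    # missing a key: try next entry
        }
        return 1;                              # held every key for this entry
    }
    return 0;
}

print "fred, stoners area: ",  authorized('stoners', 'stoners'), "\n";
print "fred, admin files: ",   authorized('stoners', 'admin,stoners'), "\n";
print "merlyn, admin files: ", authorized('admin,stoners,perl', 'admin,stoners'), "\n";
```

Fred gets 1 for the stoners area and 0 for the admin files, while my own key ring gets 1 for both. Separate requirements are alternatives, so a holder of only the perl key passes when given the two requirements stoners and perl.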

And that's that! Stick this module into the @INC path of my web server, and I'm ready to start protecting directories with this extended database format.

Of course, the real fun won't come in until the autoregistration CGI script is complete. But I've run out of room in this column, so until part 2 next time, enjoy!

Listings

        =1=     ## PerlAuthenHandler Stonehenge::RestrictToList
        =2=     
        =3=     package Stonehenge::RestrictToList;
        =4=     use strict;
        =5=     
        =6=     use vars qw($VERSION);
        =7=     $VERSION = (qw$Revision$ )[-1];
        =8=     
        =9=     ## config
        =10=    
        =11=    my $DIR = "/home/merlyn/Web/RestrictToList";
        =12=    my $FAIL = "/cgi/restricttolist";
        =13=    
        =14=    ## end config
        =15=    
        =16=    use Apache::Constants qw(:common);
        =17=    use Apache::URI;
        =18=    use Apache::File;
        =19=    
        =20=    sub handler {
        =21=      use Stonehenge::Reload; goto &handler if Stonehenge::Reload->reload_me;
        =22=    
        =23=      my $r = shift;
        =24=      my $log = $r->log;
        =25=    
        =26=      my $sent_pw = do {
        =27=        my ($result,$pw) = $r->get_basic_auth_pw;
        =28=        return $result unless $result == OK;
        =29=        $pw;
        =30=      };
        =31=      my $sent_user = $r->connection->user;
        =32=      my $auth_name = $r->auth_name;
        =33=    
        =34=      my $db_handle = do {
        =35=        my $name = $auth_name;
        =36=        $name =~ tr/A-Za-z0-9//cd;
        =37=        $name = "$DIR/$name";
        =38=        Apache::File->new("<$name") or do {
        =39=          $r->note_basic_auth_failure;
        =40=          $r->log_reason("no database for $auth_name ($name)");
        =41=          return SERVER_ERROR;
        =42=        };
        =43=      };
        =44=    
        =45=      {
        =46=        my $error_uri = Apache::URI->parse($r, $FAIL);
        =47=        $error_uri->query(join "", "realm=",
        =48=                          map "%$_", unpack("H*",$auth_name) =~ /(..)/g);
        =49=        $r->custom_response(AUTH_REQUIRED, $error_uri->unparse);
        =50=      }
        =51=    
        =52=      while (<$db_handle>) {
        =53=        my ($email, $keys, $user, $pw) = split;
        =54=        next unless $user and $user eq $sent_user;
        =55=        if ($pw eq crypt($sent_pw,$pw)) {
        =56=          $r->push_handlers
        =57=            (PerlAuthzHandler => sub {
        =58=               my %keys = map { $_, 1 } split /\W+/, $keys;
        =59=             ENTRY:
        =60=               for my $entry (@{$r->requires}) {
        =61=                 ## entries are or'ed, locks are and'ed
        =62=                 my $op = $entry->{requirement};
        =63=                 return OK if $op eq 'valid-user';
        =64=                 my @locks = split /\W+/, $op;
        =65=                 for my $lock (@locks) {
        =66=                   next ENTRY unless $keys{$lock};
        =67=                 }
        =68=                 return OK;         # the someone we know is OK here
        =69=               }
        =70=               $r->note_basic_auth_failure;
        =71=               $r->log_reason("user $user not keyed for ", $r->uri);
        =72=               return AUTH_REQUIRED;
        =73=             });
        =74=          return OK;                # they are somebody we know
        =75=        }
        =76=        $r->note_basic_auth_failure;
        =77=        $r->log_reason("password not valid for $user");
        =78=        return AUTH_REQUIRED;
        =79=      }
        =80=    
        =81=      $r->note_basic_auth_failure;
        =82=      $r->log_reason("username $sent_user not recognized");
        =83=      return AUTH_REQUIRED;
        =84=    }
        =85=    
        =86=    1;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.