Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.


Web Techniques Column 64 (Aug 2001)

[suggested title: Getting One-Click Processing]

One thing that seems to plague many web programmers is how to get ``one click'' processing. No, I'm not talking about Amazon's patented technology. I'm talking about that annoyance when a user clicks on the submit button two or three times before the response begins to change the browser, and you end up with a form that's been submitted multiple times.

This is no big deal if the request is idempotent: that is, a repeated request generates a consistent result and doesn't change the state of the world incrementally. However, many form submissions do indeed intend to change the state of the world somehow. For example, a shopping cart will have a ``buy this item'' button, and multiple clicks might end up filling the cart with many copies of an item wanted only once. Or a survey form ends up voting multiple times for a particular choice. Or a guestbook or chat room gets multiple copies of the message.

The solution usually given in the communities I participate in includes some mention of JavaScript. I'm not a big fan of making any functionality depend on JavaScript; in fact, I'm a rather vocal opponent. No offense to the JavaScript people, but more people these days than ever before are turning JavaScript off, thanks to the repeated CERT warnings and security holes, not to mention those evil sites that pop up windows with advertising. And because of the security concerns, more companies are installing firewalls that strip the JavaScript at the corporate gateway as a protection against malice, so in many cases the user doesn't even have the choice to ``please enable your JavaScript''.

But luckily, there's a simple server-side solution that works with standard HTML. Simply include a unique serial number as you generate the form, and note that serial number in a lightweight server-side database. When the form is submitted, verify that the serial number is still in the database, and if so, process the form and remove the serial number. If the serial number is absent, redirect the user back to the form fillout page, or on to the next step if needed, since you've already processed the form.
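
The issue-then-consume cycle just described can be sketched in a few lines. This is only an illustration: the %pending hash is a hypothetical in-memory stand-in for the server-side database, and issue_serial and consume_serial are names made up for this sketch, not part of the listing.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the lightweight server-side database.
my %pending;
my $counter = 0;

# Called while generating the form: record a fresh serial number.
sub issue_serial {
    my $serial = join "-", time(), $$, ++$counter;   # unique, but NOT unguessable
    $pending{$serial} = 1;
    return $serial;
}

# Called on submission: true only the first time a known serial is seen.
sub consume_serial {
    my ($serial) = @_;
    return 0 unless defined $serial;
    return defined(delete $pending{$serial}) ? 1 : 0;
}

my $serial = issue_serial();
print consume_serial($serial) ? "process\n" : "redirect\n";   # first click
print consume_serial($serial) ? "process\n" : "redirect\n";   # second click
```

The first call prints ``process'' and the second prints ``redirect'', because the delete both tests for the serial number and removes it in one step.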

Although this technique requires a bit of work, it's actually rather painless to implement, adding maybe 25 lines of Perl code to your application.

But I also decided to take this a bit further. Sometimes when a script that both generates a form and processes its response is updated, a new field might be added, or the meaning of an existing field might be altered. If a form generated by the old version is still out there being filled out, its submission will arrive with parameters that don't quite fit, and you won't know it.

So as long as we're remembering via a hidden field that this form was generated recently, let's also remember the modification time of the script itself, and reject any invocations originally generated by an older version of the script! Sure, it's a bit annoying to the users, but much better than the annoyance of subtle (or dangerous) mismatches of parameter data.

In particular, we'll be noting three things from stat: the device number, the inode number, and the ``ctime'' value. A script cannot change without altering at least the last item (unless it changes twice within one second), so this is pretty robust.
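
As a small sketch of that idea, the three stat fields can be pulled out with a list slice. The helper name script_id is an assumption for this example; the listing does the same thing inline on $0.

```perl
use strict;
use warnings;

# Build an ID from the device (0), inode (1), and ctime (10) fields of stat().
# Returns nothing (undef in scalar context) if the file cannot be stat'ed.
sub script_id {
    my ($path) = @_;
    my @st = stat $path or return;
    return join ".", @st[0, 1, 10];
}

my $id = script_id($0);
print defined $id ? "$id\n" : "(no stat info for $0)\n";
```

Editing the file in place changes the ctime field, and replacing it with a new file typically changes the inode as well, so either kind of update yields a different string.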

And the program that does all of this is presented in [listing one, below].

Lines 1 through 3 start nearly every CGI program I write, turning on taint checking, warnings, compiler restrictions, and unbuffering standard output.

Line 5 pulls in the CGI.pm module, with all of the functions imported into the current (main) namespace.

Lines 7 through 19 set up the lightweight database. I'm using the very slick File::Cache module found in the CPAN, which stores temporary information with expiration times into /tmp in a nice controlled way. The author is updating this module to a generic architecture called Cache::Cache, but unfortunately a stable version of that module is not ready as I'm writing this column.

Lines 10 through 14 connect up with the ``database''. By default, the serial numbers will expire in one hour, which should be enough time for an unprocessed form to be filled out and still considered valid. Note that this time is significant only for forms that are presented to the user but never submitted, thus accumulating entries in the database. Forms that are properly submitted clear the serial number from the database immediately.

Lines 16 through 19 handle the occasional purging of the old entries. Once an hour, some lucky dog gets to paw through the entries to find the ones that have gone past their expiry. The marker entry set in line 18 expires after an hour like everything else in the cache, so the first request after that finds it missing and does the sweep. We don't do this on every open or every cache update (the default modes provided by File::Cache), because it's really more work than it needs to be. I asked the author of File::Cache to add this feature, and he's added it to Cache::Cache for me as a configuration option, so it won't have to be coded explicitly. Joy.
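
The same ``sweep at most once per interval'' pattern can be sketched without File::Cache. Everything here is hypothetical scaffolding for illustration: %expires_at plays the role of the cache, and a plain timestamp replaces the self-expiring marker entry that the listing uses.

```perl
use strict;
use warnings;

my %expires_at;        # key => epoch time at which the entry expires
my $next_purge = 0;    # epoch time before which the sweep is skipped

# Record an entry together with its expiry time.
sub remember {
    my ($key, $expiry) = @_;
    $expires_at{$key} = $expiry;
}

# Sweep expired entries, but at most once per $interval seconds.
# Returns the number of entries removed (0 when the sweep is skipped).
sub maybe_purge {
    my ($now, $interval) = @_;
    return 0 if $now < $next_purge;
    my @dead = grep { $expires_at{$_} <= $now } keys %expires_at;
    delete @expires_at{@dead};
    $next_purge = $now + $interval;
    return scalar @dead;
}

remember("form-a", time() - 10);     # already expired
remember("form-b", time() + 3600);   # still valid
print "removed: ", maybe_purge(time(), 3600), "\n";
```

Passing the current time in as an argument keeps the function easy to exercise with made-up clock values.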

Line 21 computes a ``script ID'': the string that will change only when the script is edited. Calling stat on Perl's $0 gets the info for the current script, from which we select the device, inode, and ctime numbers, which are then joined into a single string.

Line 23 prints the HTTP/CGI header, the HTML header, and an H1 header.

Lines 25 to 48 handle the invocation of the script when parameters are present, as in response to a form submission. Since that happens second, I'll come back to these in a moment.

Lines 50 to 68 handle the initial invocation of the script, where we generate the form with a nice unique serial number.

First, we get a serial number (which I'm calling ``session'' here because I cut-n-pasted this code from another program where it was the session) in lines 53 to 55. I'll shove it in as a parameter so that we can generate the hidden parameter easily as a ``sticky'' field. I'm using the MD5 algorithm from Apache::Session, which is apparently good enough for them so it's good enough for me, although if you're really paranoid you should check out Math::TrulyRandom in the CPAN.
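
For readers without the older MD5 module, here is a minimal sketch of the same recipe using Digest::MD5 and its md5_hex function instead. That substitution, and the helper name new_session_id, are assumptions of this example and not what the listing uses. As in the listing, the randomness comes from rand(), which is not cryptographically strong.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Return a 32-character hex string built from the clock, a fresh reference
# address, a random number, and the process ID, hashed twice as in the listing.
sub new_session_id {
    my $ref = {};    # stringifies to something like HASH(0x...)
    return md5_hex(md5_hex(time() . "$ref" . rand() . $$));
}

print new_session_id(), "\n";
```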

Line 58 updates the database (cache) by creating an entry keyed by the session (serial) ID. The value is set to the script ID, which we'll verify when the form is submitted to ensure that the same script that generated the form also is processing the form.

Lines 61 to 68 are your standard boring form stuff. In fact, I just copied this from the first few lines of CGI.pm's documentation. The only interesting thing I'm doing here is that I wanted a pop-up menu to also have an ``other'' entry, and I read about this trick recently. Simply make a textfield with the same name as the pop-up menu, and then process them together as a multi-valued field. We'll see how that works in a moment.

Note that the hidden field is included as part of the form in line 67. We'll be looking for that on the response to accept the form.

And as long as we're looking down here, line 72 prints the end of the HTML whether we're printing out the form or responding to it.

OK, back up to the form response part, starting in line 26. First, line 30 fetches that hidden serial number in the session parameter.

If that's defined, lines 31 to 34 look for the session number in the database (cache). Found or not, we then remove it, thus preventing any other submission with the same session key. There's a small race condition here, but I've tried to keep the fetch and the remove pretty close together to minimize the window. If you're concerned about that, replace the File::Cache database with a database that can handle atomic test-and-update.
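
One way to get a test-and-update that only a single request can win is to lean on the filesystem. The sketch below is an assumption-laden illustration, not a drop-in replacement for File::Cache: it keeps each session as a file in a hypothetical directory and relies on rename being atomic on a local POSIX filesystem (network filesystems may behave differently). The names store_token and claim_token are made up for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Store a token as a file whose content is the script ID.
sub store_token {
    my ($dir, $token, $script_id) = @_;
    open my $fh, ">", File::Spec->catfile($dir, $token) or die "store: $!";
    print {$fh} $script_id;
    close $fh or die "close: $!";
    return 1;
}

# Claim a token: returns the stored script ID exactly once, undef otherwise.
sub claim_token {
    my ($dir, $token) = @_;
    # Accept only 32 hex characters, so a forged value cannot name another path.
    return unless defined $token and $token =~ /\A[0-9a-f]{32}\z/;
    my $path    = File::Spec->catfile($dir, $token);
    my $claimed = "$path.claimed.$$";
    rename $path, $claimed or return;    # fails once the original name is gone
    open my $fh, "<", $claimed or return;
    my $id = do { local $/; <$fh> };     # slurp the stored script ID
    close $fh;
    unlink $claimed;
    return $id;
}
```

A caller would compare the returned value against the current script ID, just as line 34 does with the value fetched from the cache.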

Further, the value of that database entry must also match the script ID of the current script. If not, we've updated the script after the form was generated, and we thus will reject the form data as well.

But if that all matches up (first invocation of valid session, and script IDs match), then it's time to really process the data, starting in line 37.

Lines 39 and 40 grab the name, setting it to (Unspecified) if it wasn't there or was empty.

Line 41 grabs the color. Now recall that there are actually two fields called color in the form: the pop-up menu, and the text field following it. The result from param('color') will most likely be a two-element list. Using the grep, we remove -other- if it's present. Thus, if a color was selected on the pop-up, we now have that color plus perhaps whatever was typed in the box, but if ``-other-'' was selected, then we have just the value of the other box. We'll save the first element of that result into $color, and line 42 falls back to (Unspecified) if that's missing or empty. And thus, with a bit of magic, we've got the pop-up color, unless it's ``-other-'', in which case we've got the text field.

This all presumes that browsers return the values in the order specified on the form. That's merely a recommendation, if I recall, and a convention, so things could get messed up. (Please write me if you know otherwise, and I'll follow up in a future column!)
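
The selection logic of lines 41 and 42 can be tried outside of CGI by passing the values in directly. The wrapper pick_color is a hypothetical name for this sketch; its arguments stand in for what param('color') would return.

```perl
use strict;
use warnings;

# Pick the first value that isn't the "-other-" placeholder,
# falling back to "(Unspecified)" when that value is missing or empty.
sub pick_color {
    my @values = @_;
    my ($color) = grep { $_ ne '-other-' } @values;
    $color = "(Unspecified)" unless defined $color and length $color;
    return $color;
}

print pick_color('red', ''), "\n";          # prints "red"
print pick_color('-other-', 'teal'), "\n";  # prints "teal"
print pick_color('-other-', ''), "\n";      # prints "(Unspecified)"
```

The order dependence shows up if the values arrive reversed: pick_color('', 'red') yields ``(Unspecified)'', because the empty text field is then seen first.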

And once we've got the values, we ``process'' the data simply by sending it back out to the screen in a nice way in lines 43 and 44. Of course, in a real application, this is where the real meat would be. The values are HTML-escaped, because a user might have less-thans or ampersands in their name or color, and we wouldn't want that to mess up the output.
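
To see what the escaping buys, here is a minimal stand-alone sketch. The function escape_html is a simplified stand-in written for this example; the listing itself calls CGI.pm's escapeHTML.

```perl
use strict;
use warnings;

# Replace the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('Tom & <Jerry>'), "\n";   # prints "Tom &amp; &lt;Jerry&gt;"
```

Without this step, a name such as ``<b>'' would be interpreted by the browser as markup rather than displayed as text.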

Lines 46 and 47 are reached when a form is submitted without a valid session ID. This means that someone doubled up on the submit button quickly, or pressed reload on the form data, or the form was from more than an hour ago, or the form was generated by a prior version of this script, or the request was just faked up as a URL somewhere. In this case, we inform the user, and provide a link to start over.

And there you have it. ``One Click'' (no trademark here) processing, along with solving a couple of other interesting issues about pop-up forms and script maintenance. Until next time, enjoy!

Listings

        =1=     #!/usr/bin/perl -Tw
        =2=     use strict;
        =3=     $|++;
        =4=     
        =5=     use CGI qw(:all);
        =6=     
        =7=     ## set up the cache
        =8=     
        =9=     use File::Cache;
        =10=    my $cache = File::Cache->new({namespace => 'surveyonce',
        =11=                                  username => 'nobody',
        =12=                                  filemode => 0666,
        =13=                                  expires_in => 3600, # one hour
        =14=                                 });
        =15=    
        =16=    unless ($cache->get(" _purge_ ")) { # cleanup?
        =17=      $cache->purge;
        =18=      $cache->set(" _purge_ ", 1);
        =19=    }
        =20=    
        =21=    my $SCRIPT_ID = join ".", (stat $0)[0,1,10];
        =22=    
        =23=    print header, start_html("Survey"), h1("Survey");
        =24=    
        =25=    if (param) {
        =26=      ## returning with form data
        =27=    
        =28=      ## verify first submit of this form data,
        =29=      ## and from the form generated by this particular script only
        =30=      my $session = param('session');
        =31=      if (defined $session and do {
        =32=        my $id = $cache->get($session);
        =33=        $cache->remove($session);   # let this be the only one
        =34=        $id and $id eq $SCRIPT_ID;
        =35=      }) {
        =36=        ## good session, process form data
        =37=        print h2("Thank you");
        =38=        print "Your information has been processed.";
        =39=        my $name = param('name');
        =40=        $name = "(Unspecified)" unless defined $name and length $name;
        =41=        my ($color) = grep $_ ne '-other-', param('color');
        =42=        $color = "(Unspecified)" unless defined $color and length $color;
        =43=        print p, "Your name is ", b(escapeHTML($name));
        =44=        print " and your favorite color is ", b(escapeHTML($color)), ".";
        =45=      } else {
        =46=        print h2("Error"), "Hmm, I can't process your input.  Please ";
        =47=        print a({href => script_name()}, "start over"),".";
        =48=      }
        =49=    } else {
        =50=      ## initial invocation -- print form
        =51=    
        =52=      ## get unique non-guessable stamp for this form
        =53=      require MD5;
        =54=      param('session',
        =55=            my $session = MD5->hexhash(MD5->hexhash(time.{}.rand().$$)));
        =56=    
        =57=      ## store session key in cache
        =58=      $cache->set($session, $SCRIPT_ID);
        =59=    
        =60=      ## print form
        =61=      print hr, start_form;
        =62=      print "What's your name? ",textfield('name'), br;
        =63=      print "What's your favorite color? ";
        =64=      print popup_menu(-name=>'color',
        =65=                       -values=>[qw(-other- red orange yellow green blue purple)]);
        =66=      print " if -other-: ", textfield('color'), br;
        =67=      print hidden('session');
        =68=      print submit, end_form, hr;
        =69=    
        =70=    }
        =71=    
        =72=    print end_html;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.