Copyright Notice

This text is copyright by InfoStrada Communications, Inc., and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in Linux Magazine magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.
Download this listing!

Linux Magazine Column 26 (Jul 2001)

[suggested title: Simple online quiz technique (part 2 of 2)]

In last month's column, I described a program to rip through the screenit.com's database of movie reviews, extracting the ``profanity'' paragraphs detailing how nearly 1000 recent movies had used words that some parents might find offensive. This month, I'll look at a quiz engine that picks a movie at random, presents the profanity paragraph, and requests a multiple-choice response to test your knowledge of which movie that paragraph is describing.

This quiz engine differs from many I've seen in that it's nearly cheat-proof. To be a good quiz engine, it's got to randomize the questions and order of answers, and yet provide no clues in the HTML about which answer is correct. Further, to prevent cheating, we shouldn't be able to back up and select a different answer when we've selected the wrong one, or alter any hidden variables to redefine our score.

One solution to prevent cheating is to use some server-side information associated with a session ID, and send down only a session ID to the browser, either as a hidden field, or as part of any URL used in a response. This prevents tampering with either the score or the state of answering questions. These same techniques apply to good ``shopping cart'' design as well, so this isn't as insignificant an application as it might appear at first glance. The session ID should be random, using a relatively strong cryptographic selection process, to prevent ``hijacking'' some other users session.

Although I've written the quiz program as a CGI application, the same techniques would also apply to an Apache mod_perl handler. Speaking of which, the program is presented in [listing one, below].

Lines 1 through 3 start nearly every CGI program I write, enabling taint checking, warnings, compiler restricts, and disabling buffering on STDOUT.

Lines 6 and 7 define the only configuration constants I needed. The $DATA_DB corresponds to a dbmopen'able path to where I placed the database created by last month's program. $COUNT defines how many movie choices will be selected for each question. A value of 1 would be a trivial quiz (just select the only choice), while 5 or more would be formidable. As I played with this a bit, I felt that 3 was a nice challenging number.

Lines 10 and 11 bring in the CGI.pm module and its companion CGI::Carp, both included in the standard Perl distribution. Note I've selected fatalsToBrowser here, which is a security hole if left in production programs, so don't do this. I'm just experimenting, and I wanted the errors to show up in my browser rather than hunting them down in the server logs.

Lines 15 to 25 handle the server-side database. I'm using the very fine File::Cache module (found in the CPAN), which permits time-limited or size-bounded data to be stored (typically somewhere within /tmp/) on the server, keyed by an item of my choice. In this case, I've selected to have the session information expire in an hour, so as long as someone is answering new questions less than an hour apart, the data stays alive.

Lines 22 to 25 handle the occasional ``purge'' of the database. Every four hours, some lucky dog comes along when the special purge key (of my choosing) has expired. This triggers a cache purging which probably takes a few extra seconds, then a new purge key entry gets inserted to prevent this from happening for another four hours. You can tell File::Cache to do this on every access, but that seems excessive.

The File::Cache module is found in the CPAN (at many places such as www.cpan.org), but is being phased out for the more general Cache::Cache module, which at this writing has started entering alpha testing. I presume File::Cache will remain around for a while, but you might look at Cache::Cache instead if you are writing your own code.

Line 29 opens the database. Yes, I still use dbmopen mostly out of sheer simplicity, even though it claims to be deprecated.

Lines 31 to 37 define the ``session'' data being saved and loaded for every hit for a given quiz-taker. @this_keys is the movie choices being presented for this question. @unused_keys are the remaining movies to choose from (preventing duplicates). $winner is an index into @this_keys defining the winner. $answered and $correct maintain the score.

Line 39 prints the CGI header, the HTML header, and a nice h1 to start off the page.

Lines 42 to 52 pull up any existing session information. The session tag is a 32 character hex string (we'll see later how this gets generated) included as part of the GET request query parameters. If it's valid, and recent enough, the session variables get loaded from the cache. If anything fails here, $session remains undefined so that we start a new session below, presuming a new quiztaker.

If we're in a valid session, the code in lines 55 to 80 handles the incoming previous quiz guess. The value of $answer is expected to be the answer number followed by a dash and the selected answer. Normally, this answer number should be the same as our session data of $answered, and if so, lines 61 to 73 determine if it's the proper answer, displaying the appropriate text. The text includes a reference to screenit.com's web page for further verification, whether or not the answer is correct. The movie name is obtained via the database, looking at the first line of the value up to the newline.

However, if the incoming answer number is not the same as our session data's answer number, it means the quiz taker has either answered the question already, or is otherwise trying to cheat. In this case, the message in line 75 reminds them to stay on track.

If the session number is not valid, it's time to start a new session, handled by lines 82 to 86. We're using the MD5 module from the CPAN, and computing a session ID in the same way that the Apache::Session module (also in the CPAN) computes it. I'm not sure how secure this is, but if it's good enough for Apache::Session, it's good enough for me to steal for this program.

Lines 89 to 92 select a winner if there's not an active question, by drawing from the list of keys so far unused from the database.

Line 94 displays the header for the question part of the web page.

Lines 96 through 105 save the current session data into the cache by creating an anonymous hash and then storing it using the default expiration (one hour, specified earlier).

Lines 107 saves us a bit of time by caching the values for the list of keys for the current question choices out of the database.

Lines 109 to 111 give credit to screenit.com (in the form of an outbound link), and define the context of the question.

Lines 114 to 115 present the profanity paragraph from the database (via the local cache) in a bordered table with a single cell. The value consists of the title, a newline, and then the paragraph of profanity data (with embedded newlines) ripped from the screenit.com's pages. Since we don't want to give away the title here, we'll print everything after the first newline.

Lines 118 to 122 display the answer choices, as a ordered list. Each list item is a link back to this same CGI program (as captured in line 118) with the session information and the answer identifier included as query parameters. The link text is the title of the movie, given as the first line of the cached data value.

Lines 124 and 125 are commented out, but while I was testing, I was too lazy to think of what the right answer might be, so I made the program tell me.

Lines 127 to 136 append some boilerplate disclaimers to the bottom of the quiz question. Line 137 closes off the HTML. And we're done.

So, to recap, when the program is first invoked, there's no session data, so we create a session ID (line 83), and pull up a list of all movies in the database (line 84), and generate the first list of candidates (line 90) and select the winner (line 91). And all this is saved into server-side storage (line 104). Then the user is presented with the profanity paragraph for that winner (line 114), and a list of choices (line 119) from which to choose. These links lead back to another invocation of the same program, which then scores the choice (line 65) and repeats the process, picking a new set of choices, and updating the server-side storage.

Easy once you've seen it, but I wrangled with this thing for a few hours thinking through all the failure paths and cheater paths, and I think I've whacked out something that does the job pretty effectively.

Of course, this program can be extended in multiple ways. For example, a scoreboard of best scores could be maintained, or at least a way to associate a name with each score. And there's the little mess of what happens when all the movies are used up from one pass. Should the quiz be over? Also, I'm using up movies from the list as alternate choices, but maybe the incorrect choices should just be drawn at random from the master list, and not from the unused list. Lots of possibilities: it's all a matter of programming, as they say.

Hope you've had fun working with this quiz generator. Until next time, enjoy!

Listings

        =1=     #!/usr/bin/perl -Tw
        =2=     use strict;
        =3=     $|++;
        =4=     
        =5=     ## config
        =6=     my $DATA_DB = "/home/merlyn/Web/profanity_quiz";
        =7=     my $COUNT = 3;
        =8=     ## end config
        =9=     
        =10=    use CGI qw(:all);
        =11=    use CGI::Carp qw(fatalsToBrowser);
        =12=    
        =13=    ## set up the cache
        =14=    
        =15=    use File::Cache;
        =16=    my $cache = File::Cache->new({namespace => 'profanityquiz',
        =17=                                  username => 'nobody',
        =18=                                  filemode => 0666,
        =19=                                  expires_in => 3600, # one hour
        =20=                                 });
        =21=    
        =22=    unless ($cache->get(" _purge_ ")) { # cleanup?
        =23=      $cache->purge;
        =24=      $cache->set(" _purge ", 1, 3600 * 4); # purge every four hours
        =25=    }
        =26=    
        =27=    ## connect to the database
        =28=    
        =29=    dbmopen my %DATA, $DATA_DB, 0666 or die "Cannot open data: $!";
        =30=    
        =31=    ## session info
        =32=    my @unused_keys;
        =33=    my @this_keys;
        =34=    my $winner;
        =35=    my $answered;
        =36=    my $correct;
        =37=    ## end session info
        =38=    
        =39=    print header, start_html("Guess the profanity"), h1("Guess the profanity");
        =40=    
        =41=    ## first, pull up existing session data:
        =42=    my $session = param('session');
        =43=    if (defined $session and $session =~ /\A[0-9a-f]{32}\z/
        =44=        and my $data = $cache->get($session)) {
        =45=      @unused_keys = @{$data->{unused_keys}};
        =46=      @this_keys = @{$data->{this_keys}};
        =47=      $winner = $data->{winner};
        =48=      $answered = $data->{answered};
        =49=      $correct = $data->{correct};
        =50=    } else {
        =51=      undef $session;               # no good, so ignore
        =52=    }
        =53=    
        =54=    ## now handle form response if within a valid session:
        =55=    if ($session) {
        =56=      if (defined(my $answer = param('answer'))) {
        =57=        if (my ($guess_answered, $guess_guessed) = $answer =~ /(\d+)-(\d+)/) {
        =58=          if (0 <= $guess_guessed and $guess_guessed <= $#this_keys) {
        =59=            print h2("Scoring");
        =60=            if ($guess_answered == $answered) {
        =61=              $answered += 1;
        =62=              print "You guessed ",
        =63=                a({-href => "http://$this_keys[$guess_guessed]"},
        =64=                  $DATA{$this_keys[$guess_guessed]} =~ /(.*)/), ", ";
        =65=              if ($guess_guessed == $winner) {
        =66=                $correct += 1;
        =67=                print "which is correct!";
        =68=              } else {
        =69=                print "which is wrong. The correct answer is ",
        =70=                  a({-href => "http://$this_keys[$winner]"},
        =71=                    $DATA{$this_keys[$winner]} =~ /(.*)/), ".";
        =72=              }
        =73=              @this_keys = ();
        =74=            } else {
        =75=              print "You've already answered this!  Stop trying to cheat!";
        =76=            }
        =77=            print p("Your total score so far is $correct out of $answered.");
        =78=          }
        =79=        }
        =80=      }
        =81=    } else {                        # start a new session:
        =82=      require MD5;
        =83=      param('session', $session = MD5->hexhash(MD5->hexhash(time.{}.rand().$$)));
        =84=      @unused_keys = keys %DATA;
        =85=      $answered = $correct = 0;
        =86=      @this_keys = ();
        =87=    }
        =88=    
        =89=    unless (@this_keys) {           # pick a new question
        =90=      push @this_keys, splice @unused_keys, rand @unused_keys, 1 for 1..$COUNT;
        =91=      $winner = int rand @this_keys;
        =92=    }
        =93=    
        =94=    print h2("Show us how smart you are...");
        =95=    
        =96=    ## save session data for next hit:
        =97=    {
        =98=      my $data = {};
        =99=      @{$data->{unused_keys}} = @unused_keys;
        =100=     @{$data->{this_keys}} = @this_keys;
        =101=     $data->{winner} = $winner;
        =102=     $data->{answered} = $answered;
        =103=     $data->{correct} = $correct;
        =104=     $cache->set($session, $data);
        =105=   }
        =106=   
        =107=   my @this_values = @DATA{@this_keys}; # cache %DATA we need
        =108=   
        =109=   print
        =110=     "Which one of these movies had this profanity information at ",
        =111=     a({-href => 'http://www.screenit.com/'}, "screenit.com"), "?";
        =112=   
        =113=   ## pull up the profanity paragraph, boxed for easy reading:
        =114=   print table({-border => 1, -cellspacing => 0, -cellpadding => 5},
        =115=               Tr(td($this_values[$winner] =~ /^.*\n([\s\S]+)/)));
        =116=   
        =117=   ## show the choices, with links back to us including session tag:
        =118=   my $url = url;
        =119=   print ol(map {
        =120=     my ($title) = $this_values[$_] =~ /(.*)/;
        =121=     li(a({-href => "$url?session=$session&answer=$answered-$_"}, $title));
        =122=   } 0..$#this_values);
        =123=   
        =124=   ## (for debugging, because I was lazy... :-)
        =125=   ## print "\n(Hint: the answer is ", $this_values[$winner] =~ /(.*)/, ")\n";
        =126=   
        =127=   print h2("Disclaimer");
        =128=   
        =129=   print "All decisions of our judges are final. ",
        =130=     "Even if two movies have the same answer. ";
        =131=   
        =132=   print "And ",
        =133=     a({-href => 'http://www.screenit.com/'}, "screenit.com"),
        =134=     " had nothing to do with this program.  It's all ",
        =135=     a({-href => "/merlyn/"}, "my"), " fault. ";
        =136=   
        =137=   print end_html;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.