Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.
Download this listing!

Web Techniques Column 67 (Nov 2001)

[suggested title: Customer Surveys and Writing XML]

Within the realm of good customer service, getting feedback and responding to it in a timely manner is important. A webform is a simple way to do this, but there's all that nagging stuff about validating the form fields and writing the data in a nice clean way for later processing.

Well, I gave that task a little thought, and decided that I could use a table-driven survey form generator, similar to the approach I took in this column in the past [See Programming in Perl, August 1999]. But that approach didn't go far enough, because it didn't permit me to validate the form responses, so I thought about how to add some hooks to handle that when needed.

Also, I started by thinking that I was going to write the survey forms by appending to a flat file (properly flocked so that we get clean writes). But as I pondered the format in which to save the form fields, it dawned on me that a simple XML structure would provide the right clues as to which field went with which value, and the ability to escape special appropriate characters. As I kept working on that, I realized that this would also mean that standard XML tools could be used for data transformation and reduction, and I was quickly convinced that I'd hit pay dirt. Conveniently, the XML::Simple module in the CPAN permits me to construct a typical hash/array-ref tree, and worry about the XML conversion at the last minute (if ever).

And the result is in [listing one, below].

Lines 1 through 3 begin nearly every CGI program I write, turning on taint checking, warnings, compiler restrictions, and disabling the buffering of standard output.

Line 5 pulls in Lincoln Stein's module, along with all of the included shortcuts. Line 6 enables CGI errors to be shown on the browser, very handy while debugging, but a security leak if enabled in production, so remember to remove something like this before deploying this code. Line 7 defines the constants needed for file locking.

The configurable parts begin in line 9. Line 11 is the location of the ever-growing output file. This file needs to be writable by the web userid, unless this script is running set-uid.

Line 13 begins the questions for this survey. Each element of @QUESTIONS is an arrayref, pointing at an array containing (in order): the human-readable label for the question, the param name, the shortcut to create the form field, an arrayref containing any optional parameters for that shortcut (other than name, which is supplied automatically), and optionally, a coderef of a subroutine to run to validate this field, and a flag indicating that multiple response elements for this field are permitted. These fields are discussed in detail when we look at the code that examines them.

Line 62 displays the CGI/HTTP header, and the beginning of the HTML, including a page title, as well as a first-order header indicating the same within the window.

Lines 64 to 95 handle any potential response, inside an eval block so that we can use a die within to abort this part of the operation early and cleanly. If the code executes to its completion, there's nothing to say but ``thank you'', handled in lines 108 to 110.

Otherwise, we display the form, annotated as indicated by the die return value (captured in line 96) in lines 97 to 107. Let's look at that first. Line 98 extracts a prefix from the error message that has been coded to contain the field name (if any) that has been found to be in error, into the $flag variable. We'll use this to turn that row pink in the output.

The remaining error message is displayed in line 99, above the form which is also begin in this line.

Lines 100 to 106 print a table to position the elements of the form. Each row consists of two cells: the label on the left, and the form field on the right. The rows are constructed by scanning through the elements of @QUESTIONS, extracting the label, param name, shortcut function, and any additional arguments (in line 102). The row is turned pink if the param name is equal to the flagged line (line 103). The left cell (label) is generated as a TH element in line 104, with an appropriate alignment.

But the real fun comes in line 105, which looks deceptively simple. The $func variable contains a coderef for a formfield shortcut, such as textfield or popup_menu. Since all of these shortcuts take a named argument list including the name of the param as name, we can call them all the same way, passing the param name for this field, followed by any additional arguments obtained by dereferencing the $opts variable. These additional arguments can be used to override default values for width and height, or provide the items for a radio button group. Simple, but extremely powerful. (I actually danced a little happy jig for a few moments when I got this part of it figured out.)

The form is displayed on the very first invocation of this program, and on any subsequent invocations when $error was set. The form response comes back to this script, but now we go back to line 65 to see where it goes from there.

Line 65 creates a hash of arrayrefs, keyed by a param name, and values of all the param values for that param name. If there are no params provided (such as on the first invocation of this script), the die in line 66 aborts this upper block, and defines the introductory message to show above the empty form as described earlier.

If we have at least some form data, then lines 68 to 77 that data, one field at a time. Each form question arrayref is expanded into its fields: the label, param name, shortcut function, shortcut additional options, and the most important parts for this step, the coderef of the validator, and a multivalue-permitted flag.

Line 70 pulls out the param values for the param name given in $name. Lines 71 and 72 reject any multivalued parameter that has not been permitted to be multivalued. This is probably as a result of someone faking up a form submission, and not as a result of a normal operation.

After determining that a single-valued parameter is not somehow multivalued, the next step is to verify the presence of a specific validator in line 73. If this validator is not present, then the field is presumed valid no matter what value has been provided (useful with optional fields, for example).

Line 74 sets up a temporary value for $_ to either a scalar containing the field value, or a reference to the list of field values if a multivalued parameter is permitted. I found this easier for the validators to execute regular expressions against a ``regional'' $_ value rather than always having to shift @_ or access $_[0]. Line 75 executes the validator, passing it a reference to the %results hash for those rare times when the validity of one field depends on the state of another.

If the validator runs to completion, then the $@ is noticed as being unset in line 76, and we go on to the next field. However, if the validator dies, the error string is prepended with the name of the current field, bracketed by newlines, which are then noticed by the form printing routine as the field name of the line to be flagged in pink. See, it all fits together!

If we run all the validators, and everything looks good, we continue on down to line 81, which begins another inner eval block, this time just to catch errors that we don't expect. It'd be bad to have an unexpected error now show up as a message above an empty form! This block of code dumps the data as XML to a logfile.

Line 83 pulls in the XML::Simple module (found in the CPAN). We don't do this on every hit, because there's a bit of unneeded expense on the invocations that won't even be thinking about XML.

Line 84 converts the %results hash into a valid XML textstream, reflecting the responses in a way that we can extract needed information using any of the popular XML processors. Lines 85 through 88 append this textstream to the $DATAFILE file, locking the file for exclusive access during the write.

For debugging purposes, I also took the XML in line 91, and encoded it for HTML display, wrapped it in a pre shortcut, and put it in a nice boxed single-cell table. This was great while I was testing, to ensure that the resulting XML for a particular form was valid as well as being parseable (and it showed off the XML for the people helping me test the script). Obviously a real program wouldn't do this.

Line 93 takes any (unexpected) error resulting from the eval block in lines 81 to 92 and dumps it to STDERR, just as if the eval block were not there.

And that wraps up the description of the main logic of the program, but a lot of this program lives in the @QUESTIONS data structure, so lets go back and examine that in more detail now.

The first field in line 15 is a simple textfield, named name, with an appropriate label, and optional parameters specifying an on-screen size of 60. The resulting construct acts as if we had directly invoked:

  textfield(-name => 'name', size => 60);

to generate the text field. Perfect.

Lines 16 to 19 set up the second field (called email), also a textfield with a size of 60. However, I've also included a validation subroutine. When the form is submitted, the value of the form field appears in $_ (which can never be an arrayref, because we haven't enabled this as a multivalue field). Thus, the two regular expressions in line 18 which examine $_ are testing the proper data. (For simplicity, I decided that a non-blank email field that didn't contain an @ is not a valid email address.) If the test fails, the die aborts this check, as well as the overall validation pass, causing this form row to be highlighted in pink upon resubmission.

Lines 20 to 23 create a popup menu. The values are given in line 21, which will also set the default to the first item. On form submission, we ensure that this is not the chosen one, and kick it back to the user if so.

Line 24 is another simple textfield, with a different width just to show some variety. Lines 25 to 28 similarly are another popup menu.

Lines 29 to 42 form two related fields. First, we have a checkbox group that may return multiple values selected simultaneously (flagged as the 1 in line 34). These are organized in a single column (line 33). Next, if the checkbox item for Other is selected, we have a textarea for the customer to fill out as to the precise nature of ``other''.

There's no validation for the checkbox group; however, we need ensure that there's some data in the ``other'' description if Other is picked, and this is handled in lines 38 to 41. We use the first (and only) parameter passed into the subroutine, which is a reference to the entire %results hash. Digging down through there, if we see that Other is picked for the checkbox group above, then we need to also have a non-empty value for this text area, and fail otherwise. Most fields won't have such a strong coupling: this is the exception, not the rule, and I'm comfortable with just letting the one subroutine paw through the master data to ensure that the world is sane. If you have ideas for a cleaner and yet flexible interface, let me know.

Just to show a variety of widgets, I include a radio group for the field defined in lines 43 to 48, and a scrolling list in lines 49 to 56. The scrolling list defaults to the first item, including the word Choose. The validation subroutine in line 54 ensures that this is not one of the actual chosen items, meaning the customer had not even looked at this field (or else had reset it, but I can't see why someone would have done that). And finally, line 57 defines a plain text area.

And there you have it. A fairly generic survey form, validating the various form fields in a table-driven manner, along with an XML output module so we can transform or summarize the data using a wide array of XML processing tools. Now there's no excuse for you not to get customer feedback. Until next time, enjoy!


        =1=     #!/usr/bin/perl -Tw
        =2=     use strict;
        =3=     $|++;
        =5=     use CGI qw(:all);
        =6=     use CGI::Carp qw(fatalsToBrowser);
        =7=     use Fcntl ':flock'; # import LOCK_* constants
        =9=     ## CONFIG
        =11=    my $DATAFILE = "/home/merlyn/Web/customer_survey-data";
        =13=    my @QUESTIONS =
        =14=      (
        =15=       ['Name (optional)', 'name', \&textfield, [qw(size 60)]],
        =16=       ['Email (optional)', 'email', \&textfield, [qw(size 60)],
        =17=        sub { die "Please include a full email address!\n"
        =18=                if /\S/ and not /\@/ },
        =19=       ],
        =20=       ['Product', 'product', \&popup_menu,
        =21=        [values => ['Please Choose One', 'thx-1138', 'hal 9000']],
        =22=        sub { die "Please choose a product!\n" if /choose/i },
        =23=       ],
        =24=       ['Model (if applicable)', 'model', \&textfield, [qw(size 30)]],
        =25=       ['Overall impression', 'overall', \&popup_menu,
        =26=        [values => ['Please Choose One', qw(Excellent Good Fair Poor)]],
        =27=        sub { die "Please choose an impression!\n" if /choose/i },
        =28=       ],
        =29=       ['Reason for product choice (choose all that are applicable)',
        =30=        'chose_because',            # referenced below
        =31=        \&checkbox_group,
        =32=        [values => [qw(Price Salesman Quality Performance Reliability Other)],
        =33=         cols => 1],
        =34=        undef, 1
        =35=       ],
        =36=       ['Other reason for product choice (if applicable)', 'chose_because_other',
        =37=        \&textarea, [qw(rows 2 columns 60)],
        =38=        sub {
        =39=          die "Please give your other reason...\n" if not /\S/
        =40=            and grep /other/i, @{shift->{chose_because}};
        =41=        },
        =42=       ],
        =43=       ['Quality', 'quality', \&radio_group,
        =44=        [values => ['Please Choose One', qw(Excellent Good Fair Poor)],
        =45=         cols => 1,
        =46=        ],
        =47=        sub { die "Please choose a quality!\n" if /choose/i },
        =48=       ],
        =49=       ['Area of use (choose all that apply)', 'area', \&scrolling_list,
        =50=        [values => ['Please Choose One Or More', qw(Home School Office)],
        =51=         size => 4,
        =52=         multiple => 'true',
        =53=        ],
        =54=        sub { die "Please choose an area!\n" unless grep !/choose/i, @$_ },
        =55=        1,
        =56=       ],
        =57=       ['Comments', 'comments', \&textarea, [qw(rows 10 columns 50)]],
        =58=      );
        =60=    ## END CONFIG
        =62=    print header, start_html("Customer Survey"), h1("Customer Survey");
        =64=    eval {
        =65=      my %results = map { $_ => [param($_)] } param;
        =66=      die "Please fill out this form...\n" unless %results;
        =68=      for (@QUESTIONS) {
        =69=        my ($label, $name, $func, $opts, $validator, $multi) = @$_;
        =70=        my @values = @{$results{$name} || []};
        =71=        die "\n$name\nToo many values, try again...\n"
        =72=          if @values > 1 and not $multi;
        =73=        next unless $validator and ref $validator eq "CODE";
        =74=        local $_ = $multi ? \@values : $values[0];
        =75=        eval { $validator->(\%results) };
        =76=        die "\n$name\n$@" if $@;
        =77=      }
        =79=      ## made it past the errors, so save it
        =81=      eval {
        =82=        ## so that these don't trigger outer error
        =83=        require XML::Simple;
        =84=        my $out = XML::Simple::XMLout(\%results);
        =85=        open OUT, ">>$DATAFILE" or die "Cannot append to $DATAFILE: $!";
        =86=        flock OUT, LOCK_EX;
        =87=        print OUT $out;
        =88=        close OUT;                  # and release lock
        =90=        ## DEBUG
        =91=        print table({border => 1}, Tr(td(pre(escapeHTML($out)))));
        =92=      };
        =93=      print STDERR $@ if $@;        # if that last thing errored
        =95=    };
        =96=    my $error = $@;
        =97=    if ($error) {
        =98=      my $flag = ($error =~ s/^\n(\S+)\n//) ? $1 : "";
        =99=      print p($error), start_form;
        =100=     print table({border => 0, colspacing => 0, colpadding => 2},
        =101=                 map {
        =102=                   my($label, $name, $func, $opts) = @$_;
        =103=                   Tr({$name eq $flag ? (bgcolor => '#ffcccc') : ()},
        =104=                      th({align => 'right', valign => 'top'}, $label),
        =105=                      td($func->(-name => $name, @$opts)));
        =106=                 } @QUESTIONS);
        =107=     print submit, end_form;
        =108=   } else {
        =109=     print p("Thank you!");
        =110=   }
        =111=   print end_html;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at or +1 503 777-0095, and welcomes questions on Perl and other related topics.