Copyright Notice

This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.

This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.

Please read all the information in the table of contents before using this article.

Web Techniques Column 38 (Jun 1999)

CGI programs permit the web surfer to perform actions affecting the state of things on the web server. In many cases, the web server machine is also where the neat stuff lives, so that the CGI script has direct access to files, processes, and databases to fully respond to the request.

However, sometimes, the request needs to be shipped off, from the machine handling the CGI query, to a machine that can actually handle the query. For example, if you rent space on a virtual server to handle your customers' page-fetching activities, and also have CGI scripts to let them ask questions or order products, it's possible that the order entry system actually lives on another machine not directly accessible to the CGI scripts. How do we let the CGI system access the order entry system?

Well, one way is to let the CGI script perform the normal data entry validation, retrying the form until the data is correct as far as can be determined without connecting with the order entry system, and then bundle up a mail message to the order entry system. This mail message contains all the parameters, so that the order entry system can effectively ``process the form'', albeit remotely.

Now, there's nothing terribly tricky about that, until you start considering that form data might be tricky to flatten out into a mail message that survives one or more hops through potentially hostile mail systems. You could construct an ad-hoc encoding scheme so that each field's value is clearly delimited and protected so that odd characters (like embedded newlines or characters that you are using for delimiters) don't trip up the system.

But there's an easier way. The standard CGI.pm module provides a clean, complete, mail-safe encoding for all the parameters of a form, and easy methods to load and store that data! This means we can construct a form the usual way with the CGI module, and when the time comes to remotely execute the action, just send some mail. This mail includes the form, encoded by CGI's save routine, in the body of the message. At the receiving end, we'll use CGI.pm to decode the body of the message, and we instantly have access to the form elements with param, just as if the second program was running on the web server.
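
To see what that encoding buys us, here is a minimal sketch of the round trip, assuming CGI.pm is installed. The field names and values are made up for illustration, and an in-memory filehandle stands in for the mail message.

```perl
use strict;
use warnings;
use CGI qw(-no_debug);

# Hypothetical form data, including characters that would break a naive
# "name=value" line format: an embedded newline, "=" and "&".
my $form = CGI->new({
    name  => "Fred Flintstone",
    notes => "line one\nline two = tricky & odd",
});

# Save the parameters into an in-memory filehandle instead of a pipe.
my $saved = '';
open my $out, '>', \$saved or die "Cannot open in-memory handle: $!";
$form->save($out);
close $out;

# Load them back, as the receiving program would.
open my $in, '<', \$saved or die "Cannot reopen saved data: $!";
my $copy = CGI->new($in);
close $in;

my $before = $form->param('notes');
my $after  = $copy->param('notes');

print $saved;
print "round trip ok\n" if $before eq $after;
```

The saved text is one name=value pair per line with the awkward characters escaped, followed by a line containing only =, which marks the end of the record.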

For this to work, it must be possible to recognize that a given mail message is one of interest (and not just some random spam, yuck!) as well as having a way to trigger a Perl program when the mail arrives. For example, you could dedicate a mail address as a form recipient, or perhaps just notice (using procmail, or MailAgent, or any of a dozen other tools) that a particular subject or body line is present.

So, let's illustrate this process with a toy example. Let's say I have a web form that accepts some information to let someone subscribe to Perl Hackers Weekly. But the subscription information needs to be added to a database on the machine that delivers the messages, not directly accessible from the web server, except by email.

We'll start with the CGI script, in [listing one, below].

Lines 1 through 3 begin many of the Perl scripts I write. We're enabling taint checks (good for CGI scripts), warnings, and compiler restrictions (mandatory variable declaration, no barewords, and no soft references). Also, because I'm launching a process while within taint mode, I've got to set a PATH, even though it really doesn't affect the outcome here.
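
As a rough sketch of that environment cleanup, the fragment below sets PATH and also removes a few other shell-related variables, which is the additional step the perlsec documentation recommends. The listing only sets PATH; the delete is my addition. The ${^TAINT} variable simply reports whether the script was started with -T.

```perl
use strict;
use warnings;

# Under -T, Perl refuses to launch external programs while PATH is still
# whatever the web server handed us, so set it to a known value.
$ENV{PATH} = "/bin:/usr/bin:/usr/ucb";

# These variables can also change how a shell behaves, so drop them.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Report whether taint mode is active (true only when started with -T).
print "taint mode is ", (${^TAINT} ? "on" : "off"), "\n";
```

Run it once as perl -T and once without to see the difference in the report; the environment cleanup is harmless either way.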

Line 5 pulls in the CGI module, defining the standard shortcuts.

Lines 6 through 9 print the top of the HTML response to the invocation of this CGI script. The title (as given in start_html) is the same as the first heading (given to h1).

Lines 10 through 17 direct the top-level logic for this CGI script. If the call to validate_form returns some sort of error, we show the form (including the error message), otherwise we thank the user and mail the subscription request off to the machine handling the subscription database. And when that's done, the program is done, because it's all subroutines from there down.

Lines 19 through 39 define the show_form subroutine. First, in line 20, we take that first parameter (probably an error message) and hang on to it in $error. The rest of the subroutine prints out the HTML for the form.

Lines 22 and 38 print horizontal lines above and below the form, to delimit the form visually (a nice design touch). Lines 24 and 37 similarly delimit the form in the HTML world. The default method and action (POSTing the form to the same URL as the script) works fine here. Line 36 adds a submit button.

Lines 25 through 35 produce the labels and fields for the desired information for this form. I'm using a map here to generate many two-element rows in a table. Each row consists of a text paragraph as a label, and a textfield as a form value. Proper use of map can save much time when generating forms. If I had a few more form fields than this, I probably would have created a here-document and split each line to get the fieldname and the label text. In fact, I was already thinking that when I typed this one in.
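
For the curious, the here-document variation might look something like the sketch below. It only builds the list of fieldname and label pairs; the layout of one field per line, name first, is my own assumption about how I would have written it.

```perl
use strict;
use warnings;

# One field per line: the form field name, some whitespace, then the label.
my @fields = map { [ split ' ', $_, 2 ] } split /\n/, <<'END';
name      Name
address1  Address1
address2  Address2
city      City
state     State
zip       Zip code
dayphone  Daytime phone number
email     email address
END

# Each element is now [fieldname, label], ready to feed to a map that
# builds the table rows.
for my $field (@fields) {
    my ($name, $label) = @$field;
    printf "%-10s %s\n", $name, $label;
}
```

Note that the pairs come out as fieldname then label, the reverse of the order used in the listing, so the map that builds the rows would swap its two subscripts.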

Lines 41 through 53 validate the form data. This routine needs to stay in close synchronization with the previous subroutine, because the fieldnames must be coordinated. So be careful when you go in for maintenance on this stuff. Additionally, we can't validate anything that requires access to the subscription database, because that's on a separate system, so we're somewhat limited in options here.

Line 42 returns immediately if none of the parameters are filled out, with a polite message. This happens on the first invocation of the form, so it's not fair to chastise the user (yet!).

Lines 43 through 47 see if required fields are missing. Note that a real program might validate the data a little better than just non-false, as I'm doing here. In fact, to illustrate that, let's validate the format of the email address. This is particularly critical because we'll be using the email address at the subscription database machine to send a confirmation email message back to the submitter of this form.
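
If the list of required fields grows, the same check can be driven from a table. This is a sketch under my own assumptions, not what the listing does: first_missing is a hypothetical helper, and it takes a code reference for looking up a field so that it can be tried without a web server. It also treats a value of "0" as present, which the plain truth test in the listing does not.

```perl
use strict;
use warnings;

# Required fields in the order they should be reported, each with its message.
my @required = (
    [ name     => "Missing name" ],
    [ address1 => "Missing address 1" ],
    [ city     => "Missing city" ],
    [ state    => "Missing state" ],
    [ email    => "Missing email" ],
);

# Return the first error message, or undef if every required field is there.
# $get maps a field name to its value; in the CGI script it could be
# sub { scalar param($_[0]) }.
sub first_missing {
    my ($get) = @_;
    for my $rule (@required) {
        my ($field, $message) = @$rule;
        my $value = $get->($field);
        return $message unless defined $value && length $value;
    }
    return;
}

# Try it with made-up data that lacks a state and an email address.
my %sample = (name => 'Fred', address1 => '301 Cobblestone Way', city => 'Bedrock');
my $error = first_missing(sub { $sample{ $_[0] } });
print defined($error) ? "$error\n" : "all required fields present\n";
```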

Lines 49 through 51 validate the email address, using the Email::Valid module found in the CPAN (www.cpan.org). Please do not attempt to do this on your own. If you think you know what you're doing, please remember that nearly any character of the entire printable character set can appear in the local address (to the left of the @). If that's a surprise to you (and you thought it was just the alphanumerics or something), then go get Email::Valid instead.

If we make it through the validation, line 52 returns undef, triggering the caller into the ``everything's OK'' response.

Lines 55 through 57 display a short message if we got a good response.

Lines 59 through 72 create an encoded form, encapsulate it as the body of an email message, and mail it off to the subscription database machine. Lines 60 and 61 fire up a sendmail process -- you can also use the mailtools in the CPAN as an alternate way to send mail.

Lines 63 through 65 add the header to the mail message. Here I'm sending mail to myself (although I've mangled the address slightly because people sometimes run these scripts unaltered, and I don't really need any more mail). Also, I've got a very specific subject line. This subject line is how I'll recognize that the mail is for my subscription database handler, and not just a random piece of mail. The mail being sent will appear to be from me as well, although the mail's ``envelope from'' will be the userid of the executing CGI script.

Line 69 tells CGI.pm to add an encoded version of the form parameters to the filehandle opened onto sendmail. Yes, this entire example was just to show this one line, so look at it carefully to get your full money's worth.

Finally, lines 70 and 71 close down the sendmail connection, which completes the mail body, and causes the message to be sent. If there's an error, we'll die with it, and that'll end up in an error log somewhere (I hope).
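
Here is a sketch of the same idea with the message-writing step split out, so it can be tried without actually sending mail. The write_request helper and the example.com addresses are placeholders of mine; the subject line matches the one in the listing.

```perl
use strict;
use warnings;
use CGI qw(-no_debug);

# Write a complete request message (header, blank line, encoded form)
# to any filehandle. The addresses are placeholders, not real ones.
sub write_request {
    my ($fh, $query) = @_;
    print {$fh} "To: subscribe-handler\@example.com\n",
                "From: webserver\@example.com\n",
                "Subject: //SUBSCRIBE//FORM//\n",
                "\n";                 # blank line ends the header
    $query->save($fh);                # body: the encoded parameters
}

# In the CGI script, the filehandle would be a pipe to sendmail.
# The list form of open bypasses the shell:
#   open my $sm, '|-', '/usr/lib/sendmail', '-oi', '-t'
#     or die "Cannot launch sendmail: $!";
#   write_request($sm, CGI->new);
#   close $sm or die "sendmail exited with $?";

# For a dry run, capture the message in a string instead.
my $message = '';
open my $fh, '>', \$message or die "Cannot open in-memory handle: $!";
write_request($fh, CGI->new({ name => 'Fred', email => 'fred@example.com' }));
close $fh;
print $message;
```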

So, a user fills out the form, submits it (possibly multiple times until the fields are valid), and then gets a thank you note immediately. We then get a mail message off into the mail system. What next?

Well, we need to recognize the mail at the receiving host as being something worth processing. In this case, since all my incoming mail is initially handled by procmail, I'll add a few lines to my .procmailrc file that look like this:

    :0
    * ^Subject:.*//SUBSCRIBE//FORM//
    | $HOME/bin/Subscribe-handler

This particular procmail recipe identifies any message with the subject of //SUBSCRIBE//FORM//, and invokes a program from my personal program directory called Subscribe-handler, placing the mail message on the standard input of that program.

And Subscribe-handler looks like [listing two, below].

Once again, lines 1 through 3 start pretty much the same way. I'm not using taint mode here, but I am unbuffering standard output (for reasons I no longer remember; it probably made sense while I was hacking on earlier versions of this program, and it does no harm).

Line 5 pulls in CGI.pm again. However, note that we are not using it in its normal CGI mode, so we'll add -no_debug to the import list, so that we don't get messages prompting us to provide form data from standard input.

Lines 7 through 9 skip over standard input until we see a blank line, which moves us past the header and into the body.

Line 11 is the other reason for this whole example. We can create a CGI query object from this incoming mail message's body in just one line. So, again, stare carefully there to get your full money's worth. After this executes, $q->param('email') here is the same as param('email') in the original CGI script, even if there are odd characters or multiple values!
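
The receiving side can be tried on its own as well. In this sketch, a made-up message held in a string plays the part of standard input; the addresses and field values are placeholders.

```perl
use strict;
use warnings;
use CGI qw(-no_debug);

# A hypothetical incoming message, as procmail would pipe it to the handler.
my $message = <<'END';
From: webserver@example.com
Subject: //SUBSCRIBE//FORM//

name=Fred%20Flintstone
email=fred%40example.com
=
END

open my $in, '<', \$message or die "Cannot read message: $!";

# Skip the header: everything up to the first blank line.
while (<$in>) {
    last if /^\s*$/;
}

# Whatever is left on the handle is the saved form.
my $q = CGI->new($in);
close $in;

my $name  = $q->param('name');
my $email = $q->param('email');
print "name=$name email=$email\n";
```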

Lines 13 through 18 ``process'' the fields. Here, I'm just taking all the parameters and dumping them to a single line on standard error, which shows up in my procmail log. Had this not been a toy program, I would have done a database update, or reformatted the data for further processing.

Lines 20 through 35 respond back to the original user with a mail message confirming the operation. Note that the body of this message might include things like a subscriber number or anything else that resulted from the query, just like returning a result form to a user (without the HTML links).

And there you have it... a CGI script that executes ``remotely'', on a different machine from where the action is really happening. For real applications, you might want to revalidate the data in the receiving program, or possibly add encryption (or at least replay prevention) to the message. But hopefully this example gives you enough background to do the job. Enjoy!
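
As one possible starting point for that hardening, here is a sketch that signs the message body with a timestamp and a shared secret, using the Digest::SHA module. None of this is in the listings: sign_body, verify_body, the secret, and the two-line prefix format are all assumptions of mine. A time window only limits how long a captured message can be replayed; it does not prevent replay within the window, and a production version would also want a constant-time comparison.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical shared secret, known only to the two machines.
my $secret = 'change-me';

# Sender: prefix the body with a timestamp line and a signature line.
# The signature covers both the timestamp and the body.
sub sign_body {
    my ($body, $now) = @_;
    my $mac = hmac_sha256_hex("$now\n$body", $secret);
    return "$now\n$mac\n$body";
}

# Receiver: return the body if the signature matches and the message is
# no older than $max_age seconds; otherwise return undef.
sub verify_body {
    my ($signed, $now, $max_age) = @_;
    my ($stamp, $mac, $body) = split /\n/, $signed, 3;
    return unless defined $body && $stamp =~ /\A\d+\z/;
    return unless $mac eq hmac_sha256_hex("$stamp\n$body", $secret);
    return if $now - $stamp > $max_age;
    return $body;
}

# Sign a tiny saved form, then check it ten seconds later.
my $signed = sign_body("name=Fred\n=\n", time);
my $body   = verify_body($signed, time + 10, 300);
print defined($body) ? "accepted\n" : "rejected\n";
```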

Listings

        =0=     ##### LISTING ONE #####
        =1=     #!/usr/bin/perl -Tw
        =2=     use strict;
        =3=     $ENV{PATH} = "/bin:/usr/bin:/usr/ucb";
        =4=     
        =5=     use CGI qw(:standard);
        =6=     print
        =7=       header,
        =8=       start_html("Subscribe to Perl Hackers Weekly"),
        =9=       h1("Subscribe to Perl Hackers Weekly");
        =10=    if (my $error = validate_form()) {
        =11=      show_form($error);
        =12=      print end_html;
        =13=    } else {
        =14=      show_thank_you();
        =15=      print end_html;
        =16=      mail_request();
        =17=    }
        =18=    
        =19=    sub show_form {
        =20=      my $error = shift;
        =21=      print
        =22=        hr,
        =23=        ($error ? p($error) : ()),
        =24=        start_form,
        =25=        table(map
        =26=              Tr(td($_->[0]), td(textfield($_->[1],"",0,60))),
        =27=              ["Name", "name"],
        =28=              ["Address1", "address1"],
        =29=              ["Address2", "address2"],
        =30=              ["City", "city"],
        =31=              ["State", "state"],
        =32=              ["Zip code", "zip"],
        =33=              ["Daytime phone number", "dayphone"],
        =34=              ["email address", "email"],
        =35=             ),
        =36=          submit,
        =37=          end_form,
        =38=          hr;
        =39=    }
        =40=    
        =41=    sub validate_form {
        =42=      return "Tell us about you..." unless param(); # show initial form
        =43=      return "Missing name" unless param("name");
        =44=      return "Missing address 1" unless param("address1");
        =45=      return "Missing city" unless param("city");
        =46=      return "Missing state" unless param("state");
        =47=      return "Missing email" unless param("email");
        =48=      ## verify valid email addr syntax
        =49=      require Email::Valid;
        =50=      return "Bad email address syntax"
        =51=        unless Email::Valid->address(param("email"));
        =52=      return;                       # undef says good
        =53=    }
        =54=    
        =55=    sub show_thank_you {
        =56=      print p("Thank you! You should receive an email confirmation shortly.");
        =57=    }
        =58=    
        =59=    sub mail_request {
        =60=      open SM, "|/usr/lib/sendmail -oi -t"
        =61=        or die "Cannot launch sendmail: $!";
        =62=      print SM <<END;
        =63=    To: merlyn\@stonehenge.con
        =64=    From: merlyn\@stonehenge.con
        =65=    Subject: //SUBSCRIBE//FORM//
        =66=    
        =67=    END
        =68=    
        =69=      $CGI::Q->save(\*SM);
        =70=      close SM;
        =71=      die "sendmail exited with $?" if $?;
        =72=    }

        =0=     ##### LISTING TWO #####
        =1=     #!/usr/bin/perl -w
        =2=     use strict;
        =3=     $|++;
        =4=     
        =5=     use CGI qw(-no_debug);
        =6=     
        =7=     while (<STDIN>) {
        =8=       last if /^\s*$/;
        =9=     }
        =10=    
        =11=    my $q = CGI->new(\*STDIN);
        =12=    
        =13=    ## process the data
        =14=    my $subscriber = join ":",
        =15=      map $q->param($_), 
        =16=      qw(name address1 address2 city state zip dayphone email);
        =17=    
        =18=    print STDERR "$subscriber\n";   # debugging
        =19=    
        =20=    ## send a confirmation
        =21=    my $email = $q->param('email');
        =22=    
        =23=    open SM, "|/usr/lib/sendmail -oi -t"
        =24=      or die "Cannot launch sendmail: $!";
        =25=    print SM <<END;
        =26=    To: $email
        =27=    From: merlyn\@stonehenge.con
        =28=    Subject: Thank you for your subscription!
        =29=    
        =30=    You will now receive Perl Hackers Weekly!
        =31=    
        =32=    END
        =33=    
        =34=    close SM;
        =35=    warn "sendmail exited with $?" if $?;
        =36=    exit 0;

Randal L. Schwartz is a renowned expert on the Perl programming language (the lifeblood of the Internet), having contributed to a dozen top-selling books on the subject, and over 200 magazine articles. Schwartz runs a Perl training and consulting company (Stonehenge Consulting Services, Inc of Portland, Oregon), and is a highly sought-after speaker for his masterful stage combination of technical skill, comedic timing, and crowd rapport. And he's a pretty good Karaoke singer, winning contests regularly.

Schwartz can be reached for comment at merlyn@stonehenge.com or +1 503 777-0095, and welcomes questions on Perl and other related topics.