Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in WebTechniques magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Web Techniques Column 46 (Feb 2000)
[suggested title: Uploading files and sending MIME mail]
Most of us have a junk drawer in our house. The one with the random bits of discarded stuff that we hope might be useful in the future, like the last part of a roll of duct tape, a couple of nuts and washers that for some reason weren't needed when we reassembled the shelves this time, and so on. When something needs to get done, we rummage through the drawer looking for something of use, usually passing over the same useless items time after time, but every once in a while going ``Yeah, I'm so glad I saved that!'' when finding a match.
Well, as a software toolsmith, I have a virtual junk drawer as well. I collect little snippets of code that I see float by, in hopes of reassembling them into some useful tool someday. Recently, I needed to do some file uploads as well as learn how to send MIME mail, and three snippets I'd happened to save all came together in such a nice way that I thought I'd share the program with you. Of course, you can take the program as is, but I hope you instead throw this program into your ``virtual junk drawer'' so that if anyone asks you to upload a file, send a MIME attachment, or strip a macbinary resource fork, you'll find this gem in your drawer.
So, this month we'll do some simple task in a nice way. Upload a file, send it as email somewhere. And if the file happens to be uploaded as a macbinary encapsulated file, we'll even extract the data fork if requested. And that brings us to the program in [listing one, below].
Line 1 tells my Unix-compatible kernel where to find Perl and to turn on compile-time and run-time warnings. Line 2 enables the common restrictions: variables must be declared with `my`, soft references are not permitted, and barewords (Perl Poetry Mode) are disabled. Line 3 disables buffering on `STDOUT`, not particularly used here, but handy during development.
Lines 5 through 12 define the things that you'll most likely want to change to use this program. As in all my columns, this listing is meant to be a model, not something ready-to-run. You're supposed to steal the ideas, not the code. I've also altered these addresses slightly from what would be used in real life so that this program is harmless as-is. I've found that too many ``script kiddies'' just download these programs from my website and then run them without looking at what they contain.
Line 7 defines the `From:` address on the mail sent from this script. The name is arbitrary, but should be a valid address. If the destination address is unavailable, most likely the mail transfer agent (MTA) will bounce it back to this `From:` address. I usually use my email address here, or if I'm doing something relating to the website, my email role address of `webmaster` at the web box.
Line 8 similarly defines a destination address for mail sent by this script. If you have procmail, qmail, or some other mail-handling tool, and your mail server allows variant addressing, you can select a unique delivery address for all email coming from this script, and set up a specific handler for just these uploads.
Similarly, line 9 defines the subject line of the email text. Even if you can't use a variant delivery address, you might still be able to trigger some action on a unique subject line to your normal email address. So select the subject line wisely.
Line 10 is a boolean flag (non-zero for true, zero for false) to select whether to include the meta-data on the upload. If enabled, a separate text attachment is included in the mail with all the upload parameters (as reported by `CGI.pm`) and all the environment variables provided to the invocation. This is handy to track down exactly what this blob is that is being mailed to you, but it's not in any machine-readable format for this demonstration program. Most likely, you'll be interested in the reported filename, and the identity of the uploader (host and ``referer'').
Line 14 pulls in `CGI.pm`, enabling all the shortcuts without using the scary, messy, and funky ``object-oriented'' mode.
Line 16 triggers the fetching of the parameters, if any. This is needed to grab the uploaded file early in the program, and abort if the input parameters are wrong. Most likely that will be from an ill-formatted file upload, like perhaps a truncation. The upload data is being sent into a uniquely named file in `/tmp` (by default), so we aren't keeping the upload's contents in memory, just the name.
If there was an error gathering the upload data, `cgi_error` will report it, and lines 17 to 20 will generate an error response, rather than continuing with the rest of the program.
And now we get to the real code. Lines 22 to 37 print the upload form, regardless of whether or not we got an uploaded file on this invocation. That way, we can use the script to get the initial form, as well as continuing to encourage further uploads in succession.
Line 23 prints the HTTP header, identifying the content as HTML. Line 24 prints the HTML header, titling this response as simply `Upload`. Line 25 gives a first-level header of `Upload` as well. See, this proves this is just a demonstration: in real life, you'd be much more creative with the titles and headers. Or at least, I'd hope.
Lines 26 to 37 define the upload form. As is my convention, I enclose the form in horizontal rules. Line 27 is what distinguishes this form as an upload form rather than a normal form. Instead of declaring the encoding type to be `application/x-www-form-urlencoded`, we get `multipart/form-data`, which will generate a different response on the upload, allowing entire files to be uploaded efficiently as content instead of just form fields.
Lines 28 to 35 define the form contents, inside a table so we can control layout. (Don't tell any of my buddies that suggest that a table should be used solely for content description and not layout: I'll be excommunicated from their group.)
Line 29 is the upload file field. We give a name to the parameter (here: `uploaded_file`) but this parameter is not where the contents will be returned. They'll be retrieved specially using extra functions provided by `CGI.pm`. This field will show up as a filename box, probably associated with a `Browse...` button. The user will most likely press the button, bounce around for a file from their system, and then select or accept that file as the designated uploaded file.
Lines 30 and 31 define the type to use for email. If `text` is selected, the file is mailed as a `quoted-printable` representation: something that can be mostly discerned by the naked eye without resorting to massive computer power, although still binary-safe for odd characters. If `binary` is selected, the upload content is encoded in `base64`, much more efficient if the text contains many non-normal bytes (especially bytes in the 128 to 255 region). The resulting attachment will be identical either way once decoded: this is merely a selection of how readable it is in transit.
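To get a feel for that trade-off, here is a minimal sketch using the core `MIME::QuotedPrint` and `MIME::Base64` modules directly (the listing never calls them itself, since `MIME::Lite` handles the encoding). The `encoded_sizes` helper and both sample strings are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;
use MIME::QuotedPrint qw(encode_qp decode_qp);
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical helper: report the encoded length of a byte string
# under each of the two transfer encodings.
sub encoded_sizes {
  my $bytes = shift;
  return (qp     => length(encode_qp($bytes)),
          base64 => length(encode_base64($bytes)));
}

# Made-up samples: plain ASCII text, and a run of bytes in the
# 128 to 255 region.
my $text   = "Hello, world! Just plain text here.\n" x 10;
my $binary = join '', map { chr(128 + $_ % 128) } 0 .. 299;

for my $sample ($text, $binary) {
  my %size = encoded_sizes($sample);
  print "raw ", length($sample),
    ", quoted-printable $size{qp}, base64 $size{base64}\n";
  # Either way, decoding gives back the original bytes.
  die "qp round trip failed"
    unless decode_qp(encode_qp($sample)) eq $sample;
  die "base64 round trip failed"
    unless decode_base64(encode_base64($sample)) eq $sample;
}
```

On the plain text sample, quoted-printable should come out smaller and still be readable; on the high-byte sample, base64 should win comfortably, because quoted-printable spends three characters on each such byte.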
Line 33 selects whether to strip the Macintosh Resource Fork on uploads from Internet Explorer for the Mac. Mac IE insists on wrapping a plain data file in macbinary on uploads, even if there's nothing of interest in the resource fork. Of course, this confuses the heck out of the rest of the world and destroys cross-platform utility. So, if this box is selected, and the file came from IE (thankfully marked by an appropriate content type), then the datafork (what most people would call the contents of the upload) is extracted from inside the macbinary container and sent instead. Of course, Mac IE seems to provide no compatibility button to allow me to make it work like anything else on the Mac, so I can't turn this mode off. Grr.
Now comes the real business. If we were invoked with parameters, we're ready to receive the uploaded file and send it along its merry way. Line 40 pulls in the `MIME::Lite` module (found in the CPAN). I do it with a `require` here because I want it brought in only if we are doing an actual sending of mail, saving us time when all we're doing is generating the initial upload form.
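The same deferred-loading idea works for any module. Here is a minimal sketch that uses the core `MIME::Base64` module as a stand-in for `MIME::Lite`, so that it runs without anything from the CPAN; the `maybe_encode` helper is hypothetical.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: only pay for loading the module on the code
# path that actually needs it.
sub maybe_encode {
  my ($data, $want_encoding) = @_;
  return $data unless $want_encoding;
  require MIME::Base64;    # loaded at run time, and only the first time
  return MIME::Base64::encode_base64($data, "");
}

# %INC records which module files have been loaded so far.
print "loaded before: ",
  (exists $INC{'MIME/Base64.pm'} ? "yes" : "no"), "\n";
print maybe_encode("plain", 0), "\n";
print maybe_encode("plain", 1), "\n";
print "loaded after: ",
  (exists $INC{'MIME/Base64.pm'} ? "yes" : "no"), "\n";
```

Because `require` does not import anything, the function is called by its full package name here, just as the listing calls `MIME::Lite->new` as a class method.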
If we got a file upload parameter, the `upload` function given the parameter name (defined in the form above) returns the upload object into `$file` in line 41. This upload object is simultaneously the filename in a string context, or the filehandle when used as a filehandle. We can pass this string to `uploadInfo` to get the browser-provided information in line 42. We can use this to figure out what MIME type was reported, for example.
Lines 43 through 45 begin to define the outgoing email, using the `MIME::Lite` constructor of `new`. We'll create a message of type `multipart/mixed` as a container, and select an appropriate sender and receiver address, and subject line.
If the message should contain the meta-data, lines 46 to 56 include this as the first attachment (so it'll be at the top of the file for easy processing). The encoding of `7bit` ensures that this text is mostly readable. However, it's also a promise that no 8-bit data will be included, so `MIME::Lite` will strip such data, so use with caution.
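One way to act on that caution is to check the text before making the `7bit` promise. The following is only a sketch of that idea, not something the listing does: `pick_text_encoding` is a hypothetical helper, and the two sample lines are made up. The strings it returns are the same `Encoding` values the listing hands to `MIME::Lite`.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: choose a transfer encoding for a text part.
# '7bit' is only safe when every byte is in the range 0 to 127.
sub pick_text_encoding {
  my $text = shift;
  return $text =~ /[^\x00-\x7F]/ ? 'quoted-printable' : '7bit';
}

# Made-up sample lines in the style of the meta-data attachment.
print pick_text_encoding("REMOTE_HOST => host.example\n"), "\n";
print pick_text_encoding("filename => r\xE9sum\xE9.txt\n"), "\n";
```

The second sample contains bytes above 127 (an accented filename), which is exactly the kind of upload information that would otherwise be at risk of being stripped.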
Lines 49 through 54 define the contents of this attachment as computed data, here an arrayref of text elements. Each element of the hash referenced by `$info` is included, as well as each provided environment variable. This should give the recipient enough information to know how and why this file was uploaded.
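To see what those text elements look like, here is a small standalone sketch of the same `map` idiom. The `hash_to_lines` helper is hypothetical, and the sample hash is made up to stand in for whatever `uploadInfo` actually returns for a given browser.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: turn a hash reference into sorted
# "key => value" lines, the same shape the listing builds with map.
sub hash_to_lines {
  my $hashref = shift;
  return map { "$_ => $hashref->{$_}\n" } sort keys %$hashref;
}

# Made-up sample standing in for the browser-provided upload info.
my $info = {
  'Content-Type'        => 'text/plain',
  'Content-Disposition' =>
    'form-data; name="uploaded_file"; filename="notes.txt"',
};

# An array of lines like this is what gets wrapped in [ ... ] for Data.
my @data = ("Upload info:\n", hash_to_lines($info));
print @data;
```

Sorting the keys keeps the attachment stable from one upload to the next, which helps if you end up comparing two of these by eye.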
Lines 57 to 65 handle the core of this script's purpose: attaching the uploaded file. First, we'll select the encoding in lines 58 through 60. Then, we'll either include a reference to the file with `FH`, or a computed data string if the file is a macbinary file that needs to be stripped. If the `strip_resource_fork` box is checked on the upload, and the browser reported the file to be macbinary, we pass the filehandle to the `strip_fork_from_fh` subroutine defined below. This subroutine takes care of ripping out the datafork from the middle of the macbinary encapsulation.
Line 66 sends the email by connecting to the SMTP port on the local host. Some web servers don't run email, so you might need to change this to a cooperative distant host. But I didn't make this a configuration parameter because it's rare enough that I didn't want to worry about it. If the send is successful, we'll get a true value, and report `upload sent by email` to the browser.
If an error occurred during sending, I return the entire mail back to the user in a `pre` element. This is clearly not something to do in a production program, because it would require re-downloading the entire (possibly large) file just to say something broke. However, it was great while I was developing the program: I merely added a `0 &&` in front of the sending code, and this branch of the `if` was always taken, so that I could see the message that would have been sent without continually reinvoking my mail reading tool.
Line 75 wraps up the invocation. There's no executable code after this point: just the definition of the subroutine, so we're all done.
Lines 77 through 85 define the subroutine that extracts the data fork (what most people would call the real file contents) from the middle of the macbinary encapsulation (which also includes the resource fork and some metadata like type and creator). I got this code off the net, so if it's not entirely accurate, I'm sure one of you will tell me. Basically, the first 128 bytes appear to be a header, and bytes 84 through 87 appear to be a network-order integer value of how many bytes immediately following the header constitute the datafork. So, with the right manipulation, I'm done.
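To try that layout without a Mac handy, the sketch below builds a made-up container in memory and pulls the data fork back out. It assumes only what is described above (a 128-byte header with the length at bytes 84 through 87); the padding and the pretend resource fork are invented filler, and `data_fork_from_fh` is a hypothetical variant of the listing's subroutine with an added check for a truncated data fork.

```perl
#!/usr/bin/perl -w
use strict;

# Same idea as strip_fork_from_fh in the listing, plus a check that
# the whole data fork was really there.
sub data_fork_from_fh {
  my $fh = shift;
  binmode $fh;
  my $len = read $fh, (my $header), 128;
  die "short header" unless defined $len and $len == 128;
  # Skip 83 bytes, then take a 32-bit network-order integer.
  my $bytes = unpack "x83 N", $header;
  $len = read $fh, (my $data), $bytes;
  die "short data fork" unless defined $len and $len == $bytes;
  return $data;
}

# Build a made-up container: header, data fork, some filler, and a
# stand-in for the resource fork.
my $fork      = "Hello from the data fork\n";
my $header    = pack "x83 N x41", length $fork;    # 83 + 4 + 41 = 128
my $container = $header . $fork . ("\0" x 7) . "pretend resource fork";

# Open the string as an in-memory file and extract the data fork.
open my $fh, '<', \$container or die "cannot open in-memory file: $!";
print data_fork_from_fh($fh);
```

Only the data fork should be printed; everything after it in the container is ignored, just as in the listing.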
And there you have it, a simple demonstration of three things: how to upload a file, how to send mail with an attachment, and how to strip that annoying resource fork out of file uploads. Until next time, enjoy!
Listings
    =1=     #!/usr/bin/perl -w
    =2=     use strict;
    =3=     $|++;
    =4=
    =5=     ## configuration
    =6=
    =7=     my $FROM = 'webmaster@www.stonehenge.comXX';
    =8=     my $TO = 'merlyn+upload@stonehenge.comXX';
    =9=     my $SUBJECT = 'File upload';
    =10=    my $INCLUDE_META = 1;
    =11=
    =12=    ## end configuration
    =13=
    =14=    use CGI qw(:all);
    =15=
    =16=    my @params = param();
    =17=    if (my $error = cgi_error()) {
    =18=      print header(-status => $error);
    =19=      exit 0;
    =20=    }
    =21=
    =22=    print
    =23=      header,
    =24=      start_html("Upload"),
    =25=      h1("Upload"),
    =26=      hr,
    =27=      start_multipart_form,
    =28=      table(Tr(td(p('upload:')),
    =29=               td(filefield('uploaded_file'))),
    =30=            Tr(td(p('email as type:')),
    =31=               td(radio_group('type', [qw(binary text)]))),
    =32=            Tr(td({ -colspan => 2 },
    =33=                  checkbox(-name => 'strip_resource_fork',
    =34=                           -label => 'Strip Macbinary Resource Fork'))),
    =35=            Tr(td({ -colspan => 2 }, submit))),
    =36=      end_multipart_form,
    =37=      hr;
    =38=
    =39=    if (@params) {
    =40=      require MIME::Lite;
    =41=      if (my $file = upload('uploaded_file')) {
    =42=        my $info = uploadInfo($file) or die "info?";
    =43=        my $msg = MIME::Lite->new
    =44=          (Type => 'multipart/mixed',
    =45=           From => $FROM, To => $TO, Subject => $SUBJECT);
    =46=        if ($INCLUDE_META) {
    =47=          $msg->attach
    =48=            (Type => 'TEXT', Encoding => '7bit',
    =49=             Data => [
    =50=                      "Upload info:\n",
    =51=                      (map { "$_ => $info->{$_}\n" } sort keys %$info),
    =52=                      "ENV:\n",
    =53=                      (map { "$_ => $ENV{$_}\n" } sort keys %ENV),
    =54=                     ],
    =55=            );
    =56=        }
    =57=        $msg->attach
    =58=          ((param('type') eq 'text' ?
    =59=            (Type => 'TEXT', Encoding => 'quoted-printable') :
    =60=            (Type => 'BINARY', Encoding => 'base64')),
    =61=           ((param('strip_resource_fork') &&
    =62=             $info->{"Content-Type"} eq "application/x-macbinary") ?
    =63=            (Data => strip_fork_from_fh($file)) :
    =64=            (FH => $file)),
    =65=          );
    =66=        if ($msg->send_by_smtp('localhost')) {
    =67=          print p("Upload sent by email.");
    =68=        } else {
    =69=          print
    =70=            p("An error occurred... here's what would have been sent:"),
    =71=            pre($msg->as_string);
    =72=        }
    =73=      }
    =74=    }
    =75=    print end_html;
    =76=
    =77=    sub strip_fork_from_fh {
    =78=      my $fh = shift;
    =79=
    =80=      my $len = read $fh, (my $buf), 128; # read the header
    =81=      die "short read: $len" unless $len == 128;
    =82=      my $bytes = unpack("x83N", $buf); # get datafork length
    =83=      read $fh, $buf, $bytes;
    =84=      $buf;
    =85=    }