Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted.This text has appeared in an edited form in Perl Journal magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Perl Journal Column 07 (Dec 2003)
[Suggested title: ``Blocking spam with Postfix and Amavis'']
Spam: what a mess. The wonderful Mail::SpamAssassin can catch most
of it, looking at various spammy things in headers and bodies of
messages, and even check external realtime block lists (RBLs) to help
determine if a particular piece of email is indeed a reasonable email
or simply someone getting a free ride to promote their own commercial
activity. This month, I'd like to look at a recent change I made
at the stonehenge.com mail server to help deal with the rapidly
increasing amount of spam on the net.
Mail for the stonehenge.com domain is handled by
blue.stonehenge.com, a server at a rack location provided by
Sprocket Data (sprocketdata.com). We're running OpenBSD and
Postfix, because I like to sleep at night and not worry about the
``hack of the week''.
Prior to the recent change, I had most of the mail for the
stonehenge.com domain (with a few notable exceptions) be delivered
to my personal merlyn account's mail, rewriting the destination to
use the ``plus'' extended address format provided by Postfix. I
accomplished this with an /etc/postfix/virtual_regexp entry that
looked something like:
/^stonehenge\.com$/
whatever
# other stonehenge.com rewrites are here
# final catch-all:
/^(.*)@stonehenge.com$/
merlyn+for-stonehenge+$1
I also included this line into /etc/postfix/main.cf:
virtual_maps = regexp:/etc/postfix/virtual_regexp
so that the virtual map was properly specified. As each
email came in, procmail would launch, and consult my .procmailrc
file, which looked something like:
LOGFILE=$HOME/.procmail.log
LOGABSTRACT=yes
## :0c
## $HOME/JustInCase/
:0w:
| Sortmail >>SORTMAIL.LOG 2>&1
LOG="... Sortmail failed, bouncing ...
"
EXITCODE=75
:0
/dev/null
The key here is that each mail would be piped to my Sortmail
program, but if anything went wrong, Postfix would retain the message
in its own queue for a subsequent delivery. This saved my bacon more
than once when I was editing Sortmail and forgot to check syntax
before writing it out.
Within my Sortmail program, I extracted the very first ``Delivered-To''
header, and then undid the transformation applied by the virtual
rewrite. This got me back to the original stonehenge.com address
that had been requested.
I then constructed a Mail::Internet and Mail::Audit object from
the incoming message:
my $mi = Mail::Internet->new(\*STDIN);
my $ma = Mail::Audit->new(data => [$mi->as_string =~ /(.*\n?)/g], noexit=> 1, log => '-', loglevel => 2);
I did this because although I started with Mail::Audit, I later
found out it lacked some of the header access functions that I needed,
so I had to punt and use Mail::Internet instead. Eventually, I
hope to eliminate Mail::Audit entirely, as I've found it to be too
funky and ugly for my needs.
After sorting through the delivery address to determine how the
message would be delivered or autoresponded, I started adding checks
using Mail::SpamAssassin to try not to autorespond to spam or
deliver spam into my significant inboxes. Anything addressed to me
personally that was spammish got dropped into my ``ube'' folder
(unsolicited bulk email). Anything that was not addressed to me got
dropped into my ``ubetrap'' file (as in a ``spamtrap'' address, but I
always pronounced this rhyming with ``boobytrap'').
For any message that required an autoresponse, I eventually hooked in a Template Toolkit-based response template, passing the Mail::Internet object, the Mail::Audit object, and the constructed reply headers. A typical template looks like:
[%
head.Subject = "Your recent message to $to\n";
INCLUDE normal_header;
%]
Why did you send a message to [% to %]?
(It's very possible given the current sorry state of Microsoft
so-called operating systems security that your address has been
forged. If so, please ignore me. Sorry.)
Randal L. Schwartz
postmaster@stonehenge.com
[% INCLUDE signature %]
This template handles ``bounces'' (addresses that aren't otherwise
assigned within stonehenge.com). I send out a human message
instead of a normal sendmail-like message because I need to know if
they really intended the message for some other domain that was
similar to stonehenge.com: it's amazing how many of those are out
there. Most people ignore sendmail-like messages, but they'll respond
in plain English to this letter.
There are other little parts to the mail system, but I hope you get the sense that it's a bunch of bailing wire and duct tape, because it is. And it evolved slowly over time, starting out initially as procmailrc targets, then evolving to use the Perl-based MailAgent, and then to this bizarre hodgepodge.
Initially, this system worked rather fine. I was dealing with about
300 to 500 pieces of email a day, sorting them into mailing list
folders, personal mail, autoresponse mail for our Perl Training
services and my legal case, and handling comp.lang.perl.announce
postings, and a few lightweight ``mailing list'' rebroadcasters. Each
incoming message triggered just two forks (procmail, then my Perl
sortmail), and then got delivered.
But around the summer of 2003, the various Microsoft virus mailers started hitting. They started sending mail from randomly selected addresses to many targets, carrying their DNA along to infect the next system on the list.
Right away, I noticed that I was getting a lot of ``your mail contains a virus'' mail, from well-intentioned anti-virus programs. Let me make this perfectly clear. If you write an anti-virus program, and your anti-virus program can recognize that the virus fakes the ``From'' line, do not send a response to that clearly faked ``from''. These ``your mail contains a virus'' mails are worse than the virus mails themselves, at least for me.
But also what I noticed was a steady increase in the MIRVs. I'm using
MIRV here in the ``multiple independently targetable reentry vehicles''
sense. An incoming spam letter would be addressed to a number of
stonehenge.com addresses, and delivered in one SMTP connection.
The trouble with MIRVs is that they get burst by the local Postfix, and delivered as separate messages (although sharing one Message ID) to separate invocations of procmail, and then to separate invocations of my Sortmail. And each Sortmail would eventually get around to wondering if this was spam, and would make all the regex matches against the mail and all the RBL checks out on the internet, and come to identical conclusions (usually ``yes, it's spam'').
So, each MIRV caused 20 new processes to fire up on my box in the space of about two seconds, beating up on a lot of memory and CPU as those complex regexen were dragged through the mail, and a lot of DNS net traffic to see about RBLs. Ugh. It was a nice design before MIRV spam, but clearly failing now.
But how to fix it? It wouldn't be enough to move to
spamc/spamd, which at least would remove the need to fork and
reload all of that SpamAssassin code on each mail, because we still
are asking the same question ten times because of the MIRVs.
But luckily, I recently stumbled across a Slashdot posting that
mentioned Amavis (``A Mail Virus Scanner''), found at
<http://www.amavis.org/>. Unlike mail-user-agent (MUA) tools like
spamc or simple Mail::SpamAssassin-based custom tools, I could
hook Amavis in at the mail-transfer-agent (MTA) level. Ahh! Before
the MIRV has burst! This looked very promising.
Even more promising is that Amavis is written in Perl, and uses
Spam::Assassin and Net::Server, both technologies with which I
was familiar. I figured that if I had any trouble with Amavis, my
Perl skills were probably sufficient to either reverse-engineer for
understanding or customize the needed features.
Although Amavis can be used as the normal port-25 listener on a server, I didn't want to remove the known reliability and flexibility of having Postfix be my port-25 listener. Luckily, in the installation instructions, I saw how to make Amavis work alongside Postfix, and followed the instructions rather directly.
First, I unpacked Amavis into /opt/amavisd/, and created an etc
and sbin directory alongside the now unpacked source directory. I
also created an amavis user, allowing the home directory to default
to /home/amavis.
Next, I copied amavisd to the sbin directory, and
amavisd.conf to the etc directory. I edited the amavisd.conf
file (putting it under RCS first) to reflect local preferences. Many
of settings were as recommended by the README.postfix file included
with the distribution.
First, I fixed $MYHOME to /home/amavis and $mydomain to
stonehenge.com. (I like that the config file is in Perl and not
some obscure config language.) Then I set $daemon_user and
$daemon_group both to amavis, as I had chosen, and pushed
$TEMPBASE into the tmp subdirectory.
Skimming down, I found the POSTFIX section, uncommenting the lines
for $forward_method and $notify_method there. Finally, I
uncommented the line that set @bypass_virus_checks_acl to a single
period. Since I didn't care about virus checks, and only about spam,
I wanted to keep Amavis single-minded.
I changed the $QUARANTINEDIR to be below /home/amavis for
simplicity. And finally, I commented the $sa_local_tests_only line,
causing SpamAssassin to also consider the RBL tests.
After making all these changes, I then proceeded with the testing as
indicated in the README.postfix file, double checking every step,
because my machine was handling live email. Ultimately, my
/etc/postfix/master.cf was altered to comment out three lines:
#AMAVIS# smtp inet n - - - - smtpd
#AMAVIS# pickup fifo n - - 60 1 pickup
#AMAVIS# cleanup unix n - - - 0 cleanup
replacing those with:
smtp inet n - - - - smtpd
-o cleanup_service_name=pre-cleanup
pickup fifo n - - 60 1 pickup
-o cleanup_service_name=pre-cleanup
cleanup unix n - - - 0 cleanup
-o mime_header_checks=
-o nested_header_checks=
-o body_checks=
-o header_checks=
And adding these new services as well:
pre-cleanup unix n - - - 0 cleanup
-o virtual_alias_maps=
-o canonical_maps=
-o sender_canonical_maps=
-o recipient_canonical_maps=
-o masquerade_domains=
smtp-amavis unix - - y - 2 lmtp
-o smtp_data_done_timeout=1200
-o disable_dns_lookups=yes
127.0.0.1:10025 inet n - y - - smtpd
-o content_filter=
-o local_recipient_maps=
-o relay_recipient_maps=
-o smtpd_restriction_classes=
-o smtpd_client_restrictions=
-o smtpd_helo_restrictions=
-o smtpd_sender_restrictions=
-o smtpd_recipient_restrictions=permit_mynetworks,reject
-o mynetworks=127.0.0.0/8
-o strict_rfc821_envelopes=yes
I also added the startup for amavisd to /etc/rc.local:
if [ -x /opt/amavisd/sbin/amavisd ]; then
echo -n ' amavis ';
sudo -u amavis /opt/amavisd/sbin/amavisd -c /opt/amavisd/etc/amavisd.co\
nf
fi;
But that's it. It's been working quite well for me, and my load average has been significantly less. Now the MIRVs get processed once instead of ten times, and all is well. Almost immediately after making this change, net connections on the box were more stable, and my system didn't suddenly spike and freeze up my Emacs session nearly as much as it formerly did. And I'm not dealing with anywhere near the volume of identical spams, so my personal mail volume is also lower.
So, consider adding Amavis to your MTA, and help fight your neverending spam battle in an efficient way. Until next time, enjoy!

