Copyright Notice
This text is copyright by CMP Media, LLC, and is used with their permission. Further distribution or use is not permitted. This text has appeared in an edited form in Perl Journal magazine. However, the version you are reading here is as the author originally submitted the article for publication, not after their editors applied their creativity.
Please read all the information in the table of contents before using this article.
Perl Journal Column 07 (Dec 2003)
[Suggested title: ``Blocking spam with Postfix and Amavis'']
Spam: what a mess. The wonderful Mail::SpamAssassin can catch most
of it, looking at various spammy things in headers and bodies of
messages, and even check external realtime block lists (RBLs) to help
determine if a particular piece of email is indeed a reasonable email
or simply someone getting a free ride to promote their own commercial
activity. This month, I'd like to look at a recent change I made
at the stonehenge.com mail server to help deal with the rapidly
increasing amount of spam on the net.
Mail for the stonehenge.com domain is handled by
blue.stonehenge.com, a server at a rack location provided by
Sprocket Data (sprocketdata.com). We're running OpenBSD and
Postfix, because I like to sleep at night and not worry about the
``hack of the week''.
Prior to the recent change, I had most of the mail for the
stonehenge.com domain (with a few notable exceptions) delivered
to my personal merlyn account, rewriting the destination to
use the ``plus'' extended address format provided by Postfix. I
accomplished this with an /etc/postfix/virtual_regexp entry that
looked something like:
    /^stonehenge\.com$/ whatever
    # other stonehenge.com rewrites are here
    # final catch-all:
    /^(.*)@stonehenge.com$/ merlyn+for-stonehenge+$1
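To make the catch-all concrete, here is a minimal Perl sketch of what that last rule does to an address. It only mimics the table lookup in Perl; Postfix applies its regexp tables itself, and the `rewrite_address` helper and the rule list are illustrative names, not part of Postfix or of the original setup.

```perl
use strict;
use warnings;

# Ordered (pattern, builder) pairs, mimicking a regexp: lookup table.
# Only the catch-all from the entry above is shown; more specific
# rewrites would be listed before it. The dot is escaped here for
# strictness; the unescaped form matches the same addresses and more.
my @virtual_regexp = (
    [ qr/^(.*)\@stonehenge\.com$/, sub { "merlyn+for-stonehenge+$_[0]" } ],
);

# Return the rewritten address for the first matching pattern,
# or undef when nothing matches (the address is left alone).
sub rewrite_address {
    my ($address) = @_;
    for my $rule (@virtual_regexp) {
        my ($pattern, $build) = @$rule;
        if (my @captures = $address =~ $pattern) {
            return $build->(@captures);
        }
    }
    return;
}

print rewrite_address('fred@stonehenge.com'), "\n";
# prints "merlyn+for-stonehenge+fred"
```

Because the first matching rule wins, the catch-all has to come last, which is why it sits at the bottom of the file.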
I also included this line in /etc/postfix/main.cf:
    virtual_maps = regexp:/etc/postfix/virtual_regexp
so that the virtual map was properly specified. As each email came in,
procmail would launch and consult my .procmailrc file, which looked
something like:
    LOGFILE=$HOME/.procmail.log
    LOGABSTRACT=yes
    ## :0c
    ## $HOME/JustInCase/
    :0w:
    | Sortmail >>SORTMAIL.LOG 2>&1
    LOG="... Sortmail failed, bouncing ... "
    EXITCODE=75
    :0
    /dev/null
The key here is that each mail would be piped to my Sortmail
program, but if anything went wrong, Postfix would retain the message
in its own queue for a subsequent delivery. This saved my bacon more
than once when I was editing Sortmail and forgot to check syntax
before writing it out.
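The mechanism rests on exit status 75, which is EX_TEMPFAIL in sysexits.h: a delivery agent that exits with it asks the MTA to keep the message and try again later. A small sketch of that decision, written in Perl rather than procmail syntax (the `delivery_status` helper is a hypothetical name, and the two one-liners merely stand in for a working and a broken Sortmail):

```perl
use strict;
use warnings;

# EX_TEMPFAIL from sysexits.h: "temporary failure, please retry later".
use constant EX_TEMPFAIL => 75;

# Run a filter command and decide what exit status the delivery agent
# should report: 0 if the filter succeeded, EX_TEMPFAIL otherwise.
sub delivery_status {
    my (@command) = @_;
    my $status = system(@command);    # 0 only when the child exited cleanly
    return $status == 0 ? 0 : EX_TEMPFAIL;
}

# $^X is the path of the running perl binary.
print delivery_status($^X, '-e', 'exit 0'), "\n";    # prints "0"
print delivery_status($^X, '-e', 'exit 2'), "\n";    # prints "75"
```

A Perl script that fails to compile also exits non-zero, so a syntax error in the filter lands in the same retry path.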
Within my Sortmail program, I extracted the very first ``Delivered-To''
header, and then undid the transformation applied by the virtual
rewrite. This got me back to the original stonehenge.com address
that had been requested.
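The undo step can be sketched with plain regular expressions. This is not the Sortmail code itself: it assumes the header carries the rewritten address in the form merlyn+for-stonehenge+LOCALPART@some.host, and `original_recipient` is an illustrative name.

```perl
use strict;
use warnings;

# Find the first Delivered-To header in a raw message and map the
# "plus" address back to the stonehenge.com address it came from.
# Returns undef if the header is absent or not in the rewritten form.
sub original_recipient {
    my ($raw_message) = @_;

    # Headers end at the first blank line.
    my ($headers) = split /\r?\n\r?\n/, $raw_message, 2;
    return unless defined $headers;

    # /m lets ^ match at the start of each header line; first match wins.
    my ($delivered_to) = $headers =~ /^Delivered-To:[ \t]*(\S+)/mi
        or return;

    # Assumed shape: merlyn+for-stonehenge+LOCALPART@some.host
    my ($local_part) =
        $delivered_to =~ /^merlyn\+for-stonehenge\+(.*)\@[^\@]+$/
        or return;

    return "$local_part\@stonehenge.com";
}

my $message = <<'END';
Delivered-To: merlyn+for-stonehenge+fred@blue.stonehenge.com
Delivered-To: someone-else@example.com
Subject: hello

body text
END

print original_recipient($message), "\n";    # prints "fred@stonehenge.com"
```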
I then constructed a Mail::Internet and Mail::Audit object from
the incoming message:
    my $mi = Mail::Internet->new(\*STDIN);
    my $ma = Mail::Audit->new(data => [$mi->as_string =~ /(.*\n?)/g],
                              noexit => 1, log => '-', loglevel => 2);
I did this because although I started with Mail::Audit, I later
found out it lacked some of the header access functions that I needed,
so I had to punt and use Mail::Internet instead. Eventually, I
hope to eliminate Mail::Audit entirely, as I've found it to be too
funky and ugly for my needs.
After sorting through the delivery address to determine how the
message would be delivered or autoresponded, I started adding checks
using Mail::SpamAssassin to try not to autorespond to spam or
deliver spam into my significant inboxes. Anything addressed to me
personally that was spammish got dropped into my ``ube'' folder
(unsolicited bulk email). Anything that was not addressed to me got
dropped into my ``ubetrap'' file (as in a ``spamtrap'' address, but I
always pronounced this rhyming with ``boobytrap'').
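The routing rule in that paragraph is small enough to write down. In this sketch the spam verdict is passed in as a flag (in the setup described it would come from Mail::SpamAssassin), and `choose_folder` and its arguments are hypothetical names:

```perl
use strict;
use warnings;

# Decide where a message goes once the spam verdict is known.
#   $is_spam            true if the spam check flagged the message
#   $addressed_to_me    true if the recipient is the owner's own address
#   $normal_destination whatever the ordinary sorting rules selected
sub choose_folder {
    my ($is_spam, $addressed_to_me, $normal_destination) = @_;
    return $normal_destination unless $is_spam;
    return $addressed_to_me ? 'ube' : 'ubetrap';
}

print choose_folder(1, 1, 'inbox'), "\n";          # prints "ube"
print choose_folder(1, 0, 'autorespond'), "\n";    # prints "ubetrap"
print choose_folder(0, 0, 'autorespond'), "\n";    # prints "autorespond"
```

Checking the verdict first means a spammish message never reaches the autoresponder, whatever address it was sent to.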
For any message that required an autoresponse, I eventually hooked in a Template Toolkit-based response template, passing the Mail::Internet object, the Mail::Audit object, and the constructed reply headers. A typical template looks like:
    [%
      head.Subject = "Your recent message to $to\n";
      INCLUDE normal_header;
    %]
    Why did you send a message to [% to %]?

    (It's very possible given the current sorry state of Microsoft
    so-called operating systems security that your address has been
    forged. If so, please ignore me. Sorry.)

    Randal L. Schwartz
    postmaster@stonehenge.com
    [% INCLUDE signature %]
This template handles ``bounces'' (addresses that aren't otherwise
assigned within stonehenge.com). I send out a human message
instead of a normal sendmail-like message because I need to know if
they really intended the message for some other domain that was
similar to stonehenge.com: it's amazing how many of those are out
there. Most people ignore sendmail-like messages, but they'll respond
in plain English to this letter.
There are other little parts to the mail system, but I hope you get the sense that it's a bunch of baling wire and duct tape, because it is. And it evolved slowly over time, starting out initially as procmailrc targets, then evolving to use the Perl-based MailAgent, and then to this bizarre hodgepodge.
Initially, this system worked just fine. I was dealing with about
300 to 500 pieces of email a day, sorting them into mailing list
folders, personal mail, autoresponse mail for our Perl Training
services and my legal case, and handling comp.lang.perl.announce
postings, and a few lightweight ``mailing list'' rebroadcasters. Each
incoming message triggered just two forks (procmail, then my Perl
Sortmail), and then got delivered.
But around the summer of 2003, the various Microsoft virus mailers started hitting. They started sending mail from randomly selected addresses to many targets, carrying their DNA along to infect the next system on the list.
Right away, I noticed that I was getting a lot of ``your mail contains a virus'' mail, from well-intentioned anti-virus programs. Let me make this perfectly clear. If you write an anti-virus program, and your anti-virus program can recognize that the virus fakes the ``From'' line, do not send a response to that clearly faked ``from''. These ``your mail contains a virus'' mails are worse than the virus mails themselves, at least for me.
But also what I noticed was a steady increase in the MIRVs. I'm using
MIRV here in the ``multiple independently targetable reentry vehicles''
sense. An incoming spam letter would be addressed to a number of
stonehenge.com addresses, and delivered in one SMTP connection.
The trouble with MIRVs is that they get burst by the local Postfix, and delivered as separate messages (although sharing one Message ID) to separate invocations of procmail, and then to separate invocations of my Sortmail. And each Sortmail would eventually get around to wondering if this was spam, and would make all the regex matches against the mail and all the RBL checks out on the internet, and come to identical conclusions (usually ``yes, it's spam'').
So, each MIRV addressed to ten of my addresses caused 20 new processes (a procmail and a Sortmail for each recipient) to fire up on my box in the space of about two seconds, beating up on a lot of memory and CPU as those complex regexen were dragged through the mail, and a lot of DNS net traffic to see about RBLs. Ugh. It was a nice design before MIRV spam, but clearly failing now.
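The arithmetic behind those numbers, as a small sketch. It assumes a MIRV with ten local recipients and the two forks per delivery (procmail, then Sortmail) described earlier; the function names are illustrative only.

```perl
use strict;
use warnings;

# Each message is described only by its number of local recipients.

# Checking after the burst: one spam check per delivered copy.
sub scans_per_delivery {
    my (@recipient_counts) = @_;
    my $total = 0;
    $total += $_ for @recipient_counts;
    return $total;
}

# Checking before the burst: one spam check per message.
sub scans_per_message {
    my (@recipient_counts) = @_;
    return scalar @recipient_counts;
}

# Two forks (procmail, then Sortmail) for every delivered copy.
sub processes_per_delivery {
    my (@recipient_counts) = @_;
    return 2 * scans_per_delivery(@recipient_counts);
}

my @mirv = (10);    # one spam addressed to ten local addresses
printf "checks after the burst:  %d\n", scans_per_delivery(@mirv);      # 10
printf "processes started:       %d\n", processes_per_delivery(@mirv);  # 20
printf "checks before the burst: %d\n", scans_per_message(@mirv);       # 1
```

The nine extra checks repeat the same regex matches and the same RBL lookups, so all of that work is redundant.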
But how to fix it? It wouldn't be enough to move to spamc/spamd,
which at least would remove the need to fork and reload all of that
SpamAssassin code on each mail, because we'd still be asking the same
question ten times for each MIRV.
But luckily, I recently stumbled across a Slashdot posting that
mentioned Amavis (``A Mail Virus Scanner''), found at
<http://www.amavis.org/>. Unlike mail-delivery-agent (MDA) tools like
spamc or simple Mail::SpamAssassin-based custom tools, I could
hook Amavis in at the mail-transfer-agent (MTA) level. Ahh! Before
the MIRV has burst! This looked very promising.
Even more promising is that Amavis is written in Perl, and uses
Mail::SpamAssassin and Net::Server, both technologies with which I
was familiar. I figured that if I had any trouble with Amavis, my
Perl skills were probably sufficient to either reverse-engineer for
understanding or customize the needed features.
Although Amavis can be used as the normal port-25 listener on a server, I didn't want to remove the known reliability and flexibility of having Postfix be my port-25 listener. Luckily, in the installation instructions, I saw how to make Amavis work alongside Postfix, and followed the instructions rather directly.
First, I unpacked Amavis into /opt/amavisd/, and created an etc
and sbin directory alongside the now unpacked source directory. I
also created an amavis user, allowing the home directory to default
to /home/amavis.
Next, I copied amavisd to the sbin directory, and
amavisd.conf to the etc directory. I edited the amavisd.conf
file (putting it under RCS first) to reflect local preferences. Many
of the settings were as recommended by the README.postfix file included
with the distribution.
First, I fixed $MYHOME to /home/amavis and $mydomain to
stonehenge.com. (I like that the config file is in Perl and not
some obscure config language.) Then I set $daemon_user and
$daemon_group both to amavis, as I had chosen, and pushed
$TEMPBASE into the tmp subdirectory.
Skimming down, I found the POSTFIX section, uncommenting the lines
for $forward_method and $notify_method there. Finally, I
uncommented the line that set @bypass_virus_checks_acl to a single
period. Since I didn't care about virus checks, and only about spam,
I wanted to keep Amavis single-minded.
I changed the $QUARANTINEDIR to be below /home/amavis for
simplicity. And finally, I commented out the $sa_local_tests_only
line, causing SpamAssassin to also consider the RBL tests.
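Put together, the edits described above might look roughly like the following fragment of amavisd.conf. This is a sketch, not the actual file: the variable names come from the text, but the $forward_method value is assumed from the stock sample configuration and the 127.0.0.1:10025 listener used with Postfix, and the $QUARANTINEDIR path is hypothetical (the text only says it sits below /home/amavis).

```perl
# Fragment of amavisd.conf reflecting the choices described above.
$MYHOME   = '/home/amavis';
$mydomain = 'stonehenge.com';

$daemon_user  = 'amavis';
$daemon_group = 'amavis';

$TEMPBASE = "$MYHOME/tmp";    # scratch space in the tmp subdirectory

# POSTFIX section: hand checked mail back to Postfix over SMTP.
# Assumed value; the port must match the extra smtpd listener.
$forward_method = 'smtp:127.0.0.1:10025';
$notify_method  = $forward_method;

# A single period matches every recipient: no virus checks for anyone.
@bypass_virus_checks_acl = qw( . );

# Hypothetical location below the amavis home directory.
$QUARANTINEDIR = "$MYHOME/quarantine";

# Left commented out so SpamAssassin also runs its network (RBL) tests.
# $sa_local_tests_only = 1;
```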
After making all these changes, I then proceeded with the testing as
indicated in the README.postfix file, double checking every step,
because my machine was handling live email. Ultimately, my
/etc/postfix/master.cf was altered to comment out three lines:
    #AMAVIS# smtp      inet  n       -       -       -       -       smtpd
    #AMAVIS# pickup    fifo  n       -       -       60      1       pickup
    #AMAVIS# cleanup   unix  n       -       -       -       0       cleanup
replacing those with:
    smtp      inet  n       -       -       -       -       smtpd
        -o cleanup_service_name=pre-cleanup
    pickup    fifo  n       -       -       60      1       pickup
        -o cleanup_service_name=pre-cleanup
    cleanup   unix  n       -       -       -       0       cleanup
        -o mime_header_checks=
        -o nested_header_checks=
        -o body_checks=
        -o header_checks=
And adding these new services as well:
    pre-cleanup unix n      -       -       -       0       cleanup
        -o virtual_alias_maps=
        -o canonical_maps=
        -o sender_canonical_maps=
        -o recipient_canonical_maps=
        -o masquerade_domains=
    smtp-amavis unix -      -       y       -       2       lmtp
        -o smtp_data_done_timeout=1200
        -o disable_dns_lookups=yes
    127.0.0.1:10025 inet n  -       y       -       -       smtpd
        -o content_filter=
        -o local_recipient_maps=
        -o relay_recipient_maps=
        -o smtpd_restriction_classes=
        -o smtpd_client_restrictions=
        -o smtpd_helo_restrictions=
        -o smtpd_sender_restrictions=
        -o smtpd_recipient_restrictions=permit_mynetworks,reject
        -o mynetworks=127.0.0.0/8
        -o strict_rfc821_envelopes=yes
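On a live mail server it helps to confirm that a new listener such as 127.0.0.1:10025 actually answers before relying on it. Here is a minimal sketch using the core IO::Socket::INET module; it is not taken from README.postfix, the helper names are hypothetical, and it probes only the ports named on the command line.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# An SMTP or LMTP server opens the conversation with a "220" line.
sub is_ready_greeting {
    my ($line) = @_;
    return (defined $line && $line =~ /^220[ -]/) ? 1 : 0;
}

# Connect to 127.0.0.1 on the given port and return the greeting line,
# or undef if nothing is listening. The timeout covers the connect only.
sub read_greeting {
    my ($port) = @_;
    my $socket = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 5,
    ) or return;
    my $line = <$socket>;
    close $socket;
    return $line;
}

# Probe only what was asked for, e.g.: perl check_ports.pl 10025
for my $port (@ARGV) {
    my $greeting = read_greeting($port);
    printf "%s: %s\n", $port,
        is_ready_greeting($greeting) ? 'ready' : 'no greeting';
}
```

Run without arguments, the script makes no connections at all, so it is safe to keep around.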
I also added the startup for amavisd to /etc/rc.local:
    if [ -x /opt/amavisd/sbin/amavisd ]; then
      echo -n ' amavis '; sudo -u amavis /opt/amavisd/sbin/amavisd \
        -c /opt/amavisd/etc/amavisd.conf
    fi;
But that's it. It's been working quite well for me, and my load average has been significantly lower. Now the MIRVs get processed once instead of ten times, and all is well. Almost immediately after making this change, net connections on the box were more stable, and my system didn't suddenly spike and freeze up my Emacs session nearly as much as it formerly did. And I'm not dealing with anywhere near the volume of identical spams, so my personal mail volume is also lower.
So, consider adding Amavis to your MTA, and help fight your neverending spam battle in an efficient way. Until next time, enjoy!