Reject Spam Rejection
The case for rejecting spam
The proponents of rejecting spam with an SMTP error code (5xx) while the
spammer is still connected would have you believe that this is the best
idea since sliced bread. Their argument goes something like this:
-
It immediately tells the sender that you've rejected their email.
When the sender is not a spammer, you want the sender to be notified
somehow that their message didn't get through and why.
-
Spammers sometimes use the rejections to clean up their address
lists. This means you will receive less spam in the long run.
-
It saves money.
-
Support costs are lower because users receiving email complain less if
they never see most spam.
-
By blocking spam at the network edge, your interior mail servers do not
need to be as powerful, because they're not handling as much mail (due
to the removal of most spam). In addition, servers will require less
disk space and archival and backup costs will also be lower.
-
Worker productivity. Workers spend less time dealing with spam and
are more likely to use their e-mail productively if they're not
worried about having to wade through a pile of crap before they can
get to their legitimate messages.
-
It helps you avoid legal issues relating to spam. There may be
legal issues relating to financial scams, money laundering, creation of
a hostile working environment, etc.
I'm not 100% opposed to rejecting spam, but the idea has fundamental
flaws.
General reasons
-
Bayes training data! All that spam is great training data for Bayesian
filtering, and it's specific to a particular user. Rejecting it throws
that data away. If the spam is high-scoring, systems like SpamAssassin
can automatically learn the message as spam and then better recognize
similar, but lower-scoring, spam.
-
Finally, SpamAssassin and the concept of spam filters get blamed for
misconfigured setups that reject mail in this way. (Well, I think
they are all misconfigured, but other misconfigurations can cause
additional false positives.)
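The auto-learning point above can be sketched roughly as follows. This is a toy illustration, not SpamAssassin's actual implementation: the two thresholds, the `autolearn_decision` function, and the `TrainingCorpus` class are all hypothetical names and values chosen for the example. The point is only that a message rejected at the edge never reaches `maybe_learn`, so the per-user token counts never grow.

```python
# Illustrative thresholds only; they are assumptions for this sketch and are
# not read from any real filter configuration.
LEARN_AS_SPAM_AT = 12.0
LEARN_AS_HAM_AT = 0.1


def autolearn_decision(score):
    """Return 'spam', 'ham', or 'none' for a message with the given score."""
    if score >= LEARN_AS_SPAM_AT:
        return "spam"
    if score <= LEARN_AS_HAM_AT:
        return "ham"
    # Borderline scores are too uncertain to train on automatically.
    return "none"


class TrainingCorpus:
    """Toy stand-in for a per-user Bayes database: it only counts tokens."""

    def __init__(self):
        self.spam_tokens = {}
        self.ham_tokens = {}

    def maybe_learn(self, body, score):
        """Train on the message if its score is decisive; return the decision."""
        decision = autolearn_decision(score)
        if decision == "none":
            return decision
        table = self.spam_tokens if decision == "spam" else self.ham_tokens
        for token in body.lower().split():
            table[token] = table.get(token, 0) + 1
        return decision
```

A message that was accepted and scored 25.0 would be counted as spam training data here; the same message rejected with a 5xx code contributes nothing.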
SMTP 5xx - good idea with worse flaws
-
When you reject with a 5xx code, the bounce message is generated by the
sending mail server, not by you. There is no guarantee that the sender
of a false positive will receive a bounce message, and even less of a
guarantee that they will understand what it means or why they received
it, especially if they are a less technical user.
-
Relying on the remote server to send a reject message gives you no
reliable way to provide an explanation that users can follow, nor any
way to provide a backdoor (like a password bypass, a web page, or other
contact information).
-
It relies on the faulty notions that spammers pay attention to error
codes, that spammers clean their address lists, and that spammers
compiling lists are the same people sending you spam.
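To make the limitation in the points above concrete, here is a minimal sketch of everything a rejecting server gets to say: a reply code plus a few lines of text, formatted the way SMTP multi-line replies are written (every line but the last uses `code-`, the last uses `code` followed by a space). The function name `format_smtp_reply` and the contact address are hypothetical. What happens to this text afterwards is up to the sending server, which may pass it along, bury it in a generic bounce, or drop it.

```python
from typing import Sequence


def format_smtp_reply(code: int, lines: Sequence[str]) -> str:
    """Format an SMTP reply, using the multi-line form when needed."""
    if not 200 <= code <= 599:
        raise ValueError("SMTP reply codes are three digits from 200 to 599")
    if not lines:
        raise ValueError("a reply needs at least one line of text")
    out = []
    last = len(lines) - 1
    for index, text in enumerate(lines):
        # A hyphen after the code means "more lines follow"; a space ends the reply.
        separator = " " if index == last else "-"
        out.append(f"{code}{separator}{text}\r\n")
    return "".join(out)


# Hypothetical rejection text; the address is a placeholder.
rejection = format_smtp_reply(
    550,
    ["Message rejected as spam", "Contact postmaster@example.invalid"],
)
```

That string is the entire explanation channel available at SMTP time, and you have no control over how, or whether, it is shown to the person who wrote the message.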
This is the one use of spam filters that causes the most flak and
negative PR. Based on the magnitude of the flak resulting from
rejection vs. tagging or foldering, it's hard to imagine how the
total cost is lower.
Post-SMTP rejection - more worser
So, you want to send a friendly spam rejection that's easily readable.
The Catch-22 is that you can make it readable, but you can't really be
sure you're sending it to the right person.
In theory, a good way to reject spam is to accept the message at SMTP
time and reject using a user-friendly customized reply with a full
explanation and bypass mechanism of some sort. Why doesn't that work?
You have to be very careful about replying to the From: address. It is
supplied by the sender, so it can't be trusted, and when it is forged
you'll just end up annoying innocent people. In theory, checking the
sending IP against the domain, using SPF, or using another sender
verification method can fix this, but practically everyone gets this wrong.
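As a rough sketch of the kind of check this implies, the function below only allows a rejection notice when an SPF check on the envelope sender passed and the From: header is in the same domain. It assumes the SPF result string (`"pass"`, `"softfail"`, and so on) was produced by some separate verifier that is not shown; `safe_to_notify` and `domain_of` are hypothetical names. Note that SPF validates the envelope sender, not the From: header, which is one reason this is easy to get wrong.

```python
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lowercased domain of an address, or '' if there is none."""
    _, addr = parseaddr(address)
    if "@" not in addr:
        return ""
    return addr.rsplit("@", 1)[1].lower()


def safe_to_notify(from_header: str, envelope_sender: str, spf_result: str) -> bool:
    """Decide whether a rejection notice to the From: address is defensible.

    spf_result is assumed to come from an external SPF check of the
    envelope sender against the connecting IP address.
    """
    if spf_result.lower() != "pass":
        return False
    envelope_domain = domain_of(envelope_sender)
    # An empty envelope sender (a bounce) must never be answered.
    if envelope_domain == "":
        return False
    # Only trust From: when it matches the domain that SPF actually verified.
    return envelope_domain == domain_of(from_header)
```

Even when this returns True, it says only that the domain authorized the sending host, not that the person named in From: wrote the message.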
I think most of the desire to reject at SMTP time comes from a gut
reaction to spam. It's what I would call the "I won't even let it
inside my network! Hehehe!" approach. However, SMTP was not designed
for this. It's a misuse of the protocol with a seriously negative
impact on usability. Maybe this can be addressed somehow in the future.
Better approaches
None of this means that all SMTP-based approaches to combating spam
are invalid. Some SMTP-based approaches are okay, I think:
-
teergrubing-style slowdowns on suspected spam (without rejection)
-
local quarantine of questionable messages, retesting them later when
additional data (Bayes, distributed systems, etc.) is available.
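The quarantine idea can be sketched as below. This is a minimal illustration under stated assumptions: the score cut-offs, the `Quarantine` class, and the `rescore` callback are all hypothetical, and a real system would store messages on disk and retest on a schedule. Messages that are clearly fine are delivered, clearly spammy ones are tagged (not rejected), and only the uncertain middle is held for a later retest.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical cut-offs, chosen only for this example.
SPAM_AT = 5.0
HAM_BELOW = 2.0


def classify(score: float) -> str:
    """Map a filter score to 'deliver', 'tag', or 'uncertain'."""
    if score >= SPAM_AT:
        return "tag"
    if score < HAM_BELOW:
        return "deliver"
    return "uncertain"


@dataclass
class Quarantine:
    held: List[str] = field(default_factory=list)

    def receive(self, message: str, score: float) -> str:
        """Deliver, tag, or hold a newly accepted message."""
        verdict = classify(score)
        if verdict == "uncertain":
            self.held.append(message)
            return "quarantined"
        return verdict

    def retest(self, rescore: Callable[[str], float]) -> List[Tuple[str, str]]:
        """Rescore held messages; return those that now have a clear verdict."""
        resolved, still_held = [], []
        for message in self.held:
            verdict = classify(rescore(message))
            if verdict == "uncertain":
                still_held.append(message)
            else:
                resolved.append((message, verdict))
        self.held = still_held
        return resolved
```

Nothing here is ever bounced or rejected, so a false positive costs a delay instead of a lost message.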
Other flawed ideas
-
Issuing a temporary failure via a 4xx code. This is flawed because you
can't trust the sending MTA to resend.