Spam Filtering Statistics from oreilly.com
I thought readers might enjoy this message that O'Reilly sys admin chief Bob Amen just wrote on our internal mailing list:
"Below is a summary of the incoming email to our gateway mail servers for all domains that we accept email for (there are 57 domains). This summary is for the last 7 days:
Our mail servers accepted 1,438,909 connections, attempting to deliver 1,677,649 messages. We rejected 1,629,900 messages and accepted only 47,749 messages. That's a ratio of 1:34 accepted to rejected messages!
Here is how the message rejections break down:
Bad HELO syntax: 393284
The order that the rules are listed above is the order in which the rules are tested on each message.
I hope you find this interesting. Consider also that this is all done with open source software running on two Linux machines. The MTA is Exim with SpamAssassin used for spam analysis and ClamAV for virus analysis. I spend less than an hour a week maintaining these two systems. That's a pretty good ROI.
I'm sure you have your own similar (or worse) statistics."
What a waste! (And thanks to all the developers and administrators who've made this problem much less intrusive to ordinary users, leaving it to people like Bob who have to shovel out the sh*t. Ordinary users still think spam is bad, but they don't really know just how bad....)
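For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the three message counts quoted in Bob's note. The variable names are made up for illustration; nothing here comes from the actual mail server tooling.

```python
# Message counts quoted in the post for the seven-day window.
attempted = 1_677_649
rejected = 1_629_900
accepted = 47_749

# The two outcomes should account for every attempted message.
assert accepted + rejected == attempted

# How many messages were rejected for each one that got through.
rejected_per_accepted = rejected / accepted

# Fraction of all attempted messages that were rejected.
rejected_share = rejected / attempted

print(f"rejected per accepted message: {rejected_per_accepted:.1f}")
print(f"share of messages rejected: {rejected_share:.1%}")
```

The counts add up, and the quotient rounds to about 34 rejected messages for each accepted one, which matches the 1:34 figure in the note.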
Comments: 18
Claus [19 June 2006 06:58 AM]
If you are rejecting messages based on SORBS you most likely have lots of false positives in your spam test and you're throwing out tons of ham with your spam.
The SORBS blacklisters are very, very eager to block, have obscure testing criteria and have a policy of demanding a ransom to delist IPs.
Among other things they often (if not always?) block all mail coming from GMail.
Ned Baldessin [19 June 2006 07:06 AM]
Blocking email that has no subject seems harsh. There are situations (quickly sending a link while on the phone for example) where a subject isn't necessary.
adamsj [19 June 2006 07:25 AM]
Something is very, very wrong with how this post displays. Tim, you might want to check how you coded your post.
Marc Hedlund [19 June 2006 07:58 AM]
adamsj,
Boy, were you right. Thanks for the comments; I've made some fixes.
Bob Aman [19 June 2006 08:16 AM]
I've had a lot of trouble with both SORBS and SpamCop blocking my perfectly legit mail as well.
Justin Mason [19 June 2006 09:28 AM]
ick; I'll double up on those "watch out for SORBS FPs" comments. IMO, it's not something that should be used for front-line binary accept/reject decisions; instead, leave that to a scoring-based fuzzy filter like our own SpamAssassin -- good to see you're using it. ;)
The Spamhaus lists are always reliable, however...
Matt Riffle [19 June 2006 10:15 AM]
I'll go ahead and pile on -- I've seen enough problems with SORBS that I couldn't recommend actually rejecting mail at SMTP time because of it. Using it as part of SpamAssassin's suite of such checks is about as far as I'd go.
-Matt
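The difference Justin and Matt describe, between rejecting outright on a blacklist hit and treating the hit as one input to a score, can be sketched roughly as follows. This is an illustration only: the signal names, weights, and threshold are invented for the example and are not SpamAssassin's actual rules or scores.

```python
def binary_decision(dnsbl_listed: bool) -> str:
    """Front-line policy: a single blacklist hit rejects the message."""
    return "reject" if dnsbl_listed else "accept"


def scored_decision(signals: dict, weights: dict, threshold: float) -> str:
    """Scoring policy: add up the weights of the signals that fired."""
    score = sum(weights[name] for name, hit in signals.items() if hit)
    return "reject" if score >= threshold else "accept"


# Hypothetical weights and threshold, chosen only for the example.
WEIGHTS = {"dnsbl_listed": 2.0, "missing_subject": 1.5, "bad_helo": 3.0}
THRESHOLD = 5.0

# A legitimate message whose sending host happens to be blacklisted.
legit = {"dnsbl_listed": True, "missing_subject": False, "bad_helo": False}

# A message that trips every check.
spammy = {"dnsbl_listed": True, "missing_subject": True, "bad_helo": True}

print(binary_decision(legit["dnsbl_listed"]))         # rejected on the listing alone
print(scored_decision(legit, WEIGHTS, THRESHOLD))     # listing alone is not enough
print(scored_decision(spammy, WEIGHTS, THRESHOLD))    # several signals together are
```

With these made-up numbers a blacklist false positive costs the sender two points instead of the whole message, which is the trade-off the commenters are arguing for.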
Martin [19 June 2006 10:23 AM]
A quick me too on SORBS. As far as I know several large German free mail providers are blocked in SORBS, for no reason they can fix.
Bob Amen [19 June 2006 02:49 PM]
So far in the three or so years we've been using SORBS I've only had one false positive. I know a lot of people don't like SORBS but I've had very good results with their black list. Every time I've investigated a listing, it's always been right on. Also, we don't use their more controversial lists. We only use the dul, zombie and nomail lists:
dul.dnsbl.sorbs.net - Dynamic IP Address ranges (NOT a Dial Up list!)
nomail.rhsbl.sorbs.net - List of domain names where the owners have indicated no email should ever originate from these domains.
zombie.dnsbl.sorbs.net - List of networks hijacked from their original owners, some of which have already been used for spamming.
There are a lot of other SORBS lists that may have questionable value, such as the spam DNSBL. I believe the commenters are referring to those lists and I agree with their assessment. No email from gmail is ever blocked by our servers...unless it has a high SpamAssassin score or contains a virus.
Cheers,
Bob
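For readers unfamiliar with how an IP-based list such as dul.dnsbl.sorbs.net is consulted: the client reverses the octets of the connecting IPv4 address, appends the list's zone, and looks that name up in DNS. An address record in the answer means the address is listed; a "no such name" answer means it is not. Here is a minimal sketch of the name construction only. The function name and the sample address (taken from a documentation range) are assumptions for illustration, and no DNS query is actually sent.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# 192.0.2.99 is a documentation-only address, used here as a placeholder.
print(dnsbl_query_name("192.0.2.99", "dul.dnsbl.sorbs.net"))
```

This covers only the IP-based zones. A domain-based list like nomail.rhsbl.sorbs.net is keyed on the sender's domain name rather than on a reversed address.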
Geoff Butterfield [19 June 2006 04:04 PM]
High spam levels for sure, but how does this report compare to previous weeks? Our email server has been hit by a massive directory harvest attack ( it's lasted about 8 days, all from distributed IP addresses ) and other spam sources seem to be more active as well.
Bob Aman [19 June 2006 06:05 PM]
Note to self: Do not attempt to get a job at O'Reilly. The name confusion would be horrible.
GRex [20 June 2006 07:15 AM]
Really, are we witnessing the impending death of email? With emails getting less and less credible (having your legit mail reaching spam folder), will email still be relied on the way it is now 5 years later?
Claus [20 June 2006 08:11 AM]
Good point on the differences between the different SORBS lists. However, the 'ransom' policy and the lack of free retests make them less attractive policy-wise, even if the lists actually work statistically.
Bob Amen [20 June 2006 11:12 AM]
Geoff:
I don't have detailed statistics for previous weeks but I do have a fairly long term view of the CPU usage and it hasn't jumped recently. Just a slow increase over the last year.
If you're having problems with a large dictionary attack you could try what we do. If the sending mail server attempts to send to three invalid addresses, we drop the connection. They may try again, but I haven't seen that happen. Usually they just go away.
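The three-strikes policy Bob describes can be modelled in a few lines. This is a sketch of the logic only, not his Exim configuration: the class, method names, and addresses are hypothetical, and a real mail server would enforce this in its own recipient checks.

```python
class SmtpSession:
    """Tracks invalid recipients for one SMTP connection (illustrative only)."""

    def __init__(self, valid_recipients, max_invalid=3):
        self.valid_recipients = set(valid_recipients)
        self.max_invalid = max_invalid  # three, per the policy described above
        self.invalid_count = 0
        self.dropped = False

    def rcpt(self, address):
        """Return 'accept', 'reject', or 'drop' for one recipient attempt."""
        if self.dropped:
            return "drop"
        if address in self.valid_recipients:
            return "accept"
        self.invalid_count += 1
        if self.invalid_count >= self.max_invalid:
            # Third unknown address on this connection: hang up.
            self.dropped = True
            return "drop"
        return "reject"


session = SmtpSession({"bob@example.com"})
attempts = ["aaa@example.com", "bob@example.com", "aab@example.com",
            "aac@example.com", "aad@example.com"]
results = [session.rcpt(a) for a in attempts]
print(results)
```

Note that the counter lives on a single connection, so a sender that spreads its guesses over many connections or addresses could stay under the limit on each one.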
Geoff Butterfield [20 June 2006 12:46 PM]
Bob:
Good tip regarding directory harvest attacks, I use a similar policy, which makes this current attack so interesting. It's coming from multiple IP addresses - a DDHA? (Distributed Directory Harvest Attack). Maybe it's just us getting more spam these days...
dave cormier [21 June 2006 08:37 AM]
well... their spam filter has successfully blocked every email i tried to send to o'reilly during the first run of the web20 trademark issue. Seems like success to me!
Brian [22 August 2006 07:07 PM]
We too are getting slammed with DHA's. We have a good spam filter, except that it does not touch DHA's. I wonder if I could put Exim between my spam filter and email server and just use it to drop DHA's?
Steve Hersker [31 October 2006 08:18 AM]
We just cut over to Exim/SpamAssassin/ClamAV. The stats Bob posted are excellent - pardon the newbie question, but how did he gather that info? I've played with Eximstats, but am I missing something?
Thanks!