Custom SpamAssassin Rules 2010-12-30
So in my previous post, I spilled the beans that I have started using SpamAssassin (SA) to do some filtering for me. I had been pretty reluctant to adopt SA at first, but I have come around to the idea. Here, I will show some of the custom rules I wrote to make my SA a little more aggressive.
To start off, SA has its own ideas about how to score messages... and by that I mean it has default scores for a whole shitload of checks. The only thing they expect you to configure is the threshold at which a message becomes spam. The first thing I do is change some of the default scores. For example, I increased the score of HTML messages from 0.1 to 0.3 - a subtle difference, but it is only meant to push messages over the top if they are already really close to being spam.
score HTML_MESSAGE 0.3
Now this is a pretty simple example. SA runs a test called HTML_MESSAGE by default (is it an HTML message?) and all this line does is tell it what score to use if the test is a hit. Other settings may require multiple lines... like this next one, where the first line designates "en" as an acceptable locale and the second line tells SA to add 2.0 to the message score if the message uses a character set that points to some other language.
ok_locales en
score CHARSET_FARAWAY 2.0
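To make the scoring idea concrete, here is a rough Python sketch of the arithmetic: every rule that hits contributes its score, and the total is compared against the threshold. It is only an illustration, not how SA is implemented. The function names and the OTHER_RULES entry are made up for the example, and the 5.0 threshold is an assumption (it is SA's stock required_score, yours may differ).

```python
# Illustration only: SA itself is written in Perl and does far more than this.
SCORES = {
    "HTML_MESSAGE": 0.3,      # raised from the default, as described above
    "CHARSET_FARAWAY": 2.0,
}


def total_score(hits, scores=SCORES):
    """Add up the scores of every rule that hit."""
    return sum(scores[name] for name in hits)


def is_spam(hits, threshold=5.0, scores=SCORES):
    """A message counts as spam once its total reaches the threshold."""
    return total_score(hits, scores) >= threshold


if __name__ == "__main__":
    # Hypothetical message that already collected 4.8 from other rules.
    nearly = {"OTHER_RULES": 4.8, "HTML_MESSAGE": 0.3}
    print(is_spam(["OTHER_RULES", "HTML_MESSAGE"], scores=nearly))
```

With HTML_MESSAGE at 0.1 that same hypothetical message would stay just under the line, which is the whole point of the small bump.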
Then there are some slightly more complicated custom rules. In this next example I wanted to add my own SURBL list to SA. I should begin by explaining that a SURBL, sometimes known as a URIBL, is a blacklist of websites that shouldn't be linked to inside a message. So when my mail server receives a message, it looks through the message for links to websites. If it finds any, it tries to resolve www.thewebsite.com.rbl.snork.ca, and if it gets an answer, it assumes that is a bad website (and adds 7.0 to the message score).
urirhsbl FU_SNORKBL rbl.snork.ca A
body FU_SNORKBL eval:check_uridnsbl('FU_SNORKBL')
tflags FU_SNORKBL net
score FU_SNORKBL 7.0
The first line gives the rule its name (FU_SNORKBL), tells SA to run the website links it finds in a message against rbl.snork.ca, and tells it to ask the RBL server for an "A" record (instead of a TXT record). I like to put "FU_" at the front of my custom rules so I know they are from me. :-) The second line is the one that essentially tells SA to look in the body of the message to find these links. The third line tells it "This is a network test, don't run it on mass checks or if the -L switch is used". And of course the last line tells it how much to score if the rule gets a hit. This website has a slightly better description of the syntax.
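For anyone who wants to see the lookup itself, here is a minimal Python sketch of the idea as described above: glue the link's hostname onto the blacklist zone and see whether DNS answers. The function names are made up for illustration, and the sketch takes the hostname as-is, which is an assumption. SA's own URI extraction and domain handling are more involved than this.

```python
import socket
from urllib.parse import urlparse


def uribl_query_name(url, zone="rbl.snork.ca"):
    """Build the DNS name to look up for a link found in a message."""
    host = urlparse(url).hostname  # e.g. "www.thewebsite.com"
    return f"{host}.{zone}"


def is_listed(url, zone="rbl.snork.ca"):
    """True if the blacklist zone answers with an A record for this link."""
    try:
        socket.gethostbyname(uribl_query_name(url, zone))
        return True
    except socket.gaierror:
        # The lookup failed (typically NXDOMAIN), so treat it as not listed.
        return False


if __name__ == "__main__":
    print(uribl_query_name("http://www.thewebsite.com/buy-now"))
```

Only uribl_query_name is safe to play with offline. is_listed actually hits the network, which is exactly why the real rule carries the "net" tflag.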
Here is a nice custom rule that scores a message quite high if there are links to domains that have been registered in the last 10 days.
urirhsbl FU_FRESH10 fresh10.spameatingmonkey.net A
body FU_FRESH10 eval:check_uridnsbl('FU_FRESH10')
tflags FU_FRESH10 net
score FU_FRESH10 4.9
Then there are some standard blacklists that I like to add because they are frequently the first to find spammers and list them.
header FU_BARRACUDA eval:check_rbl('barracudarbl', 'bb.barracudacentral.org.')
score FU_BARRACUDA 3.0
tflags FU_BARRACUDA net
header FU_ABUSEAT eval:check_rbl('abuseatrbl', 'cbl.abuseat.org.')
score FU_ABUSEAT 3.0
tflags FU_ABUSEAT net
header FU_SURRIEL eval:check_rbl('surrielrbl', 'psbl.surriel.com.')
score FU_SURRIEL 3.0
tflags FU_SURRIEL net
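These three are ordinary IP-based DNS blacklists, as opposed to the website-based ones earlier. The usual DNSBL convention is to reverse the octets of the sending IP and look that up under the blacklist zone. Here is a small Python sketch of just the name-building step. The function name is made up, and which IPs SA actually feeds into the lookup is left out entirely.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    # The rules above write the zone with a trailing dot; drop it for display.
    return ".".join(reversed(octets)) + "." + zone.rstrip(".")


if __name__ == "__main__":
    # 192.0.2.25 is a documentation-only address used as a stand-in.
    print(dnsbl_query_name("192.0.2.25", "cbl.abuseat.org."))
```

If that name resolves, the IP is on the list and the rule scores.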
Looking through my spam (yes, I examine my spam), I found that some messages contained really obvious words or phrases, so I added regular expressions to catch them.
body FU_PHARM /pharmacy/i
score FU_PHARM 3.5
body FU_ROLEX /rolex/i
score FU_ROLEX 3.5
body FU_VIAGRA /viagra/i
score FU_VIAGRA 3.5
body FU_RACKED /I just racked/i
score FU_RACKED 2.0
body FU_PULLIN /I pulled in/i
score FU_PULLIN 2.0
body FU_CAREER /career/i
score FU_CAREER 1.0
body FU_PFIZER /pfizer/i
score FU_PFIZER 4.5
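One thing worth knowing about patterns like these is that they match anywhere in the text, inside longer words included. The Python sketch below imitates three of the rules with the re module, which behaves like Perl's regex engine for patterns this simple. The body_hits helper is made up for illustration and is not an SA function.

```python
import re

# (pattern, score) pairs copied from the rules above; re.IGNORECASE plays the role of /i.
RULES = {
    "FU_PHARM": (re.compile(r"pharmacy", re.IGNORECASE), 3.5),
    "FU_RACKED": (re.compile(r"I just racked", re.IGNORECASE), 2.0),
    "FU_CAREER": (re.compile(r"career", re.IGNORECASE), 1.0),
}


def body_hits(text, rules=RULES):
    """Return the names of the rules whose pattern appears anywhere in text."""
    return [name for name, (pattern, _score) in rules.items() if pattern.search(text)]


if __name__ == "__main__":
    print(body_hits("Cheap PHARMACY deals, i just racked up $$$"))
    print(body_hits("See our Careers page"))  # substring match on "Careers"
```

That substring behaviour is why FU_CAREER only gets 1.0. Wrapping a pattern in \b word boundaries would narrow it if it ever hits too much legitimate mail.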
Here I add 2.0 to the message score if it is coming from a hotmail.com address. This may seem stupid at first, but you have to consider that if the message is coming from hotmail.com, it probably isn't going to trigger too many other rules, so this is safer than it might look.
header FU_HOTMAIL From =~ /hotmail\.com/
score FU_HOTMAIL 2.0
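Note that the pattern is not anchored and is case sensitive, so it looks at the whole From header, display name and all. The Python sketch below compares it with a tighter variant that only looks at the address part. The tighter pattern and both function names are hypothetical, just to show the difference.

```python
import re
from email.utils import parseaddr

LOOSE = re.compile(r"hotmail\.com")                     # same pattern as FU_HOTMAIL
STRICT = re.compile(r"@hotmail\.com$", re.IGNORECASE)   # hypothetical tighter variant


def loose_hit(from_header):
    """Match anywhere in the raw From header, like the rule above."""
    return bool(LOOSE.search(from_header))


def strict_hit(from_header):
    """Match only when the address itself ends in @hotmail.com."""
    _name, addr = parseaddr(from_header)
    return bool(STRICT.search(addr))


if __name__ == "__main__":
    header = '"hotmail.com support" <bob@example.org>'
    print(loose_hit(header), strict_hit(header))
```

SA's header rules can also be pointed at just the address with a From:addr style modifier, though it is worth checking the docs for your version before relying on it.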
I have seen comments from SA admins who say that you should NOT try to block messages that are supposedly being sent from an application called "The Bat!"... They claim that The Bat! is a legitimate email application, and they point out that SA has a built-in ability to detect forged headers that spoof The Bat! Frankly, I don't know anyone who uses The Bat! - and if you know anyone who does, you might want to suggest to them that better (and cheaper) email clients exist.
header FU_THEBAT X-Mailer =~ /^The Bat!/
score FU_THEBAT 3.5
Now here is one I really like... I was getting spam from hotmail.com addresses, and checking the headers I could see an X-Originating-IP header that Hotmail was adding. Many times the originating IP was in a country that I have never been to and, in some cases, never heard of. I was kind of disappointed that hMailServer would not check this X-Originating-IP header when doing DNSBL checks. Turns out, SA does. So these rules will score messages pretty high if they originate from countries on my list... I believe this works for Yahoo messages as well (and possibly other webmail hosts). The first rule (the one starting with a double underscore) only does the lookup and carries no score of its own; each check_rbl_sub rule after it tests for one particular answer.
header __FU_NERD_CHECK eval:check_rbl('nerddk','zz.countries.nerd.dk')
header FU_NERD_AF eval:check_rbl_sub('nerddk','127.0.0.4')
score FU_NERD_AF 4.5
header FU_NERD_AR eval:check_rbl_sub('nerddk','127.0.0.32')
score FU_NERD_AR 4.5
header FU_NERD_BR eval:check_rbl_sub('nerddk','127.0.0.76')
score FU_NERD_BR 4.5
header FU_NERD_CN eval:check_rbl_sub('nerddk','127.0.0.156')
score FU_NERD_CN 4.5
header FU_NERD_HK eval:check_rbl_sub('nerddk','127.0.1.88')
score FU_NERD_HK 4.5
header FU_NERD_ID eval:check_rbl_sub('nerddk','127.0.1.104')
score FU_NERD_ID 4.5
header FU_NERD_IN eval:check_rbl_sub('nerddk','127.0.1.100')
score FU_NERD_IN 4.5
header FU_NERD_KH eval:check_rbl_sub('nerddk','127.0.0.116')
score FU_NERD_KH 4.5
header FU_NERD_KP eval:check_rbl_sub('nerddk','127.0.1.152')
score FU_NERD_KP 4.5
header FU_NERD_KR eval:check_rbl_sub('nerddk','127.0.1.154')
score FU_NERD_KR 4.5
header FU_NERD_MV eval:check_rbl_sub('nerddk','127.0.1.206')
score FU_NERD_MV 4.5
header FU_NERD_MY eval:check_rbl_sub('nerddk','127.0.1.202')
score FU_NERD_MY 4.5
header FU_NERD_NG eval:check_rbl_sub('nerddk','127.0.2.54')
score FU_NERD_NG 4.5
header FU_NERD_PH eval:check_rbl_sub('nerddk','127.0.2.96')
score FU_NERD_PH 4.5
header FU_NERD_PK eval:check_rbl_sub('nerddk','127.0.2.74')
score FU_NERD_PK 4.5
header FU_NERD_RO eval:check_rbl_sub('nerddk','127.0.2.130')
score FU_NERD_RO 4.5
header FU_NERD_RU eval:check_rbl_sub('nerddk','127.0.2.131')
score FU_NERD_RU 4.5
header FU_NERD_SG eval:check_rbl_sub('nerddk','127.0.2.190')
score FU_NERD_SG 4.5
header FU_NERD_TW eval:check_rbl_sub('nerddk','127.0.0.158')
score FU_NERD_TW 4.5
If you are interested in this one you will want to check out this site for more details.
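The 127.0.x.y values in those rules are not random. As far as can be told from the list above, the last two octets carry the ISO 3166-1 numeric country code split into a high and a low byte (Hong Kong is 344, and 344 = 1 x 256 + 88, hence 127.0.1.88). Every rule above fits that pattern, but treat it as an assumption and check the list's own documentation before adding countries. A small Python sketch, with a made-up function name:

```python
# A few ISO 3166-1 numeric codes, enough to check against the rules above.
ISO_NUMERIC = {"AF": 4, "CN": 156, "HK": 344, "NG": 566, "RU": 643}


def nerd_dk_address(numeric_code):
    """Encode an ISO 3166-1 numeric country code as 127.0.x.y."""
    high, low = divmod(numeric_code, 256)
    return f"127.0.{high}.{low}"


if __name__ == "__main__":
    for country, code in ISO_NUMERIC.items():
        print(country, nerd_dk_address(code))
```

So adding another country should just be a matter of looking up its numeric code and doing the same split.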
Please post a comment if you need help with more custom rules... I may not be good at writing them, but I am angry enough to keep at it until they work.