David A Gift wrote:
> I'm not the expert on this and someone from the mail.msu team who is 
> needs to jump in, but it is my understanding that (a) we have been 
> using SpamAssassin on mail.msu since day-1,
I haven't been on the mail.msu.edu team since day one... however, I have 
worked here since '98 in some capacity and I recall there always being 
the option to filter spam since pilot.msu.edu became mail.msu.edu (if 
I'm wrong I'd ask that some of the original admins step in and correct 
me because most of them still work here).
> (b) spam filtering is turned off by default and most users have NOT 
> turned it on, 
I don't think we've run the numbers on this in quite a while, but it is 
true that spam filtering is turned off by default (this had to do with 
resources and policy questions).  The best I can recall is the 
percentage of people using our default SpamAssassin rules (not including 
Mail Filter Rules) is much lower than *I'd* like to see.  I wouldn't 
even want to make up a number at this point though.
> (c) our spam-threshold settings are somewhat conservative  --  i.e., 
> we tend toward more false-negatives (letting spam through) to avoid 
> too many false-positives (trapping content that people actually want 
> to receive).  In fact, SpamAssassin is only one of several layers of 
> spam filtering deployed in mail.msu.
We have our threshold set to a score of 5.0, and numerous factors 
contribute to the score.  For example, a newsletter from MLUI, the 
Michigan Land Use Institute (which consistently gets marked as spam), 
shows this as its scoring:

X-Spam-Report:
    * 2.7 DEAR_FRIEND BODY: Dear Friend? That's not very dear!
    * 1.5 HTML_IMAGE_ONLY_28 BODY: HTML: images with 2400-2800 bytes of words
    * 0.0 HTML_MESSAGE BODY: HTML included in message
    * 1.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts

Every single newsletter they send starts off with "Dear Friend".  Well, 
right away, boom!  That's 2.7 points toward your spam score.  The 
message was sent as HTML only, so there's another 1.7 points.  The 
embedded image in the HTML adds another 1.5 points.  That totals 5.9, 
which is over our 5.0 threshold.
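
To make the arithmetic concrete, here is a minimal Python sketch that 
adds up the rule scores from the report above and compares the total 
to our 5.0 threshold.  The variable names are mine, and the "greater 
than or equal to" comparison is my assumption about how the threshold 
is applied; this is not SpamAssassin's actual code.

```python
# Rule scores copied from the X-Spam-Report shown above.
report = {
    "DEAR_FRIEND": 2.7,
    "HTML_IMAGE_ONLY_28": 1.5,
    "HTML_MESSAGE": 0.0,
    "MIME_HTML_ONLY": 1.7,
}

THRESHOLD = 5.0  # the mail.msu threshold mentioned above

# Round to one decimal place to avoid floating-point noise in the sum.
total = round(sum(report.values()), 1)

# Assumption: a message is flagged when its total meets or exceeds
# the threshold.
is_spam = total >= THRESHOLD

print(f"total={total} threshold={THRESHOLD} spam={is_spam}")
```

Three fairly ordinary newsletter traits are enough to push a 
legitimate message over the line.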

Unfortunately, this isn't spam; it's a legit newsletter that someone 
on campus here was having trouble receiving.  The way around this was 
to set up a list of "Trusted Senders" in your MSU Webmail preferences.  
The Trusted Senders List had some small issues of its own, but they 
were not very common and are not worth mentioning here.

So I'd say that, yes, a score of 5.0 is fairly conservative considering 
how low a point value other attributes can be.  Let's say, for example, 
that we changed the threshold to 7.0.  I'd guess that you'd easily see 
a 10% increase in spam (for those who have filtering turned on).  You 
could lower the threshold to something like 2.5, but then we'd have to 
assign lower point values ourselves to things like "HTML ONLY BODY" in 
order to avoid so many false positives.
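
As a rough illustration of that trade-off, the Python sketch below 
runs the MLUI newsletter's rule scores against the three thresholds 
discussed here, then shows what a locally lowered rule score would do.  
The 0.5 override for MIME_HTML_ONLY is a made-up number for 
illustration, not a value we use, and the helper name is hypothetical.

```python
# Rule scores from the MLUI newsletter's X-Spam-Report.
scores = {
    "DEAR_FRIEND": 2.7,
    "HTML_IMAGE_ONLY_28": 1.5,
    "HTML_MESSAGE": 0.0,
    "MIME_HTML_ONLY": 1.7,
}


def total_score(rule_scores, overrides=None):
    """Sum rule scores, applying any locally overridden values."""
    merged = {**rule_scores, **(overrides or {})}
    return round(sum(merged.values()), 1)


# Same message, three candidate thresholds.
verdicts = {t: total_score(scores) >= t for t in (2.5, 5.0, 7.0)}
for threshold, flagged in verdicts.items():
    print(f"threshold {threshold}: flagged={flagged}")

# Hypothetical override: score HTML-only mail at 0.5 instead of 1.7.
adjusted = total_score(scores, {"MIME_HTML_ONLY": 0.5})
print(f"adjusted total: {adjusted}")
```

With the stock scores, the newsletter is flagged at 2.5 and 5.0 but 
delivered at 7.0; with the hypothetical override it drops below 5.0.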

We do have other methods in place for fighting spam.  We have our 
greylisting server in front of all inbound @msu.edu email, which has cut 
spam in half across the @msu.edu domain.  We have some extra definitions 
included in our antivirus software that help in fighting known phishing 
attacks and other (419) scams.  We have mail filters, which can be 
difficult to create and manage but do provide some additional filtering 
based on rules you provide.  And of course we have a Blocked Senders 
List for those annoying people who are sending threatening emails to 
you, and its opposite, the Trusted Senders List (don't be confused: the 
Blocked Senders List is typically not a method for fighting spam).

> Point C is important:  the broader the user base the more conservative 
> spam threshold settings need to be to make sure that people get the 
> mail they want to get.  The more local and narrowly-defined the user 
> base, the more specific the spam settings can become and the more 
> effective the filtering.  If one sets filtering just for oneself, it 
> could be made to work almost perfectly, but few other people would 
> accept the same definition of perfection.  I think this 
> spam-management trade-off issue is often missed when people compare 
> the relative effectiveness of spam filtering at different levels of 
> user scale and scope.
I don't have too much to add here other than to say I couldn't agree 
with you more on this point.  :-)
> - Dave
>

Hope all this information was useful; I will gladly try to answer any 
more questions.
./brm