

How to train SpamAssassin's Bayes filter?




Posted by ioiom63, 12-17-2008, 04:59 PM
I've been reading about SpamAssassin's Bayes filter. Apparently I need to forward examples of spam and ham to a specific email address, which SA then checks. Unfortunately, I doubt my average user will do this. Here's what I'd have in a perfect world: a webmail program and plugins for Thunderbird/Outlook/Outlook Express that forward emails to the appropriate place whenever the Junk/Not Junk buttons are pressed. So far, the only one I've found that does this is Zimbra. However, Zimbra requires an entire server, which is overkill for our small company. I realize Thunderbird already has a Bayes filter built in, but I would like to do all the spam filtering server-side for consistency. Is what I want possible? If not, can anyone link me to a good tutorial for writing Thunderbird extensions?
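For what it's worth, the server-side half of this may not need forwarding to a mailbox at all: SpamAssassin ships an `sa-learn` command that can be pointed at a file or directory of messages with `--spam` or `--ham`. Below is a minimal sketch, assuming mail is stored in Maildir format and that the client's Junk/Not Junk buttons move messages into IMAP folders. The folder names `.Junk` and `.NotJunk`, and the helper names, are hypothetical; adjust them to your own layout.

```python
import subprocess
from pathlib import Path

# Hypothetical folder names; adjust to whatever your IMAP server uses.
SPAM_FOLDER = ".Junk"
HAM_FOLDER = ".NotJunk"


def build_sa_learn_command(kind: str, folder: Path) -> list[str]:
    """Return the sa-learn argument list for training on one folder."""
    if kind not in ("spam", "ham"):
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    # sa-learn accepts --spam or --ham followed by a file or directory.
    return ["sa-learn", f"--{kind}", str(folder)]


def train_user(maildir: Path, dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) the training commands for one Maildir."""
    commands = []
    for kind, name in (("spam", SPAM_FOLDER), ("ham", HAM_FOLDER)):
        folder = maildir / name / "cur"
        if folder.is_dir():  # skip folders the user has never created
            commands.append(build_sa_learn_command(kind, folder))
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Something like `train_user(Path("/home/alice/Maildir"), dry_run=False)` could then run from cron under the account that owns the Bayes database. That way users only press the buttons they already know, and no client plugin is needed.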

Posted by LoganNZ, 12-17-2008, 07:04 PM
Can I ask, do you have a medium to large spam problem? Are you currently using RBLs (DNS blocklists) at your mail server? Doing more on the server side will reduce how often users need to intervene on spam. Regards, Logan
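In case RBLs are unfamiliar: a DNS blocklist is queried by reversing the octets of the connecting IPv4 address, appending the list's zone, and doing an ordinary DNS lookup. An answer means the address is listed. A rough sketch of that mechanism follows; the zone `dnsbl.example.org` is a placeholder, not a real list, and in practice the mail server or SpamAssassin does this for you.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the DNSBL zone resolves a record for this address.

    Note: a resolver failure is treated the same as "not listed" here.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, checking `192.0.2.99` against the placeholder zone looks up `99.2.0.192.dnsbl.example.org`.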

Posted by Memidex, 12-18-2008, 07:39 AM
I don't know about the technical feasibility of what you're asking, but one point to consider (if you haven't already) is that Bayes filtering can be less accurate and less reliable when it is trained by multiple users. What is spam for one user may not be spam for another (for example, an industry newsletter), so you may be more likely to get false positives. Algorithmically, it's possible to take multiple users into account by weighting each "vote" to reduce the likelihood of false positives. I don't know if Bayes packages already do this. Google certainly does this with Gmail. If feasible, it may be better and easier to set up a global filter based on a single user's training: that of an administrator. The administrator can then apply conservative training to avoid false positives. Individuals can then apply their own training on the client side to get a second level of filtering.
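To make the weighted "vote" idea concrete, here is one possible sketch. It is not something SpamAssassin does out of the box as far as I know; the `Vote` type, the weights, and the threshold are all made up for illustration. The idea is that a message is only fed to the global Bayes training when the weighted votes agree strongly enough, and is skipped otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vote:
    user: str
    is_spam: bool


def weighted_spam_score(votes: list[Vote], weights: dict[str, float],
                        default_weight: float = 1.0) -> float:
    """Return a score in [-1, 1]: +1 if all weight says spam, -1 if all ham."""
    total = 0.0
    signed = 0.0
    for vote in votes:
        w = weights.get(vote.user, default_weight)
        total += w
        signed += w if vote.is_spam else -w
    return signed / total if total else 0.0


def training_label(votes: list[Vote], weights: dict[str, float],
                   threshold: float = 0.5) -> Optional[str]:
    """Return 'spam' or 'ham' for global training, or None if users disagree."""
    score = weighted_spam_score(votes, weights)
    if score >= threshold:
        return "spam"
    if score <= -threshold:
        return "ham"
    return None  # contested message: leave it out of the shared training set


# Hypothetical example: the administrator's vote counts three times as much.
votes = [Vote("admin", True), Vote("alice", True), Vote("bob", False)]
label = training_label(votes, {"admin": 3.0})
```

A higher threshold makes the global filter more conservative: contested messages such as that industry newsletter simply never reach the shared training set, which leaves them to each user's client-side filter.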


