Bayesian spam filtering for the masses
Grok Headline matches for Bayesian spam filtering for the masses
[OT] Safe spam filtering methods (was:
Is predictable spam filtering a
vulnerability?)
06/22/2004 11:56 PM The Fungi (Jun 20 2004)
Bayesian Pattern Filtering Library
0.1.0alpha
03/13/2003 06:02 PM A C++ library for building Bayesian filters.
Microsoft calls for outbound spam
filtering against spam
06/04/2004 10:42 AM Computer Weekly Jun 4 2004 2:14PM GMT
E-texts used against Bayesian
spam-filters
12/02/2003 07:37 AM Bayesian anti-spam filters count word frequency in suspect messages and
compare the results to profiles of word frequency in spam and ham.
Defeating this requires that your spam include a lot of natural human
prose. So spammers have started to mine Project Gutenberg and other
sources of human-generated ASCII, dumping random hunks of literature
into their messages to get around the filters.
Blogger and journalist Clive Thompson found an excerpt from Chapter 20
of The Master Key by Wizard of Oz author L Frank Baum in a message
that had as its subject line "the big unit" (no prizes for guessing
what the rest of it was hawking).
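To make the padding trick concrete, here is a minimal sketch of a word-frequency scorer in Python. It is a simplified naive-Bayes-style filter, not the code of any particular product. The toy corpora, the function names and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count, for each word, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-word evidence in log space, using add-one smoothing."""
    log_odds = math.log(n_spam / n_ham)  # prior odds from corpus sizes
    for word in set(text.lower().split()):
        p_word_spam = (spam_counts[word] + 1) / (n_spam + 2)
        p_word_ham = (ham_counts[word] + 1) / (n_ham + 2)
        log_odds += math.log(p_word_spam / p_word_ham)
    return 1 / (1 + math.exp(-log_odds))

# Toy training data, invented for this example.
spam = ["buy cheap pills now", "cheap pills big discount", "buy now big unit"]
ham = ["the wizard opened the master key",
       "meeting notes for the project",
       "the boy found the key in the garden"]

spam_counts, ham_counts = train(spam), train(ham)

plain = "buy cheap pills"
# The same pitch, padded with ordinary prose that resembles the ham corpus.
padded = plain + " the wizard opened the garden key for the boy"

p_plain = spam_probability(plain, spam_counts, ham_counts, len(spam), len(ham))
p_padded = spam_probability(padded, spam_counts, ham_counts, len(spam), len(ham))
print(f"plain:  {p_plain:.3f}")
print(f"padded: {p_padded:.3f}")
```

On this toy data the padded message scores far lower than the bare pitch, because each ham-like word pulls the combined odds toward "ham". Real filters differ in tokenization and in how they combine word scores, so this shows only the general effect.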
bogofilter -- Fast Bayesian Spam Filter
03/16/2003 01:31 PM bogofilter-0.11.1.3 - new stable release
SpamProbe - fast bayesian spam filter
03/27/2005 10:08 AM spamprobe-1.1x6 released
Bayesian spam rumination: when
word-frequency-histograms attack!
06/29/2004 10:40 AM Ed Felten has posted an intriguing rumination on the possible failure
modes of Bayesian spam-filtering -- filtering that uses word-frequency
statistics to classify email as spam or ham. As Ed points out,
Bayesian filters are trained by the spammers, who, by choosing the
vocabulary of their messages carefully, can make messages containing
certain words or phrases undeliverable on the Internet.
Now suppose a big spammer wanted to poison a particular word, so that
messages containing that word would be (mis)classified as spam. The
spammer could sprinkle the target word throughout the word salad in
his outgoing spam messages. When users classified those messages as
spam, the targeted word would develop a negative score in the users'
Bayesian spam filters. Later, messages with the targeted word would
likely be mistaken for spam.
This attack could even be carried out against a particular targeted
user. By feeding that user a steady diet of spam (or pseudo-spam)
containing the target word, a malicious person could build up a highly
negative score for that word in the targeted user's filter.
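The poisoning scenario can be sketched in a few lines of Python. This is a toy per-user filter under stated assumptions: the class name, the training messages and the target word "invoice" are hypothetical, and the score is a smoothed estimate of P(spam | word) with equal priors. The "negative score" in the text corresponds here to a value near 1.

```python
from collections import Counter

class WordFilter:
    """Toy per-user filter: tracks how many spam/ham messages contain each word."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message that the user has labelled 'spam' or 'ham'."""
        self.counts[label].update(set(text.lower().split()))
        self.totals[label] += 1

    def word_score(self, word):
        """Smoothed estimate of P(spam | word), assuming equal priors."""
        p_spam = (self.counts["spam"][word] + 1) / (self.totals["spam"] + 2)
        p_ham = (self.counts["ham"][word] + 1) / (self.totals["ham"] + 2)
        return p_spam / (p_spam + p_ham)

f = WordFilter()
for text in ["lunch on friday", "invoice attached for friday", "see you at lunch"]:
    f.train(text, "ham")
for text in ["cheap pills now", "win money now"]:
    f.train(text, "spam")

target = "invoice"  # hypothetical word the attacker wants to poison
before = f.word_score(target)

# The attacker sprinkles the target word into spam; the user marks each one as spam.
for _ in range(20):
    f.train(f"cheap pills {target} now", "spam")

after = f.word_score(target)
print(f"score for {target!r} before: {before:.2f}, after: {after:.2f}")
```

Before the attack the word leans toward ham; after twenty poisoned messages it leans toward spam, so legitimate mail containing it is more likely to be misclassified.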
Pre-Filtering The Spam
06/29/2004 03:54 PM My anti-spam system now uses a variety of server and client side
filters to help keep the damn stuff out of the inbox. Now, some are
suggesting an even earlier level "pre-filter" for spam. HP Labs has
made a fairly simple discovery that even without various email
authentication systems, it's pretty easy to get a quick determination
as to whether or not an email is spam. The system they developed looks
at whether or not the server sending you the email normally sends good
emails or sends spam. With that one determination, they can properly
pre-classify emails at a pretty high success
rate. It's not a replacement for a spam filter at all. They know
it's not that good. However, what it can do is do a pre-sort for
prioritization purposes -- so that good emails tend to make it through
the real spam filters faster. In a number of ways, it's pretty sad
that we now need "quality of service" setups for our email.
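As a rough illustration of the pre-sort idea, the Python sketch below ranks queued mail by the sending server's track record. It is not HP Labs' actual system. The class, the smoothing and the example host names are assumptions made up for this example.

```python
from collections import defaultdict

class ServerReputation:
    """Tracks, per sending server, how much of its past mail was good or spam."""

    def __init__(self):
        self.good = defaultdict(int)
        self.spam = defaultdict(int)

    def record(self, server, was_spam):
        """Update the history once a message has been fully classified."""
        if was_spam:
            self.spam[server] += 1
        else:
            self.good[server] += 1

    def good_fraction(self, server):
        """Smoothed share of good mail; an unseen server gets a neutral 0.5."""
        g, s = self.good[server], self.spam[server]
        return (g + 1) / (g + s + 2)

def presort(messages, reputation):
    """Order the queue so likely-good mail reaches the real filter first."""
    return sorted(messages,
                  key=lambda m: reputation.good_fraction(m["server"]),
                  reverse=True)

rep = ServerReputation()
for _ in range(9):
    rep.record("mail.example.org", was_spam=False)
    rep.record("bulk.example.net", was_spam=True)
rep.record("mail.example.org", was_spam=True)
rep.record("bulk.example.net", was_spam=False)

queue = [
    {"id": 1, "server": "bulk.example.net"},
    {"id": 2, "server": "new.example.com"},   # no history yet
    {"id": 3, "server": "mail.example.org"},
]
ordered = [m["id"] for m in presort(queue, rep)]
print(ordered)
```

Nothing is rejected here. The ranking only decides which messages the slower, more accurate filter sees first, which matches the "prioritization" role described above.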
ISPs Hesitate Over Spam Filtering
06/03/2004 02:24 AM As the spam battle rages on, most of the focus is on end-users and law
enforcement. Not too many people seem to focus on the role of ISPs,
who sometimes do take a more proactive role in stopping spam.
The problem, though, is that when the ISP filters spam, they often
run into issues with false positives. If the filters are too loose
(to avoid false positives) then too much spam gets through, and users
are upset. If the filters are too tight, important messages go
missing, and users are upset. Many ISPs are realizing, at the very
least, they need to let the end-user have access to the spam folder,
so they can occasionally sort through it for false positives - but
very few users ever bother to look through it. Some ISPs don't offer
any kind of filtering at all, claiming that they don't see how to make
money off of it - which seems especially short-sighted. If they can
offer sufficient spam filtering, they're much more likely to keep
customers than if they simply let everything through when customers
are looking to their providers to provide protection from the
onslaught of spam. No matter what, it's becoming clear that the spam
fight needs to be approached from various angles, and many customers
are likely to bail out on ISPs that don't at least offer a spam
filtering option.
New Method of Spam Filtering
02/19/2004 02:06 PM
Spam filtering, the next chapter
05/24/2004 09:17 AM I've been experiencing good results filtering out spam with a
combination of Popfile (a Bayesian classifier) and the built-in filter
in Eudora 6. I get well over 1,000 spams a day, so I need
accuracy both in identifying spams and avoiding false positives with
legitimate mail.
The biggest problem with my setup is that it all runs on the client
side. Popfile works as a transparent proxy that runs on my
Windows machine. I don't see the spam in my inbox, but I still
have to download it before it can be filtered. As the spam
volumes have increased, that has become an increasingly significant
burden. Every check pulls down scores of messages, most of which
wind up in the trash. I've had several cases where the sheer
numbers crash Eudora. Getting email through my Treo is basically a
waste of time, because it doesn't have the filters. If I'm on the
road and don't check my mail, there are thousands of messages waiting
for me when I get back.
I finally had time this weekend to set up filtering on the server
side. Werbach.com and my other domains run through a Web hosting
provider, Pair Networks, which offers a version of SpamAssassin.
The tricky part was configuring it to automatically filter or delete
messages, using procmail, rather than just putting something in the
email header for later processing on the client.
I think I have it working now. I'm using SpamAssassin on a
forgiving setting, because the client-side filters are still running
after the mail goes through. If I could just weed out 60% of my
spams before they reach my machine, life would be much better. So
far, it looks like I can do significantly better than that.
I'm still tweaking the set-up, so it's possible some legitimate email
will get stuck in the filters. If you write to me and don't get a
response for a while, please try again.
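The two-stage arrangement described above can be sketched as a simple routing rule in Python: a forgiving server-side cutoff that deletes only high-scoring mail, and a stricter client-side cutoff for what gets through. This is not the author's procmail configuration. The function, the thresholds and the list of scores are hypothetical values chosen for illustration.

```python
from collections import Counter

def route(score, server_threshold=10.0, client_threshold=5.0):
    """Decide where a message ends up, given a numeric spam score.

    Both thresholds are illustrative assumptions. The server stage is
    deliberately forgiving, since the client filter still runs afterwards.
    """
    if score >= server_threshold:
        return "dropped_on_server"   # never downloaded
    if score >= client_threshold:
        return "filtered_on_client"  # downloaded, then caught locally
    return "inbox"

# Hypothetical spam scores for ten incoming messages.
scores = [2.1, 14.3, 7.8, 22.0, 0.4, 11.5, 6.2, 18.9, 12.7, 3.3]

outcome = Counter(route(s) for s in scores)
saved = outcome["dropped_on_server"] / len(scores)
print(dict(outcome))
print(f"share never downloaded: {saved:.0%}")
```

Raising `server_threshold` lowers the risk of losing legitimate mail on the server, at the cost of downloading more spam for the client to handle.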
Using AI for Spam Filtering (w/ Source
Code)
07/11/2004 09:20 AM
Spam filtering with a human touch
09/21/2004 01:11 PM One company is offering a novel solution to the problem of spam. Will
spam filtering done by humans be a hit?
Re: Is predictable spam filtering a
vulnerability?
06/18/2004 01:01 PM Joel Eriksson (Jun 17 2004)
Verizon sued over spam filtering
02/01/2005 08:53 PM Upset Verizon customers have filed a class-action lawsuit over the
telco's aggressive spam filtering. Verizon's blacklists are allegedly
blocking all mail from some countries.
A Unique Approach to Spam Filtering
07/06/2004 03:03 AM Frontgate MX brings a New Level of Simplicity to Personal E-mail
Protection With a Unique "Single Step" Approach to Spam Filtering.
[PRWEB Jul 6, 2004]
Human-Powered Spam Filtering
09/20/2004 10:28 AM
Is predictable spam filtering a
vulnerability?
06/17/2004 03:44 AM R Armiento (Jun 16 2004)
I thought that our spam filtering had
suddenly got...
12/29/2003 10:31 PM I thought that our spam filtering had suddenly gotten way better, but
turns out, my pyra.com mail just started bouncing instead of
forwarding. Bummer. If you tried to reach me at pyra.com try doing so
at google.com. Or just wait (upon DNS updating, it should be fixed).
Extreme Spam Filtering – When Filters
and Blacklists Are Not Enough.
02/07/2005 01:05 AM Protect Multiple POP, Yahoo, Hotmail, Gmail, or IMAP E-mail Accounts
from Spammers with 0Spam.com. Compatible with all E-mail clients and
operating systems. [PRWEB Feb 6, 2005]
Mailsmith gets server-side spam
filtering, more
07/21/2004 11:18 AM Bare Bones Software today announced the release of Mailsmith 2.1.2,
the latest version of its powerful e-mail client...
Microsoft pushes spam-filtering
technology
06/24/2005 03:25 PM ZDNet Jun 23 2005 2:00AM GMT
Microsoft calls for outbound filtering
against spam
06/04/2004 07:29 AM SAN JOSE, California -- In its continuing fight against unsolicited
commercial e-mail, Microsoft Corp. plans to filter outgoing messages
on its consumer mail services and is busy developing new "proofing"
technologies, the software maker's chief spam fighter said Thursday.
Notice to customers using e-mail
filtering "SPAM" software
11/15/2003 11:03 AM ...
PowerMail's user interface, spam
filtering updated
05/24/2004 09:10 AM CTM Development has released PowerMail 5.0, a major upgrade of the
Mac OS X mail client...
Mailsmith 2.1.2 adds server-side spam
filtering, more
07/21/2004 11:12 AM Bare Bones Software Inc. on Wednesday released an update to
Mailsmith, their e-mail client for Mac OS X. New features in this release
include support for server-side spam filtering, the ability to process
incoming messages with Unix tools during download, and new preferences
and interface enhancements.
Eudora 6.0: E-Mail Favorite Gets
Built-In Spam Filtering But Still Shows
Its Age
12/19/2003 11:32 AM Eudora is an undeniably powerful product. It's fast -- especially when
searching thousands of archived messages -- and quite flexible once
you take the time to learn its quirks. Its new spam-filtering features
are first-rate, especially since they support third-party
spam-filtering tools. By Jason Snell (Macworld via MyAppleMenu)
Re: Is predictable spam filtering a
vulnerability? (silently dropping
messages)
06/22/2004 08:18 PM Martin Mačok (Jun 22 2004)
Re: Is predictable spam filtering a
vulnerability? (silently dropping
messages)
06/24/2004 04:28 PM Stephen Warren (Jun 24 2004)
Spam, spam, spam, spam ... Canada
targets unwanted email (AFP)
05/12/2004 04:17 AM AFP - Canada unveiled a new action plan to combat unsolicited
commercial e-mail, nicknamed spam, which jams inboxes and clogs
Internet traffic worldwide.
Java Bayesian
04/01/2005 06:56 AM Is there a decent open source Java Bayesian package that is not GPL
or similarly restricted from commercial use? I am aware of only
Classifier4J. Preferably, it should be optimized for server
applications and high performance.
Bayesian Aggregator
12/02/2003 08:47 AM In a comment, Kevin Jordan writes: 348North News is a normal
aggregator in much of the way you think of it. However, it allows me
to identify keywords or themes that it puts together into phrases
— and then matches up the phrases with like articles. Like a
cross between Google News and Daypop (but that makes it sound much
more complex than it is). If you want to see an "interests" based
summary for me, check out the Phrase Index. I use fairly general
keywords so as not to miss out on the future items. I haven't tried...
Subconsciously, People may be Bayesian
01/22/2004 02:48 AM DAVID LEONHARDT writes in a NY Times article about people playing
the odds of everyday life with Bayesian Analysis. He describes new
research, recently published in Nature, "which stands out because it
offers a detailed window into how the Bayesian thought process works,
showing the point when uncertainty becomes great enough to give past
experience an edge over current observation." Bayesian Analysis, among
researchers, is "the combining of new information with conventional
wisdom." I do agree about their reliance on past observations, but I
believe that they have underestimated the role of future orientation
in the whole mix of decision making.
Bayesian Aggregation, Part I
02/14/2003 03:23 PM On Monday I configured Scenario 3 of my Bayesian Aggregation
experiment, building a "good" corpus of my weblog entries and...
Bayesian Filter Library
03/13/2003 11:34 AM 0.1.0alpha release
Working with Bayesian Categorizers
11/19/2003 08:11 PM Bayesian classification has proved a powerful weapon against spam. Jon
Udell tries to find out whether it can be put to use in other spheres
of content categorization.
Bayesian Aggregation, Part II
02/23/2003 05:22 PM iFile finally classified something as belonging on my weblog, but I
have no idea why... Justin Rudd's Busy weekend complicated...
Working with Bayesian categorizers
12/02/2003 01:38 AM There's been some discussion in the blog world about using a Bayesian
categorizer to enable a person to discriminate along various
interest/non-interest axes. I took a run at this recently and,
although my experiments haven't been wildly successful, I want to
report them because I think the idea may have merit. [Full story: O'Reilly
Network: Working with Bayesian Categorizers]
This month's O'Reilly Network column was a struggle because
categorization itself is a struggle. I remain convinced that the
automated classifiers that are doing such a good job beating back the
tide of spam will also turn out to be more generally useful. But
finding the right synergy between an automated assistant and a human
overseer is a subtle and tricky thing. ...
Devshed: Implement Bayesian Inference
with PHP
01/06/2005 09:24 AM DevShed has a new article posted
today for all of those interested in a better way to filter out
information/spam messages from your data -
Implement Bayesian inference using PHP, Part 1.