stargeek


Using Bloom Filters

Using Bloom Filters 04/09/2004 04:00 PM

Perl hashes make set membership easy at the cost of memory usage. A lesser-known technique, Bloom filters, trades a tunable false-positive rate for compactness -- and has interesting applications for privacy concerns. Maciej Ceglowski explains the theory and practice of Bloom filters.
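As a rough illustration of the trade-off described above, here is a minimal Bloom filter sketch in Python (not the Perl code from the article). The class name, the parameter names, and the use of two halves of a SHA-256 digest to derive bit positions are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k positions from two halves of one SHA-256 digest
        # (double hashing). This is an illustrative choice, not the
        # scheme used in the article.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        # Set every bit position that the item hashes to.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Present only if every one of the item's bits is set.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter(num_bits=1024, num_hashes=4)
for word in ("perl", "hash", "bloom"):
    seen.add(word)

print("perl" in seen)    # True: added items are always reported present
print("python" in seen)  # usually False, but a false positive is possible
```

The filter stores only bits, never the items themselves, which is where both the compactness and the privacy angle come from.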




This is a GrokNews Entry: (what is grok?)





Similar Items

Using Bloom Filters

Grok Headline matches for Using Bloom Filters

Bloom Filters


Bloom Filters 04/28/2004 12:14 AM

Recent posts about LOAF, which uses a Bloom filter, created a small surge of discussion about Bloom filters, the most notable being the Using Bloom Filters article at Perl.com by Maciej Ceglowski, whom I like to remember as the fish guy (visit his blog to see why).

I went fishing for some Bloom filter code but couldn't find a general library in either Java or C++.  There was one for Perl but...  Anyhow, it's probably because there isn't much code needed.  Most of the work in a Bloom filter is fine-tuning the parameters and choosing the right hashing function, so the code itself doesn't really matter.
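The parameter tuning mentioned above usually comes down to the standard sizing formulas for a Bloom filter: for n expected items and a target false-positive rate p, use about m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. A small sketch in Python, with function names invented for this example:

```python
import math


def bloom_parameters(expected_items, false_positive_rate):
    """Return (num_bits, num_hashes) for a target false-positive rate."""
    # m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-expected_items * math.log(false_positive_rate)
                         / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to a whole number of hash functions
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


def estimated_false_positive_rate(num_bits, num_hashes, items):
    """Approximate rate after inserting `items` elements."""
    # p is roughly (1 - e^(-k * n / m))^k
    return (1 - math.exp(-num_hashes * items / num_bits)) ** num_hashes


bits, hashes = bloom_parameters(expected_items=1_000_000,
                                false_positive_rate=0.01)
print(bits, hashes)
print(estimated_false_positive_rate(bits, hashes, 1_000_000))
```

For a million items at a 1% false-positive rate this works out to roughly 9.6 million bits (about 1.2 MB) and 7 hash functions.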

Besides Maciej's article, I found these pages useful:

BF is pretty simple stuff but useful in many areas.  I am thinking of using it to detect 'access devices' (user name, password, SSN, credit card numbers, etc.) being submitted translucently (translucent as in Translucent Database) so I can throw up a dialog warning to the user.
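One way the idea above could look, as a sketch only: load the sensitive values into a Bloom filter once, then test each outgoing form field against it. Everything here (the function names, the filter size, the sample values) is hypothetical, and the salted SHA-256 hashing is just one convenient choice.

```python
import hashlib

NUM_BITS = 1 << 16   # filter size in bits (illustrative)
NUM_HASHES = 5       # hash positions per value (illustrative)


def positions(value):
    # Salted SHA-256 digests stand in for k independent hash functions.
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def build_filter(secrets):
    bits = 0  # a Python int used as a bit set
    for secret in secrets:
        for pos in positions(secret):
            bits |= 1 << pos
    return bits


def might_be_secret(bits, value):
    # True for every stored secret; rarely True for anything else.
    return all((bits >> pos) & 1 for pos in positions(value))


def fields_to_warn_about(bits, form):
    return [name for name, value in form.items()
            if might_be_secret(bits, value)]


# Hypothetical sample data: a password and a test card number.
known = build_filter(["hunter2", "4111111111111111"])
form = {"comment": "hello world", "notes": "hunter2"}
print(fields_to_warn_about(known, form))  # the list always includes "notes"
```

Because the filter holds only bit positions, the program doing the check never needs to keep the passwords or card numbers themselves around. A false positive here just means an unnecessary warning dialog.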


Spam-filters bounce email about spam-filters


Spam-filters bounce email about spam-filters 01/19/2004 01:55 PM
Spam-filters are trained to catch anything that looks like spam, including discussions of spam and spam-filters.
"Patterns of e-mail use are definitely being impacted both by spam and by antispam filters," said Craig Hughes, chief architect at McAfee Security and co-developer of the open-source SpamAssassin spam-filter project.

"I myself run into the problem all the time, mainly because what I'm corresponding about frequently involves discussions of spam or particular spam strategies..."

"I've lost count of the number of e-mails that get bounced back thanks to spam filters getting triggered by completely innocent words and phrases," said Suresh Ramasubramanian, head of security and antispam operations for managed mail-services firm Outblaze.

Link

"What bloom are you"


"What bloom are you" 08/10/2004 08:42 AM

Bloom-Filter-0.03


Bloom-Filter-0.03 07/21/2004 01:01 AM

How Gary Bloom


How Gary Bloom 07/10/2004 12:44 PM
ZDNet Jul 10 2004 4:32PM GMT

Bloom-Filter-0.02


Bloom-Filter-0.02 04/21/2004 05:11 PM

Is Lilly About to Bloom?


Is Lilly About to Bloom? 04/18/2005 02:43 PM
Two new diabetes products could add a little punch to this drugmaker's top line.

Let a Thousand Reactors Bloom


Let a Thousand Reactors Bloom 09/02/2004 04:38 AM
Explosive growth has made the People's Republic of China the most power-hungry nation on Earth. Get ready for the mass-produced, meltdown-proof future of nuclear energy. By Spencer Reiss from Wired magazine.

The Bloom Is Off the ASCO Rose


The Bloom Is Off the ASCO Rose 06/11/2004 02:02 AM
Business Week Jun 11 2004 6:06AM GMT

Space Artistry in Bloom


Space Artistry in Bloom 07/09/2004 04:44 AM
Artist Martin Naroznik has a vision to boldly grow where no one has grown before, and NASA finds it fascinating. By Mark Baard.

Chip sales bloom in June


Chip sales bloom in June 08/03/2004 04:21 PM
CNET News.com Aug 3 2004 8:00PM GMT

Veritas CEO Gary Bloom Unplugged


Veritas CEO Gary Bloom Unplugged 07/08/2004 10:42 AM
ZDNet Jul 8 2004 1:58PM GMT

Bloom County on Rather, back in 1984


Bloom County on Rather, back in 1984 09/17/2004 12:58 AM
Bloom County called this one .. CBS

mournival.com/2004/09/found-my-old-bloom-county-books.html


Stinky Flower Set to Bloom After 60 Years (AP)


Stinky Flower Set to Bloom After 60 Years (AP) 06/24/2004 08:14 AM
AP - A giant exotic plant that has not bloomed in the Northeast in more than 60 years is ready to flower at the University of Connecticut's greenhouses. The "corpse flower" has the odor of 3-day-old road kill, and UConn botanists couldn't be more excited.

Spring bloom for property market


Spring bloom for property market 03/28/2005 06:01 AM
UK house prices ease a touch in March, encouraging more house hunters back to the market, the latest Hometrack survey says.

Let a thousand conspiracy theories bloom


Let a thousand conspiracy theories bloom 12/17/2004 06:33 PM

I'm about to hit the sack, but current indications are that Bush has won Ohio by a couple of percentage points and thus has been re-elected as President of the United States.

Ohio. Isn't that the state that Diebold president Walden O'Dell promised to deliver to the Republicans?

I don't know if Ohio voters used Diebold machines. If they did, I'm certainly not about to say that the machines were fixed in any way. But the problem with voting machines without a paper trail is that there's no way anyone can be absolutely certain that the election wasn't stolen. In a modern democracy, that just ain't healthy.


Let a Million Videos Bloom Online


Let a Million Videos Bloom Online 12/31/2004 06:44 AM
Let a Million Videos Bloom Online .. this article

businessweek.com/bwdaily/dnflash/dec2004/nf20041229_0845_db01 6.htm?campaign_id=rss_daily


Orlando Bloom is king of Google


Orlando Bloom is king of Google 12/26/2004 07:28 AM
Article.wn.com - Sun Dec 26, 02:45 am GMT

Stinky Flower Set to Bloom After 60 Years


Stinky Flower Set to Bloom After 60 Years 06/24/2004 01:30 AM
Abcnews.go.com - Wed Jun 23, 07:37 pm GMT

Inter-Korean Relations Bloom in Cyberspace


Inter-Korean Relations Bloom in Cyberspace 07/05/2004 04:42 AM
Hankooki Jul 5 2004 9:03AM GMT

Let A Million Pirate Radio Stations Bloom!


Let A Million Pirate Radio Stations Bloom! 05/07/2004 01:28 PM
Pirate radio stations are nothing new, but one guy is now trying to train more people in how to set up their own pirate radio station (with the more politically safe sounding name: microbroadcasting), with the idea of creating a (radio) wave of civil disobedience about how the FCC allocates radio licenses. Of course, plenty of radio broadcasters aren't too thrilled about this idea and are fighting heavily against it. The pirate stations are mostly breaking the law, and it's unlikely enough of them will show up to make a major difference. Besides, you have to wonder why it's worth bothering with radio anymore. Why not just set up a station online?

Technology sales to bloom in 2004 - EU study


Technology sales to bloom in 2004 - EU study 02/19/2004 12:41 PM
Looking good for IT and telecoms

How Gary Bloom pilots Veritas past utility titans


How Gary Bloom pilots Veritas past utility titans 07/08/2004 10:42 AM
ZDNet Jul 8 2004 1:58PM GMT

WORLD: Bloom of the bl0g marks internet's rising tide of influence


WORLD: Bloom of the bl0g marks internet's rising tide of influence 12/29/2004 08:44 PM
Asia Media Dec 29 2004 11:58PM GMT

How .Mac filters spam


How .Mac filters spam 05/14/2004 01:37 PM
A recent knowledge base entry describes in fairly good detail how spam is filtered from .Mac accounts. According to the entry, Apple uses software from Brightmail along with list-based filtering. Apparently Apple also does something on its own:
.Mac also monitors all incoming message activity for trends. This information can reveal a previously unknown source of spam when they begin to send mail to members of the .Mac community.
It's an interesting look behind .Mac. For our .Mac using readers: how good is the spam blocking?

DeGan Filters


DeGan Filters 04/27/2004 11:49 AM

De Gan Filters. The good Mr. Canter put me in touch with Joel De Gan, who's working on the pDNS (a system modelled...

I'll spare my readers the details - but if you're really nerdy, a math kind of gal or guy or in general wanna see the results of great open collaborative work - check it out.  And continue to check out PeoplesDNS.com.

The idea here is that ideas and their implementation turn into money.  Not the other way around.  This is what I wish VCs and investors understood better.  That it's NOT about money.

Sure making money is important, but the ideas are more important.  The ideas change the world, money buys bombs and war.  Ideas make the world a better place, money just keeps the rich staying richer.  But we software gals and guys like to get paid - so we have to work for money.  But there's always working for ourselves - too.

I just wish some of those 'rich guys' (i.e. billionaire entrepreneurs) would throw a little cash (like $200M) at some open source projects and see what happens. I bet a lot more than the next John Doerr investment.

One day the software industry is gonna wake up and find that we don't need their VCs anymore.  But we will need lots of enterprise salespeople.  That's what VCs will turn into (what they REALLY are right now.)

Have I told you how much I like hanging out with really smart people?

Here's Joel's explanation of the De Gan filter:

Tue Apr 27 9:03:06 CDT 2004
I was introduced to Danny via Marc Canter, with whom I am doing some open source work; I am creating a peoplesdns for use with FOAF files for the peepagg project. Danny thought I might make good use of an old form of hash filter created in the early 70's called a Bloom filter (named after its late creator Burton Bloom). I realized that this filter was amazing on smaller data sets but needed something to be able to handle huge sets, so I created the De Gan filter (following the naming convention), which can hold huge datasets in a one-way hash matrix. It is not efficient with smaller datasets, but with larger datasets it is incrementally more efficient in space than the binary Bloom filter.

For instance, DNS can be held in around 75M in a Bloom filter; with the De Gan filter it can be held in around 20M. With larger datasets this continues (a 250M Bloom filter could be held in a 65M De Gan filter, and so forth).

Danny's blog entry about the filters

Joel De Gan and Danny Ayers are dudes that I knew would hit it off.  And these Drupal dudes are happening as well.  All that was needed was a little Type-A megalomaniac, visionary thang - that's where I come in.  Everybody's got a role - you line up the right team and magic can happen.

OK - back to testing the world's first true DLA.


Dueling Filters


Dueling Filters 09/23/2004 08:29 AM
Here's a message I received this morning: This notification has been sent to inform you that a message has been quarantined by InterScan MSS for SMTP. Subject: Blog draft Rule: Incoming Policy Filter: CONTENT FILTER Problem: Filter Type: Advanced Content Filter Event: at MAILBODY: CONTENT , "shit" violated Action on Attachment: NOT MODIFY Action: Quarantine So, this message quotes the word that triggered the quarantine. Apparently, then, there are uses of this word that are acceptable... ...unless, of course, my own spam filter were set to protect my virgin ears from such filth, in which case it would have sent its...

Pipes and filters


Pipes and filters 06/17/2005 06:45 PM
I still remember the day, many years ago, when a wise old programmer looked over my shoulder and said, "Ah, Grasshopper, you need a pipe!" and so set me on the path to true enlightenment.

GNU Talk Filters 2.3.2


GNU Talk Filters 2.3.2 06/27/2004 06:24 AM
A collection of humorous text translators.

Working with Filters


Working with Filters 06/18/2004 04:11 PM
See how you can use filters to help users easily retrieve the Breeze content they need.

Networks-n-Filters


Networks-n-Filters 08/20/2004 11:55 PM
New Release: Networks-n-Filters-1.1.0

GNU Talk Filters 2.2


GNU Talk Filters 2.2 12/03/2003 02:40 PM
A collection of humorous text translators.

GNU Talk Filters 2.3


GNU Talk Filters 2.3 12/07/2003 10:53 PM
A collection of humorous text translators.

Who Needs Art When You Have Photoshop Filters?


Who Needs Art When You Have Photoshop Filters? 08/27/2004 02:00 PM
Roland Piquepaille writes "Computer scientists from the University of Bath have written software that transforms your ordinary photographs and movies into cubist works of art and animation reminiscent of Picasso. They trained their software to identify important elements of a face, such as a nose, eye or mouth, until the computer learned how to recognize them on its own. This was achieved by giving the software a kind of 'aesthetic sense.' Then, by 'using photographs of a subject taken from multiple points of view, the software automatically picks out important areas within the image, which are cut out as chunks. The chunks are statistically shuffled and a few of them randomly selected and distorted into a "cubist" composition ready for digital painting.' The software is not yet publicly available, but software and animation companies have expressed interest. My blog contains additional references and images."

Internet Filters Are: [Good] [Bad] [Both]


Internet Filters Are: [Good] [Bad] [Both] 07/03/2004 03:23 PM
For all the fuss, filters alone may never prove to be the solution to keeping smut away from young Internet users.

Internet Filters? Sometimes Good, Sometimes Bad


Internet Filters? Sometimes Good, Sometimes Bad 07/06/2004 03:14 AM
The NY Times is making fun of both the ACLU and the government for what appear to be contradictory positions on internet filters. They point to two big cases in the past two years, where each side appears to take an opposing viewpoint. In the recent ruling on porn blocking, the ACLU argued that filters were fine and we don't need a law to stop online porn. The government argued the opposite viewpoint, saying filters were not effective by themselves. However, last year, the ACLU argued that filters were a terrible way to stop porn in the case over whether or not libraries could be required to put internet filters in place in order to receive federal funds. When you look at the details, though, the positions aren't as contradictory as the NY Times would like you to believe. The ACLU believes that no one should be forced by the government into using filters. However, if someone (or some organization) decides to use them on their own, that's perfectly reasonable. The government's position is just as internally consistent -- looking for any way to stop children from accessing porn. The difference is that the two sides disagree on how effective the means are, and whether or not they block out certain other constitutional rights.

Spam Filters & .NET 2003 COM Add-Ins


Spam Filters & .NET 2003 COM Add-Ins 05/19/2004 05:39 PM
DDJ May 19 2004 9:02PM GMT

Porn Filters For Adults


Porn Filters For Adults 04/16/2004 03:29 AM
The official target market for internet porn filters tends to be parents who want to make sure their children don't run across porn sites on the web (purposely or by accident). However, according to this Salon article, an increasing number of sales are going to adults who want to keep their spouses from accessing porn. There certainly have been plenty of stories about online porn or "online infidelity" leading to the breakup of marriages, but I do wonder whether or not placing a filter on the computer is just a way of whitewashing over a deeper issue. Obviously, if porn is getting in the way of a relationship, then there are bigger issues than just access to porn.

How to skirt filters when spamming


How to skirt filters when spamming 06/06/2005 12:06 AM

Grok Description matches for Using Bloom Filters
GrokA matches for Using Bloom Filters

Using Bloom Filters

The following phrases have been identified by the grok system as matching this entry:

















Also check out:


Grok

Ipod Porn on the
Rise

Brief Abstract of
Wikipedia's
Mesothelioma Cancer
page

Get first aid
instructions in your
cell phone

IE is crap

JSPWiki gains
podcasting support

Dasani Tapped Out of
Europe

DHB Industries, Inc.
Shorts Roost at
Children's Place

VoIP: Why the Light
Touch?

Laidlaw's Bumpy Ride

Rite Aid Wrings Out
Profits

A Bet on Taser

Dell Springs Forward

CA Executives Make
the Walk

Yippee for Yahoo!

NewsGator 2.0
service release for
Office XP SP3

Southland's Census
Story, in a Word:
Boom! (Los Angeles
Times)

Curbs on Outside
Deals at NIH Urged
(Los Angeles Times)

A Passion for
Rivalry in Brazil
(Los Angeles Times)

One Year Later:Where
Is Iraq? (Los
Angeles Times)

Testimony Paints
Image of Passive
Inner Circle (Los
Angeles Times)

Panel's partisan
divide exposed
(USATODAY.com)

Panel's focus falls
on Aug. 6 Bush brief
(USATODAY.com)

Ancient burial may
mark first human-cat
friendship
(USATODAY.com)

Fastest-growing
title more headache
than honor
(USATODAY.com)

Rice: No 'silver
bullet' on 9/11
(USATODAY.com)

Rotation Reassessed
as Toll Spikes
(washingtonpost.com)

Anti-U.S. Uprising
Widens in Iraq;
Marines Push Deeper
Into Fallujah
(washingtonpost.com)

Marines Try to Quell
'a Hotbed of
Resistance'
(washingtonpost.com)

Zeroing In on One
Classified Document
(washingtonpost.com)

General May Bolster
Force in Iraq;
Militias Kidnap a
Dozen Foreigners
(washingtonpost.com)

Kerry Says Bush
Makes U.S. Job in
Iraq 'Lot Tougher'
(Reuters)

Iraqi Insurgents Say
Seize Six Foreigners
(Reuters)

At Least 2 U.S.
Soldiers Killed in
Iraq on Friday
(Reuters)

White House Works to
Declassify Al Qaeda
Threat Memo
(Reuters)

Iraq in Turmoil on
Anniversary of
Saddam's Fall
(Reuters)

Golfer Rose Has
2-Stroke Lead at
Masters (AP)

Janet Jackson to Be
Live on NBC's SNL
(AP)

Pope Leads Good
Friday Procession
Prayer (AP)

Police Take Former
Enron Exec to
Hospital (AP)

Sharon Calls for
Party Vote on
Withdrawal (AP)

Gore Meets Privately
With 9/11 Commission
(AP)

Expert May Help
Decide Neb. Abortion
Case (AP)

AP Poll: Bush, Kerry
Still in Close Fight
(AP)

Iraq Abduction Puts
Japan Gov't in
Crisis (AP)

U.S. Troops Retake
Most of Key Iraqi
City (AP)

BLOGGING AND
PERSONALITY
CHANGE

YOUR
LAW

HOW WE
LEARN, AND WHY WE
DON'T

WHEN WILL
THEY EVER
LEARN?

CAMERA'S
ROLLING

CENTRALIZE/
DECENTRALIZE

GIVING
BACK

THE SALON
BLOG SILLY EASTER
EGG HUNT

MacUser Review:
SuperCard 4.1.2
