org.apache.hadoop.util.bloom
Class BloomFilter

java.lang.Object
  extended by org.apache.hadoop.util.bloom.Filter
      extended by org.apache.hadoop.util.bloom.BloomFilter
All Implemented Interfaces:
Writable
Direct Known Subclasses:
RetouchedBloomFilter

public class BloomFilter
extends Filter

Implements a Bloom filter, as defined by Bloom in 1970.

The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by the networking research community in the past decade thanks to the bandwidth efficiencies that it offers for the transmission of set membership information between networked hosts. A sender encodes the information into a bit vector, the Bloom filter, that is more compact than a conventional representation. Computation and space costs for construction are linear in the number of elements. The receiver uses the filter to test whether various elements are members of the set. Though the filter will occasionally return a false positive, it will never return a false negative. When creating the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.

Originally created by European Commission One-Lab Project 034819.

See Also:
The general behavior of a filter, Space/Time Trade-Offs in Hash Coding with Allowable Errors

Field Summary
 
Fields inherited from class org.apache.hadoop.util.bloom.Filter
hash, hashType, nbHash, vectorSize
 
Constructor Summary
BloomFilter()
          Default constructor - use with readFields
BloomFilter(int vectorSize, int nbHash, int hashType)
          Constructor
 
Method Summary
 void add(Key key)
          Adds a key to this filter.
 void and(Filter filter)
          Peforms a logical AND between this filter and a specified filter.
 int getVectorSize()
           
 boolean membershipTest(Key key)
          Determines wether a specified key belongs to this filter.
 void not()
          Performs a logical NOT on this filter.
 void or(Filter filter)
          Peforms a logical OR between this filter and a specified filter.
 void readFields(DataInput in)
          Deserialize the fields of this object from in.
 String toString()
           
 void write(DataOutput out)
          Serialize the fields of this object to out.
 void xor(Filter filter)
          Peforms a logical XOR between this filter and a specified filter.
 
Methods inherited from class org.apache.hadoop.util.bloom.Filter
add, add, add
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

BloomFilter

public BloomFilter()
Default constructor - use with readFields


BloomFilter

public BloomFilter(int vectorSize,
                   int nbHash,
                   int hashType)
Constructor

Parameters:
vectorSize - The vector size of this filter.
nbHash - The number of hash function to consider.
hashType - type of the hashing function (see Hash).
Method Detail

add

public void add(Key key)
Description copied from class: Filter
Adds a key to this filter.

Specified by:
add in class Filter
Parameters:
key - The key to add.

and

public void and(Filter filter)
Description copied from class: Filter
Peforms a logical AND between this filter and a specified filter.

Invariant: The result is assigned to this filter.

Specified by:
and in class Filter
Parameters:
filter - The filter to AND with.

membershipTest

public boolean membershipTest(Key key)
Description copied from class: Filter
Determines wether a specified key belongs to this filter.

Specified by:
membershipTest in class Filter
Parameters:
key - The key to test.
Returns:
boolean True if the specified key belongs to this filter. False otherwise.

not

public void not()
Description copied from class: Filter
Performs a logical NOT on this filter.

The result is assigned to this filter.

Specified by:
not in class Filter

or

public void or(Filter filter)
Description copied from class: Filter
Peforms a logical OR between this filter and a specified filter.

Invariant: The result is assigned to this filter.

Specified by:
or in class Filter
Parameters:
filter - The filter to OR with.

xor

public void xor(Filter filter)
Description copied from class: Filter
Peforms a logical XOR between this filter and a specified filter.

Invariant: The result is assigned to this filter.

Specified by:
xor in class Filter
Parameters:
filter - The filter to XOR with.

toString

public String toString()
Overrides:
toString in class Object

getVectorSize

public int getVectorSize()
Returns:
size of the the bloomfilter

write

public void write(DataOutput out)
           throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.

Specified by:
write in interface Writable
Overrides:
write in class Filter
Parameters:
out - DataOuput to serialize this object into.
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.

For efficiency, implementations should attempt to re-use storage in the existing object where possible.

Specified by:
readFields in interface Writable
Overrides:
readFields in class Filter
Parameters:
in - DataInput to deseriablize this object from.
Throws:
IOException


Copyright © 2009 The Apache Software Foundation