org.apache.hadoop.mapreduce.lib.aggregate
Class UniqValueCount

java.lang.Object
  extended by org.apache.hadoop.mapreduce.lib.aggregate.UniqValueCount
All Implemented Interfaces:
ValueAggregator<Object>
Direct Known Subclasses:
org.apache.hadoop.mapred.lib.aggregate.UniqValueCount

@InterfaceAudience.Public
@InterfaceStability.Stable
public class UniqValueCount
extends Object
implements ValueAggregator<Object>

This class implements a value aggregator that dedupes a sequence of objects.
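The deduplication contract can be illustrated with a minimal, self-contained sketch. UniqSketch below is a hypothetical stand-in, not the Hadoop class: it only mirrors the method names documented on this page, and it assumes that values are keyed by their toString() form and kept in sorted order, which this page does not state.

```java
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in that mirrors the method names documented on this page.
class UniqSketch {
    // Assumption: values are keyed by their string form and kept sorted.
    private final TreeMap<Object, Object> uniqItems = new TreeMap<>();

    void addNextValue(Object val) {
        // A repeated value maps to the same key, so it is stored only once.
        uniqItems.put(val.toString(), "1");
    }

    String getReport() {
        return String.valueOf(uniqItems.size());
    }

    Set<Object> getUniqueItems() {
        return uniqItems.keySet();
    }

    public static void main(String[] args) {
        UniqSketch agg = new UniqSketch();
        for (String v : new String[] {"b", "a", "b", "c", "a"}) {
            agg.addNextValue(v);
        }
        System.out.println(agg.getReport());      // prints 3
        System.out.println(agg.getUniqueItems()); // prints [a, b, c]
    }
}
```

Five values go in, three distinct ones come out; the report is the count rendered as a string.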


Field Summary
static String MAX_NUM_UNIQUE_VALUES
           
 
Constructor Summary
UniqValueCount()
          the default constructor
UniqValueCount(long maxNum)
          constructor that takes a limit on the number of unique values to keep
 
Method Summary
 void addNextValue(Object val)
          add a value to the aggregator
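 ArrayList<Object> getCombinerOutput()
          return a list of the unique objects, for use by a combiner
 String getReport()
          return the number of unique objects aggregated
 Set<Object> getUniqueItems()
          return the set of the unique objects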
 void reset()
          reset the aggregator
 long setMaxItems(long n)
          Set the limit on the number of unique values
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_NUM_UNIQUE_VALUES

public static final String MAX_NUM_UNIQUE_VALUES
See Also:
Constant Field Values

Constructor Detail

UniqValueCount

public UniqValueCount()
the default constructor


UniqValueCount

public UniqValueCount(long maxNum)
constructor that takes a limit on the number of unique values

Parameters:
maxNum - the limit on the number of unique values to keep.

Method Detail

setMaxItems

public long setMaxItems(long n)
Set the limit on the number of unique values

Parameters:
n - the desired limit on the number of unique values
Returns:
the new limit on the number of unique values
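This page does not spell out how the limit interacts with aggregation. The sketch below shows one plausible reading under explicit assumptions: the hypothetical CappedUniqSketch class stops accepting new distinct values once the cap is reached, and its setMaxItems never lowers the cap below the number of values already held, which is one reason the returned limit could differ from the requested one. Neither rule is confirmed by this page.

```java
import java.util.TreeSet;

// Hypothetical stand-in; the capping rules here are assumptions, not Hadoop's.
class CappedUniqSketch {
    private final TreeSet<String> items = new TreeSet<>();
    private long maxItems = Long.MAX_VALUE;

    // Assumed rule: the cap cannot drop below the number of values already
    // held. Returns the cap that is actually in effect.
    long setMaxItems(long n) {
        maxItems = Math.max(n, items.size());
        return maxItems;
    }

    // Assumed rule: a new distinct value is ignored once the cap is reached.
    void addNextValue(Object val) {
        String key = val.toString();
        if (items.contains(key) || items.size() < maxItems) {
            items.add(key);
        }
    }

    String getReport() {
        return String.valueOf(items.size());
    }

    public static void main(String[] args) {
        CappedUniqSketch agg = new CappedUniqSketch();
        System.out.println(agg.setMaxItems(2)); // prints 2
        for (String v : new String[] {"a", "b", "c", "a"}) {
            agg.addNextValue(v);
        }
        System.out.println(agg.getReport());    // prints 2 ("c" was ignored)
        System.out.println(agg.setMaxItems(1)); // prints 2 (two values already held)
    }
}
```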

addNextValue

public void addNextValue(Object val)
add a value to the aggregator

Specified by:
addNextValue in interface ValueAggregator<Object>
Parameters:
val - an object.

getReport

public String getReport()
Specified by:
getReport in interface ValueAggregator<Object>
Returns:
the number of unique objects aggregated, as a string

getUniqueItems

public Set<Object> getUniqueItems()
Returns:
the set of the unique objects

reset

public void reset()
reset the aggregator

Specified by:
reset in interface ValueAggregator<Object>

getCombinerOutput

public ArrayList<Object> getCombinerOutput()
Specified by:
getCombinerOutput in interface ValueAggregator<Object>
Returns:
a list of the unique objects. The return value is expected to be used by a combiner.
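One way to see why the combiner output is the list of unique values rather than a count: counts from separate partial aggregations cannot simply be added, because the same value may appear in more than one partition. In the sketch below, the hypothetical CombineSketch class stands in for the aggregator (it is not the Hadoop class, and the merge helper is invented for illustration). Each partial instance emits its unique values, and a final instance deduplicates their union.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in that mirrors the method names documented on this page.
class CombineSketch {
    private final TreeSet<String> items = new TreeSet<>();

    void addNextValue(Object val) {
        items.add(val.toString());
    }

    String getReport() {
        return String.valueOf(items.size());
    }

    // Emits the unique values themselves so a later stage can dedupe again.
    ArrayList<Object> getCombinerOutput() {
        return new ArrayList<>(items);
    }

    // Illustrative helper: feeds every partial output into a fresh aggregator.
    static CombineSketch merge(List<CombineSketch> partials) {
        CombineSketch merged = new CombineSketch();
        for (CombineSketch partial : partials) {
            for (Object v : partial.getCombinerOutput()) {
                merged.addNextValue(v);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        CombineSketch a = new CombineSketch();
        a.addNextValue("x");
        a.addNextValue("y");

        CombineSketch b = new CombineSketch();
        b.addNextValue("y");
        b.addNextValue("z");

        // "y" occurs in both partitions, so the merged count is 3, not 2 + 2.
        System.out.println(merge(Arrays.asList(a, b)).getReport()); // prints 3
    }
}
```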


Copyright © 2014 Apache Software Foundation. All Rights Reserved.