org.apache.hadoop.mapred.lib.aggregate
Class ValueAggregatorJob

java.lang.Object
  extended by org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorJob

@InterfaceAudience.Public
@InterfaceStability.Stable
public class ValueAggregatorJob
extends Object

This is the main class for creating a map/reduce job using the Aggregate framework. Aggregate is a specialization of the map/reduce framework for performing various simple aggregations. Generally speaking, in order to implement an application using the Map/Reduce model, the developer implements Map and Reduce functions (and possibly a combine function). However, a lot of applications related to counting and statistics computation have very similar characteristics. Aggregate abstracts out the general patterns of these functions and implements those patterns. In particular, the package provides generic mapper/reducer/combiner classes, a set of built-in value aggregators, and a generic utility class that helps users create map/reduce jobs using the generic classes. The built-in aggregators include:

    sum over numeric values
    count of the number of distinct values
    histogram of values
    minimum, maximum, median, average, and standard deviation of numeric values

The developer using Aggregate need only provide a plugin class conforming to the following interface:

    public interface ValueAggregatorDescriptor {
      public ArrayList generateKeyValPairs(Object key, Object value);
      public void configure(JobConf job);
    }

The package also provides a base class, ValueAggregatorBaseDescriptor, implementing the above interface. The user can extend the base class and implement generateKeyValPairs accordingly. The primary work of generateKeyValPairs is to emit one or more key/value pairs based on the input key/value pair. The key in an output key/value pair encodes two pieces of information: the aggregation type and the aggregation id. The value is aggregated onto the aggregation id according to the aggregation type.

This class offers a function to generate a map/reduce job using the Aggregate framework. The function takes the following parameters:

    input directory spec
    input format (text or sequence file)
    output directory
    a file specifying the user plugin class
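As an illustration, a minimal plugin can extend ValueAggregatorBaseDescriptor and emit one "LongValueSum" entry per word, so the framework sums the counts for each word. The sketch below assumes the generateEntry helper and LONG_VALUE_SUM constant inherited from ValueAggregatorBaseDescriptor; the class name WordCountDescriptor is illustrative only, not part of the framework.

    import java.util.ArrayList;
    import java.util.Map.Entry;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorBaseDescriptor;

    // Hypothetical plugin: counts occurrences of each word in the input value.
    public class WordCountDescriptor extends ValueAggregatorBaseDescriptor {

      private static final Text ONE = new Text("1");

      public ArrayList<Entry<Text, Text>> generateKeyValPairs(Object key, Object value) {
        ArrayList<Entry<Text, Text>> pairs = new ArrayList<Entry<Text, Text>>();
        for (String word : value.toString().split("\\s+")) {
          // generateEntry builds a key encoding "type:id"; the LongValueSum
          // aggregator then sums the emitted "1" values for each word id.
          pairs.add(generateEntry(LONG_VALUE_SUM, word, ONE));
        }
        return pairs;
      }

      public void configure(JobConf job) {
        super.configure(job);
      }
    }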


Constructor Summary
ValueAggregatorJob()
           
 
Method Summary
static JobConf createValueAggregatorJob(String[] args)
          Create an Aggregate based map/reduce job.
static JobConf createValueAggregatorJob(String[] args, Class<?> caller)
          Create an Aggregate based map/reduce job.
static JobConf createValueAggregatorJob(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
static JobConf createValueAggregatorJob(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors, Class<?> caller)
           
static JobControl createValueAggregatorJobs(String[] args)
           
static JobControl createValueAggregatorJobs(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
static void main(String[] args)
          Create and run an Aggregate based map/reduce job.
static void setAggregatorDescriptors(JobConf job, Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ValueAggregatorJob

public ValueAggregatorJob()
Method Detail

createValueAggregatorJobs

public static JobControl createValueAggregatorJobs(String[] args,
                                                   Class<? extends ValueAggregatorDescriptor>[] descriptors)
                                            throws IOException
Throws:
IOException

createValueAggregatorJobs

public static JobControl createValueAggregatorJobs(String[] args)
                                            throws IOException
Throws:
IOException
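
The JobControl returned by these methods can be driven from a separate thread. The sketch below (the driver class name is illustrative) builds the control from the command-line arguments and waits until all contained jobs finish, assuming the standard JobControl methods allFinished() and stop().

    import org.apache.hadoop.mapred.jobcontrol.JobControl;
    import org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorJob;

    // Illustrative driver: run the aggregate job chain via JobControl.
    public class AggregateJobsDriver {
      public static void main(String[] args) throws Exception {
        JobControl control = ValueAggregatorJob.createValueAggregatorJobs(args);

        // JobControl implements Runnable; execute it on its own thread
        // and poll until every job in the control has completed.
        Thread runner = new Thread(control);
        runner.start();
        while (!control.allFinished()) {
          Thread.sleep(1000);
        }
        control.stop();
      }
    }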

createValueAggregatorJob

public static JobConf createValueAggregatorJob(String[] args,
                                               Class<?> caller)
                                        throws IOException
Create an Aggregate based map/reduce job.

Parameters:
args - the arguments used for job creation. Generic Hadoop arguments are accepted.
caller - the caller class.
Returns:
a JobConf object ready for submission.
Throws:
IOException
See Also:
GenericOptionsParser

createValueAggregatorJob

public static JobConf createValueAggregatorJob(String[] args)
                                        throws IOException
Create an Aggregate based map/reduce job.

Parameters:
args - the arguments used for job creation. Generic Hadoop arguments are accepted.
Returns:
a JobConf object ready for submission.
Throws:
IOException
See Also:
GenericOptionsParser
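
As a usage sketch (the class name SubmitAggregateJob is illustrative), the returned JobConf can be submitted directly with JobClient.runJob:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorJob;

    // Illustrative submitter: build the aggregate job from the command-line
    // arguments and block until it completes.
    public class SubmitAggregateJob {
      public static void main(String[] args) throws Exception {
        JobConf job = ValueAggregatorJob.createValueAggregatorJob(args);
        JobClient.runJob(job);
      }
    }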

createValueAggregatorJob

public static JobConf createValueAggregatorJob(String[] args,
                                               Class<? extends ValueAggregatorDescriptor>[] descriptors)
                                        throws IOException
Throws:
IOException

setAggregatorDescriptors

public static void setAggregatorDescriptors(JobConf job,
                                            Class<? extends ValueAggregatorDescriptor>[] descriptors)
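Descriptor classes can also be registered on an existing JobConf. The snippet below passes the hypothetical WordCountDescriptor sketched earlier; the raw Class array creation (with an unchecked warning suppressed) is only needed because the parameter is an array of class objects.

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorDescriptor;
    import org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorJob;

    // Illustrative helper: register a single descriptor class on a JobConf.
    public class RegisterDescriptor {
      @SuppressWarnings("unchecked")
      public static void configure(JobConf job) {
        Class<? extends ValueAggregatorDescriptor>[] descriptors =
            new Class[] { WordCountDescriptor.class };
        ValueAggregatorJob.setAggregatorDescriptors(job, descriptors);
      }
    }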

createValueAggregatorJob

public static JobConf createValueAggregatorJob(String[] args,
                                               Class<? extends ValueAggregatorDescriptor>[] descriptors,
                                               Class<?> caller)
                                        throws IOException
Throws:
IOException

main

public static void main(String[] args)
                 throws IOException
Create and run an Aggregate based map/reduce job.

Parameters:
args - the arguments used for job creation
Throws:
IOException


Copyright © 2014 Apache Software Foundation. All Rights Reserved.