org.apache.hadoop.mapreduce.lib.aggregate
Class ValueAggregatorJob

java.lang.Object
  extended by org.apache.hadoop.mapreduce.lib.aggregate.ValueAggregatorJob

@InterfaceAudience.Public
@InterfaceStability.Stable
public class ValueAggregatorJob
extends Object

This is the main class for creating a map/reduce job using the Aggregate framework. Aggregate is a specialization of the map/reduce framework for performing various simple aggregations.

Generally speaking, to implement an application using the Map/Reduce model, the developer implements Map and Reduce functions (and possibly a combine function). However, many applications related to counting and statistics computing have very similar characteristics. Aggregate abstracts out the general patterns of these functions and implements those patterns. In particular, the package provides generic mapper/reducer/combiner classes, a set of built-in value aggregators, and a generic utility class that helps the user create map/reduce jobs using the generic classes.

The built-in aggregators include:
  - sum over numeric values
  - count the number of distinct values
  - compute the histogram of values
  - compute the minimum, maximum, median, average, and standard deviation of numeric values

The developer using Aggregate need only provide a plugin class conforming to the following interface:

    public interface ValueAggregatorDescriptor {
      public ArrayList generateKeyValPairs(Object key, Object value);
      public void configure(Configuration conf);
    }

The package also provides a base class, ValueAggregatorBaseDescriptor, implementing the above interface. The user can extend the base class and implement generateKeyValPairs accordingly.

The primary work of generateKeyValPairs is to emit one or more key/value pairs based on the input key/value pair. The key in an output key/value pair encodes two pieces of information: aggregation type and aggregation id. The value will be aggregated onto the aggregation id according to the aggregation type.

This class offers a function to generate a map/reduce job using the Aggregate framework. The function takes the following parameters:
  - input directory spec
  - input format (text or sequence file)
  - output directory
  - a file specifying the user plugin class
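The key encoding described above (aggregation type plus aggregation id) may be easier to follow with a small stand-alone sketch. The code below does not use Hadoop at all: the class name AggregateKeySketch, the ":" separator, and the type names "LongValueSum" and "UniqValueCount" are assumptions chosen for illustration, and the aggregate method is only a simplified stand-in for what the generic reducer does.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.TreeMap;

public class AggregateKeySketch {
    // Assumed separator between aggregation type and aggregation id.
    public static final String TYPE_SEPARATOR = ":";

    // Stand-in for generateKeyValPairs: one input record yields several
    // pairs whose keys have the form "<aggregation type>:<aggregation id>".
    public static List<Entry<String, String>> generateKeyValPairs(Object key, Object value) {
        List<Entry<String, String>> out = new ArrayList<>();
        for (String word : value.toString().trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            // Add 1 to the running sum kept under the id <word>.
            out.add(new SimpleEntry<>("LongValueSum" + TYPE_SEPARATOR + word, "1"));
            // Record <word> as a value seen under the id "distinct_words".
            out.add(new SimpleEntry<>("UniqValueCount" + TYPE_SEPARATOR + "distinct_words", word));
        }
        return out;
    }

    // Simplified stand-in for the generic reducer: split each key into type
    // and id, then aggregate the value according to the type.
    public static Map<String, Long> aggregate(List<Entry<String, String>> pairs) {
        Map<String, Long> sums = new TreeMap<>();
        Map<String, Set<String>> uniques = new TreeMap<>();
        for (Entry<String, String> pair : pairs) {
            String fullKey = pair.getKey();
            String type = fullKey.substring(0, fullKey.indexOf(TYPE_SEPARATOR));
            switch (type) {
                case "LongValueSum":
                    sums.merge(fullKey, Long.parseLong(pair.getValue()), Long::sum);
                    break;
                case "UniqValueCount":
                    uniques.computeIfAbsent(fullKey, k -> new HashSet<>()).add(pair.getValue());
                    break;
                default:
                    throw new IllegalArgumentException("unknown aggregation type: " + type);
            }
        }
        // Results stay keyed by the full "<type>:<id>" string.
        Map<String, Long> result = new TreeMap<>(sums);
        uniques.forEach((k, seen) -> result.put(k, (long) seen.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Entry<String, String>> pairs = generateKeyValPairs(0L, "to be or not to be");
        // prints {LongValueSum:be=2, LongValueSum:not=1, LongValueSum:or=1,
        //         LongValueSum:to=2, UniqValueCount:distinct_words=4} on one line
        System.out.println(aggregate(pairs));
    }
}
```

In this sketch a single input line produces two pairs per word, so one pass over the data feeds both a per-word count and a distinct-word count. A real plugin would instead extend ValueAggregatorBaseDescriptor and leave the aggregation to the framework.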


Constructor Summary
ValueAggregatorJob()
           
 
Method Summary
static Job createValueAggregatorJob(Configuration conf, String[] args)
          Create an Aggregate-based map/reduce job.
static Job createValueAggregatorJob(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
static JobControl createValueAggregatorJobs(String[] args)
           
static JobControl createValueAggregatorJobs(String[] args, Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
static void main(String[] args)
          Create and run an Aggregate-based map/reduce job.
static Configuration setAggregatorDescriptors(Class<? extends ValueAggregatorDescriptor>[] descriptors)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ValueAggregatorJob

public ValueAggregatorJob()

Method Detail

createValueAggregatorJobs

public static JobControl createValueAggregatorJobs(String[] args,
                                                   Class<? extends ValueAggregatorDescriptor>[] descriptors)
                                            throws IOException
Throws:
IOException

createValueAggregatorJobs

public static JobControl createValueAggregatorJobs(String[] args)
                                            throws IOException
Throws:
IOException

createValueAggregatorJob

public static Job createValueAggregatorJob(Configuration conf,
                                           String[] args)
                                    throws IOException
Create an Aggregate-based map/reduce job.

Parameters:
conf - the configuration for the job
args - the arguments used for job creation. Generic Hadoop arguments are accepted.
Returns:
a Job object ready for submission.
Throws:
IOException
See Also:
GenericOptionsParser

createValueAggregatorJob

public static Job createValueAggregatorJob(String[] args,
                                           Class<? extends ValueAggregatorDescriptor>[] descriptors)
                                    throws IOException
Throws:
IOException

setAggregatorDescriptors

public static Configuration setAggregatorDescriptors(Class<? extends ValueAggregatorDescriptor>[] descriptors)

main

public static void main(String[] args)
                 throws IOException,
                        InterruptedException,
                        ClassNotFoundException
Create and run an Aggregate-based map/reduce job.

Parameters:
args - the arguments used for job creation
Throws:
IOException
InterruptedException
ClassNotFoundException


Copyright © 2014 Apache Software Foundation. All Rights Reserved.