org.apache.hadoop.mapreduce
Class Partitioner<KEY,VALUE>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Partitioner<KEY,VALUE>
Direct Known Subclasses:
BinaryPartitioner, HashPartitioner, KeyFieldBasedPartitioner, SecondarySort.FirstPartitioner, TotalOrderPartitioner

public abstract class Partitioner<KEY,VALUE>
extends Object

Partitions the key space.

Partitioner controls how the keys of the intermediate map outputs are partitioned. The key (or a subset of the key) is used to derive the partition, typically by a hash function. The total number of partitions equals the number of reduce tasks for the job, so the partitioner determines which of the m reduce tasks each intermediate key (and hence each record) is sent to for reduction.
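
A minimal sketch of hash-based partitioning, as described above. So that it runs without Hadoop on the classpath, it declares a local stand-in abstract class that only mirrors the getPartition signature documented on this page; in a real job you would extend org.apache.hadoop.mapreduce.Partitioner instead. The names HashPartitionSketch and HashSketchPartitioner are hypothetical, chosen for illustration.

```java
public class HashPartitionSketch {

    // Stand-in for org.apache.hadoop.mapreduce.Partitioner (assumption:
    // only the documented abstract method is mirrored here).
    abstract static class Partitioner<KEY, VALUE> {
        public abstract int getPartition(KEY key, VALUE value, int numPartitions);
    }

    // Derives the partition from the hash of the whole key.
    static class HashSketchPartitioner<K, V> extends Partitioner<K, V> {
        @Override
        public int getPartition(K key, V value, int numPartitions) {
            // Clear the sign bit so the result is never negative,
            // then reduce into the range [0, numPartitions).
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) {
        HashSketchPartitioner<String, Integer> p = new HashSketchPartitioner<>();
        int numReduceTasks = 4; // one partition per reduce task
        for (String key : new String[] {"apple", "banana", "apple"}) {
            System.out.println(key + " -> partition " + p.getPartition(key, 1, numReduceTasks));
        }
    }
}
```

Because the result depends only on the key, every record with the same key lands in the same partition, and therefore at the same reduce task, regardless of its value.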

See Also:
Reducer

Constructor Summary
Partitioner()
           
 
Method Summary
abstract  int getPartition(KEY key, VALUE value, int numPartitions)
          Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Partitioner

public Partitioner()
Method Detail

getPartition

public abstract int getPartition(KEY key,
                                 VALUE value,
                                 int numPartitions)
Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.

Typically a hash function on all or a subset of the key.

Parameters:
key - the key to be partitioned.
value - the entry value.
numPartitions - the total number of partitions.
Returns:
the partition number for the key.
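
A hedged sketch of partitioning on a subset of the key. It assumes a hypothetical composite String key of the form "id:timestamp" and hashes only the part before the first ':' so that all records sharing an id reach the same reduce task. The class names and the key layout are illustrative assumptions, and the local Partitioner class is a stand-in that mirrors only the signature documented above so the example runs without Hadoop.

```java
public class FirstFieldPartitionSketch {

    // Stand-in for org.apache.hadoop.mapreduce.Partitioner (assumption:
    // only the documented abstract method is mirrored here).
    abstract static class Partitioner<KEY, VALUE> {
        public abstract int getPartition(KEY key, VALUE value, int numPartitions);
    }

    // Partitions on the first field of a "first:rest" key (hypothetical layout).
    static class FirstFieldPartitioner extends Partitioner<String, String> {
        @Override
        public int getPartition(String key, String value, int numPartitions) {
            int sep = key.indexOf(':');
            // Fall back to the whole key when no separator is present.
            String first = (sep < 0) ? key : key.substring(0, sep);
            // Mask the sign bit so the result stays in [0, numPartitions).
            return (first.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) {
        FirstFieldPartitioner p = new FirstFieldPartitioner();
        // Both keys share the first field "user42", so both lines print the same partition.
        System.out.println(p.getPartition("user42:2009-01-01", "click", 8));
        System.out.println(p.getPartition("user42:2009-06-30", "view", 8));
    }
}
```

The value parameter is ignored here, as in most key-based partitioners; the returned number is kept within 0 to numPartitions - 1 because each partition corresponds to one reduce task.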


Copyright © 2009 The Apache Software Foundation