Package org.apache.hadoop.mapreduce
Class Partitioner<KEY,VALUE>
java.lang.Object
org.apache.hadoop.mapreduce.Partitioner<KEY,VALUE>
- Direct Known Subclasses:
BinaryPartitioner, HashPartitioner, KeyFieldBasedPartitioner, RehashPartitioner, TotalOrderPartitioner
Partitions the key space.
Partitioner controls the partitioning of the keys of the
intermediate map-outputs. The key (or a subset of the key) is used to derive
the partition, typically by a hash function. The total number of partitions
is the same as the number of reduce tasks for the job. Hence this controls
which of the m reduce tasks the intermediate key (and hence the
record) is sent to for reduction.
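The hash-based derivation described above can be sketched as follows. This is a minimal illustration, not the library source: PartitionerStandIn is a local stand-in for org.apache.hadoop.mapreduce.Partitioner (so the block compiles without Hadoop on the classpath), and HashPartitionerSketch is a hypothetical name.

```java
// Local stand-in (assumption) mirroring the documented abstract method of
// org.apache.hadoop.mapreduce.Partitioner, so the sketch needs no Hadoop jars.
abstract class PartitionerStandIn<KEY, VALUE> {
    abstract int getPartition(KEY key, VALUE value, int numPartitions);
}

// Hypothetical hash-based partitioner: the whole key decides the partition.
class HashPartitionerSketch<KEY, VALUE> extends PartitionerStandIn<KEY, VALUE> {
    @Override
    int getPartition(KEY key, VALUE value, int numPartitions) {
        // Clear the sign bit so a negative hashCode() cannot yield a negative
        // partition, then map the result into the range [0, numPartitions).
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Because the partition depends only on the key, every record with an equal key lands in the same partition and therefore reaches the same reduce task.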
Note: A Partitioner is created only when there are multiple
reducers.
Note: If you require your Partitioner class to obtain the Job's
configuration object, implement the Configurable interface.
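A configuration-aware partitioner might look like the sketch below. Everything in it is an assumption made for illustration: the stand-in types replace org.apache.hadoop.conf.Configurable, org.apache.hadoop.conf.Configuration (modelled here as a plain Map) and org.apache.hadoop.mapreduce.Partitioner so that the block compiles on its own, and the property name example.partitioner.prefix.length is invented.

```java
import java.util.Map;

// Stand-in (assumption) for org.apache.hadoop.conf.Configurable; the real
// interface takes and returns a Configuration rather than a Map.
interface ConfigurableStandIn {
    void setConf(Map<String, String> conf);
    Map<String, String> getConf();
}

// Stand-in (assumption) for org.apache.hadoop.mapreduce.Partitioner.
abstract class ConfPartitionerStandIn<KEY, VALUE> {
    abstract int getPartition(KEY key, VALUE value, int numPartitions);
}

// Hypothetical partitioner that reads a prefix length from the configuration
// and partitions on that many leading characters of the key.
class PrefixLengthPartitionerSketch extends ConfPartitionerStandIn<String, String>
        implements ConfigurableStandIn {
    // Invented property name, used only in this sketch.
    static final String PREFIX_LENGTH_KEY = "example.partitioner.prefix.length";

    private Map<String, String> conf;
    private int prefixLength = 1;

    @Override
    public void setConf(Map<String, String> conf) {
        this.conf = conf;
        // Read the setting once, when the configuration is handed over.
        prefixLength = Integer.parseInt(conf.getOrDefault(PREFIX_LENGTH_KEY, "1"));
    }

    @Override
    public Map<String, String> getConf() {
        return conf;
    }

    @Override
    int getPartition(String key, String value, int numPartitions) {
        // Only the configured prefix of the key takes part in the hash.
        String prefix = key.substring(0, Math.min(prefixLength, key.length()));
        return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

The design point is that the setting is parsed once in setConf rather than on every getPartition call, since getPartition runs for each intermediate record.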
Constructor Summary
Constructors -
Method Summary
Modifier and Type: abstract int
Method: getPartition(KEY key, VALUE value, int numPartitions)
Description: Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.
-
Constructor Details
-
Partitioner
public Partitioner()
-
-
Method Details
-
getPartition
Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job. Typically a hash function on all or a subset of the key.
- Parameters:
key - the key to be partitioned.
value - the entry value.
numPartitions - the total number of partitions.
- Returns:
the partition number for the key.
-
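To illustrate hashing on a subset of the key, the sketch below partitions on the first tab-separated field of a composite string key. It is an assumption-laden example: KeySubsetPartitionerStandIn is a local stand-in for org.apache.hadoop.mapreduce.Partitioner, FirstFieldPartitionerSketch is a hypothetical name, and the tab-separated key layout is invented for the example.

```java
// Local stand-in (assumption) for org.apache.hadoop.mapreduce.Partitioner,
// declared here so the sketch compiles without Hadoop on the classpath.
abstract class KeySubsetPartitionerStandIn<KEY, VALUE> {
    abstract int getPartition(KEY key, VALUE value, int numPartitions);
}

// Hypothetical partitioner: keys look like "field1<TAB>field2", and only
// field1 decides which partition the record goes to.
class FirstFieldPartitionerSketch extends KeySubsetPartitionerStandIn<String, String> {
    @Override
    int getPartition(String key, String value, int numPartitions) {
        int sep = key.indexOf('\t');
        // Use the whole key when no separator is present.
        String field = (sep < 0) ? key : key.substring(0, sep);
        // Mask the sign bit, then reduce into [0, numPartitions).
        return (field.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

With this scheme, keys such as "user1\t2020" and "user1\t2021" share a partition, so a single reduce task sees all records for "user1".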