Interface Partitioner<K2,V2>

All Superinterfaces:
JobConfigurable
All Known Implementing Classes:
BinaryPartitioner, HashPartitioner, KeyFieldBasedPartitioner, TotalOrderPartitioner

@Public @Stable public interface Partitioner<K2,V2> extends JobConfigurable
Partitions the key space.

Partitioner controls the partitioning of the keys of the intermediate map-outputs. The key (or a subset of the key) is used to derive the partition, typically by a hash function. The total number of partitions is the same as the number of reduce tasks for the job. Hence this controls which of the m reduce tasks the intermediate key (and hence the record) is sent to for reduction.

Note: A Partitioner is created only when there are multiple reducers.
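
As a rough illustration of the hash-based scheme described above, the sketch below masks the sign bit of the key's hash code and takes the remainder modulo the number of partitions, so every key maps to a value in the range 0 to numPartitions - 1. The class and method names (HashPartitionSketch, partitionFor) are hypothetical and are not part of the Hadoop API; the sketch assumes numPartitions is positive.

```java
// Hypothetical helper, not part of Hadoop: shows hash-based partitioning arithmetic.
final class HashPartitionSketch {

    private HashPartitionSketch() {
    }

    // Maps a key to a partition index in [0, numPartitions).
    // Masking with Integer.MAX_VALUE clears the sign bit, so a negative
    // hashCode() cannot produce a negative partition number.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Equal keys always produce the same index, which is what guarantees that all records sharing a key reach the same reduce task.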

  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getPartition(K2 key, V2 value, int numPartitions)
    Get the partition number for a given key (and hence record), given the total number of partitions, i.e. the number of reduce tasks for the job.

    Methods inherited from interface org.apache.hadoop.mapred.JobConfigurable

    configure
  • Method Details

    • getPartition

      int getPartition(K2 key, V2 value, int numPartitions)
      Get the partition number for a given key (and hence record), given the total number of partitions, i.e. the number of reduce tasks for the job.

      Typically a hash function on all or a subset of the key.

      Parameters:
      key - the key to be partitioned.
      value - the entry value.
      numPartitions - the total number of partitions.
      Returns:
      the partition number for the key.
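
A minimal sketch of partitioning on a subset of the key, under stated assumptions: keys are tab-separated strings and only the first field decides the partition. So that the example compiles without Hadoop on the classpath, it uses a hypothetical stand-in interface (PartitionerSketch) that mirrors only the getPartition signature documented above; FirstFieldPartitioner is likewise a hypothetical name. A real implementation would implement org.apache.hadoop.mapred.Partitioner instead and would also need to provide the configure method inherited from JobConfigurable.

```java
// Hypothetical stand-in that mirrors the documented getPartition signature.
// A real job would implement org.apache.hadoop.mapred.Partitioner instead.
interface PartitionerSketch<K2, V2> {
    int getPartition(K2 key, V2 value, int numPartitions);
}

// Partitions on the text before the first tab in the key (assumed key layout),
// so keys that share a first field go to the same partition.
final class FirstFieldPartitioner implements PartitionerSketch<String, String> {

    @Override
    public int getPartition(String key, String value, int numPartitions) {
        int tab = key.indexOf('\t');
        // Use the whole key when it contains no tab.
        String field = (tab >= 0) ? key.substring(0, tab) : key;
        // Clear the sign bit so the result is never negative.
        return (field.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

The value parameter is ignored here because only the key determines the partition; the returned index is always between 0 and numPartitions - 1.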