Class KeyFieldBasedPartitioner<K2,V2>

java.lang.Object
org.apache.hadoop.mapreduce.Partitioner<K2,V2>
org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner<K2,V2>
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
KeyFieldBasedPartitioner

@Public @Stable public class KeyFieldBasedPartitioner<K2,V2> extends Partitioner<K2,V2> implements Configurable
Defines a way to partition keys based on certain key fields (also see KeyFieldBasedComparator). The key specification supported is of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
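The numbering rules above are easy to get wrong by one, so here is a minimal sketch of which characters a spec selects. It is plain Java written for illustration and is not the class's actual implementation: KeySpecSketch, select and parsePos are hypothetical names, the field separator is passed in explicitly, [opts] is not handled, and both pos1 and pos2 are required.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not Hadoop's implementation. */
class KeySpecSketch {

    /** Returns the part of key selected by a spec such as "-k2,3" or "-k2.2,2.3". */
    static String select(String key, String spec, char separator) {
        // Strip the leading "-k" and split into pos1 and pos2.
        String[] pos = spec.substring(2).trim().split(",");
        if (pos.length != 2) {
            throw new IllegalArgumentException("this sketch requires both pos1 and pos2");
        }
        int[] p1 = parsePos(pos[0], 1); // '.c' omitted from pos1 -> 1 (start of field)
        int[] p2 = parsePos(pos[1], 0); // '.c' omitted from pos2 -> 0 (end of field)

        // Record where each field starts and ends (end exclusive) in the key.
        List<int[]> fields = new ArrayList<>();
        int fieldStart = 0;
        for (int i = 0; i <= key.length(); i++) {
            if (i == key.length() || key.charAt(i) == separator) {
                fields.add(new int[] {fieldStart, i});
                fieldStart = i + 1;
            }
        }

        int[] first = fields.get(p1[0] - 1); // fields are numbered from 1
        int[] last = fields.get(p2[0] - 1);
        int begin = first[0] + p1[1] - 1;    // characters are numbered from 1
        // A character position of 0 in pos2 means the field's last character.
        int end = (p2[1] == 0) ? last[1] : last[0] + p2[1];
        return key.substring(begin, end);
    }

    /** Parses "f" or "f.c" into {f, c}, using defaultChar when ".c" is absent. */
    private static int[] parsePos(String pos, int defaultChar) {
        String[] parts = pos.trim().split("\\.");
        int field = Integer.parseInt(parts[0]);
        int ch = parts.length > 1 ? Integer.parseInt(parts[1]) : defaultChar;
        return new int[] {field, ch};
    }
}
```

With a tab-separated key of three fields, "-k2,3" selects everything from the start of field 2 to the end of field 3 (including the separator between them), while "-k2.2,2.3" selects only characters 2 through 3 of field 2.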
  • Field Details

    • PARTITIONER_OPTIONS

      public static String PARTITIONER_OPTIONS
  • Constructor Details

    • KeyFieldBasedPartitioner

      public KeyFieldBasedPartitioner()
  • Method Details

    • setConf

      public void setConf(Configuration conf)
      Description copied from interface: Configurable
      Set the configuration to be used by this object.
      Specified by:
      setConf in interface Configurable
      Parameters:
      conf - configuration to be used
    • getConf

      public Configuration getConf()
      Description copied from interface: Configurable
      Return the configuration used by this object.
      Specified by:
      getConf in interface Configurable
      Returns:
      Configuration
    • getPartition

      public int getPartition(K2 key, V2 value, int numReduceTasks)
      Description copied from class: Partitioner
      Get the partition number for a given key (hence record) given the total number of partitions, i.e. the number of reduce tasks for the job.

      Typically a hash function on all or a subset of the key.

      Specified by:
      getPartition in class Partitioner<K2,V2>
      Parameters:
      key - the key to be partitioned.
      value - the entry value.
      numReduceTasks - the total number of partitions.
      Returns:
      the partition number for the key.
    • hashCode

      protected int hashCode(byte[] b, int start, int end, int currentHash)
    • getPartition

      protected int getPartition(int hash, int numReduceTasks)
    • setKeyFieldPartitionerOptions

      public void setKeyFieldPartitionerOptions(Job job, String keySpec)
      Set the KeyFieldBasedPartitioner options used by the Partitioner for the given job.
      Parameters:
      keySpec - the key specification of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field).
    • getKeyFieldPartitionerOption

      public String getKeyFieldPartitionerOption(JobContext job)
      Get the KeyFieldBasedPartitioner options.
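
The two protected helpers listed above, hashCode(byte[], int, int, int) and getPartition(int, int), carry no description on this page. The sketch below shows one plausible shape for them, matching the "hash function on all or a subset of the key" wording: fold a byte range into a running hash, then map the hash to a partition. The inclusive end index, the multiplier 31 and the sign-bit masking are assumptions made for illustration, and PartitionHashSketch is a hypothetical stand-alone class, not the library source.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: one plausible shape for the two protected helpers. */
class PartitionHashSketch {

    /** Folds bytes b[start..end] (end inclusive, an assumption) into a running hash. */
    static int hashCode(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    /** Maps any int hash, including negative ones, into [0, numReduceTasks). */
    static int getPartition(int hash, int numReduceTasks) {
        // Clearing the sign bit keeps the remainder non-negative.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        byte[] key = "user42\tclick".getBytes(StandardCharsets.UTF_8);
        // Hash only the first field, "user42" (bytes 0 through 5).
        int hash = hashCode(key, 0, 5, 0);
        System.out.println(getPartition(hash, 4));
    }
}
```

Passing the previous result back in as currentHash lets several selected key ranges contribute to a single hash, so records that agree on those ranges land in the same partition regardless of the rest of the key.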