org.apache.hadoop.mapreduce.lib.partition
Class BinaryPartitioner<V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Partitioner<BinaryComparable,V>
      extended by org.apache.hadoop.mapreduce.lib.partition.BinaryPartitioner<V>
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
BinaryPartitioner

@InterfaceAudience.Public
@InterfaceStability.Evolving
public class BinaryPartitioner<V>
extends Partitioner<BinaryComparable,V>
implements Configurable

Partitions BinaryComparable keys using a configurable part of the byte array returned by BinaryComparable.getBytes().

The subarray to be used for the partitioning can be defined by means of the following properties:

As in Python, both negative and positive offsets are allowed, but the meaning is slightly different. For an array of length 5, for instance, the possible offsets are:

  +---+---+---+---+---+
  | B | B | B | B | B |
  +---+---+---+---+---+
    0   1   2   3   4
   -5  -4  -3  -2  -1
 
The first row of numbers gives the position of the offsets 0...4 in the array; the second row gives the corresponding negative offsets. Unlike in Python, the specified subarray has bytes i and j as its first and last elements, respectively, when i and j are the left and right offsets. In other words, the right offset is inclusive.
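The offset convention described above can be illustrated with a small, self-contained sketch. It does not use any Hadoop classes; the class name OffsetSliceSketch and the helpers toIndex and slice are hypothetical names introduced only for this illustration, and the index arithmetic is one plausible way to map Python-style offsets onto array indices, not necessarily what the library does internally.

```java
import java.util.Arrays;

public class OffsetSliceSketch {
    /** Maps a Python-style offset (negative values count from the end) to an array index. */
    public static int toIndex(int offset, int length) {
        // For length 5: offsets 0..4 stay unchanged, offsets -5..-1 become 0..4.
        return (offset + length) % length;
    }

    /** Returns bytes[left..right] with BOTH ends included, as described in the text. */
    public static byte[] slice(byte[] bytes, int left, int right) {
        int from = toIndex(left, bytes.length);
        int to = toIndex(right, bytes.length);
        // Arrays.copyOfRange excludes its upper bound, hence the +1.
        return Arrays.copyOfRange(bytes, from, to + 1);
    }

    public static void main(String[] args) {
        byte[] key = {10, 20, 30, 40, 50};
        System.out.println(Arrays.toString(slice(key, 1, 3)));   // [20, 30, 40]
        System.out.println(Arrays.toString(slice(key, -3, -1))); // [30, 40, 50]
        System.out.println(Arrays.toString(slice(key, 0, -1)));  // [10, 20, 30, 40, 50]
    }
}
```

The sketch performs no validation: an empty array or an offset outside the range -length...length-1 is not handled.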

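For Hadoop programs written in Java, it is advisable to use one of the following static convenience methods for setting the offsets:

  setOffsets(Configuration conf, int left, int right)
  setLeftOffset(Configuration conf, int offset)
  setRightOffset(Configuration conf, int offset)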


Constructor Summary
BinaryPartitioner()
           
 
Method Summary
 Configuration getConf()
          Return the configuration used by this object.
 int getPartition(BinaryComparable key, V value, int numPartitions)
          Use (the specified slice of the array returned by) BinaryComparable.getBytes() to partition.
 void setConf(Configuration conf)
          Set the configuration to be used by this object.
static void setLeftOffset(Configuration conf, int offset)
          Set the subarray to be used for partitioning to bytes[offset:] in Python syntax.
static void setOffsets(Configuration conf, int left, int right)
          Set the subarray to be used for partitioning to bytes[left:(right+1)] in Python syntax.
static void setRightOffset(Configuration conf, int offset)
          Set the subarray to be used for partitioning to bytes[:(offset+1)] in Python syntax.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BinaryPartitioner

public BinaryPartitioner()
Method Detail

setOffsets

public static void setOffsets(Configuration conf,
                              int left,
                              int right)
Set the subarray to be used for partitioning to bytes[left:(right+1)] in Python syntax.

Parameters:
conf - configuration object
left - left Python-style offset
right - right Python-style offset

setLeftOffset

public static void setLeftOffset(Configuration conf,
                                 int offset)
Set the subarray to be used for partitioning to bytes[offset:] in Python syntax.

Parameters:
conf - configuration object
offset - left Python-style offset

setRightOffset

public static void setRightOffset(Configuration conf,
                                  int offset)
Set the subarray to be used for partitioning to bytes[:(offset+1)] in Python syntax.

Parameters:
conf - configuration object
offset - right Python-style offset
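
For readers less familiar with Python slice notation, the three forms used in the descriptions above translate to Java roughly as follows. This is only a sketch restricted to non-negative offsets; the class PythonSliceSketch and its method names are hypothetical and are not part of the Hadoop API.

```java
import java.util.Arrays;

public class PythonSliceSketch {
    /** bytes[offset:] : everything from the left offset to the end of the array. */
    public static byte[] fromLeft(byte[] bytes, int offset) {
        return Arrays.copyOfRange(bytes, offset, bytes.length);
    }

    /** bytes[left:(right+1)] : from left to right, with the byte at right included. */
    public static byte[] between(byte[] bytes, int left, int right) {
        return Arrays.copyOfRange(bytes, left, right + 1);
    }

    /** bytes[:(offset+1)] : from the start up to and including the right offset. */
    public static byte[] upToRight(byte[] bytes, int offset) {
        return Arrays.copyOfRange(bytes, 0, offset + 1);
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(fromLeft(key, 2)));   // [3, 4, 5]
        System.out.println(Arrays.toString(between(key, 1, 3))); // [2, 3, 4]
        System.out.println(Arrays.toString(upToRight(key, 2)));  // [1, 2, 3]
    }
}
```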

setConf

public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.

Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.

Specified by:
getConf in interface Configurable

getPartition

public int getPartition(BinaryComparable key,
                        V value,
                        int numPartitions)
Use (the specified slice of the array returned by) BinaryComparable.getBytes() to partition.

Specified by:
getPartition in class Partitioner<BinaryComparable,V>
Parameters:
key - the key to be partitioned.
value - the entry value.
numPartitions - the total number of partitions.
Returns:
the partition number for the key.
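
To make the idea of partitioning on a slice concrete, here is a minimal, self-contained sketch that hashes only the selected bytes and reduces the hash to a partition number. It is an illustration under assumptions, not the library's implementation: the class SlicePartitionSketch, the method partitionFor, and the choice of a simple polynomial hash are all hypothetical, and the offsets are passed in directly instead of being read from a Configuration.

```java
import java.nio.charset.StandardCharsets;

public class SlicePartitionSketch {
    /** Picks a partition from bytes[left..right] (both ends included). */
    public static int partitionFor(byte[] bytes, int left, int right, int numPartitions) {
        int length = bytes.length;
        // Resolve Python-style offsets (negative values count from the end).
        int from = (left + length) % length;
        int to = (right + length) % length;
        // Simple polynomial hash over the selected bytes only (an assumed hash function).
        int hash = 1;
        for (int i = from; i <= to; i++) {
            hash = 31 * hash + bytes[i];
        }
        // Clear the sign bit so the remainder is never negative.
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] a = "user42:click".getBytes(StandardCharsets.US_ASCII);
        byte[] b = "user42:view".getBytes(StandardCharsets.US_ASCII);
        // Only the first six bytes ("user42") take part in the hash,
        // so both keys are assigned to the same partition.
        System.out.println(partitionFor(a, 0, 5, 4) == partitionFor(b, 0, 5, 4)); // true
    }
}
```

Restricting the hash to a key prefix in this way is what lets records that share the prefix reach the same reducer even though their full keys differ.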


Copyright © 2009 The Apache Software Foundation