org.apache.hadoop.mapred.lib
Class KeyFieldBasedPartitioner<K2,V2>
java.lang.Object
org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner<K2,V2>
- All Implemented Interfaces:
- JobConfigurable, Partitioner<K2,V2>
public class KeyFieldBasedPartitioner<K2,V2>
- extends Object
- implements Partitioner<K2,V2>
Defines a way to partition keys based on certain key fields (also see
KeyFieldBasedComparator).
The key specification supported is of the form -k pos1[,pos2], where
pos is of the form f[.c][opts]: f is the number of the key field to use,
and c is the number of the first character from the beginning of the
field. Fields and character positions are numbered starting with 1; a
character position of zero in pos2 indicates the field's last character.
If '.c' is omitted from pos1, it defaults to 1 (the beginning of the
field); if omitted from pos2, it defaults to 0 (the end of the field).
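To make the numbering rules concrete, here is a minimal sketch of how a -k pos1[,pos2] specification could select part of a key. It is not the Hadoop implementation: the class KeySpecSketch and its methods are hypothetical, the field separator is passed in explicitly, any [opts] letters are simply ignored, and it assumes that omitting pos2 means the selection runs to the end of the key.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Hadoop: applies a "-k pos1[,pos2]" spec to a key. */
class KeySpecSketch {

    /** Parses "f[.c][opts]" into {field, character}; a missing ".c" yields defaultChar. */
    static int[] parsePos(String pos, int defaultChar) {
        // Trailing option letters (the [opts] part) are dropped in this sketch.
        String digits = pos.trim().replaceAll("[A-Za-z]+$", "");
        String[] parts = digits.split("\\.");
        int field = Integer.parseInt(parts[0]);
        int ch = parts.length > 1 ? Integer.parseInt(parts[1]) : defaultChar;
        return new int[] {field, ch};
    }

    /** Returns the part of key selected by spec, with fields split on separator. */
    static String select(String key, String spec, String separator) {
        String[] fields = key.split(Pattern.quote(separator), -1);
        String body = spec.startsWith("-k") ? spec.substring(2).trim() : spec.trim();
        String[] positions = body.split(",");

        // pos1 defaults to character 1 (start of field), pos2 to 0 (end of field).
        int[] start = parsePos(positions[0], 1);
        // Assumption: with no pos2, the selection extends to the end of the key.
        int[] end = positions.length > 1
                ? parsePos(positions[1], 0)
                : new int[] {fields.length, 0};

        StringBuilder out = new StringBuilder();
        for (int f = Math.max(1, start[0]); f <= end[0] && f <= fields.length; f++) {
            String field = fields[f - 1];
            // Positions are 1-based, so character c starts at index c - 1.
            int from = (f == start[0])
                    ? Math.max(0, Math.min(start[1] - 1, field.length()))
                    : 0;
            // A character position of 0 in pos2 means "through the last character".
            int to = (f == end[0] && end[1] != 0)
                    ? Math.min(end[1], field.length())
                    : field.length();
            if (f > start[0]) {
                out.append(separator);
            }
            if (from < to) {
                out.append(field, from, to);
            }
        }
        return out.toString();
    }
}
```

For the tab-separated key "a\tbcd\tef", the spec -k2,2 selects "bcd", while -k2.2,3.1 selects from the second character of field 2 through the first character of field 3, giving "cd\te".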
Method Summary
void configure(JobConf job)
- Initializes a new instance from a JobConf.
protected int getPartition(int hash, int numReduceTasks)
int getPartition(K2 key, V2 value, int numReduceTasks)
- Get the partition number for a given key (hence record) given the total
number of partitions, i.e. the number of reduce-tasks for the job.
protected int hashCode(byte[] b, int start, int end, int currentHash)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
KeyFieldBasedPartitioner
public KeyFieldBasedPartitioner()
configure
public void configure(JobConf job)
- Description copied from interface: JobConfigurable
- Initializes a new instance from a JobConf.
- Specified by:
- configure in interface JobConfigurable
- Parameters:
- job - the configuration
getPartition
public int getPartition(K2 key,
V2 value,
int numReduceTasks)
- Description copied from interface: Partitioner
- Get the partition number for a given key (hence record) given the total
number of partitions, i.e. the number of reduce-tasks for the job.
Typically a hash function on all or a subset of the key.
- Specified by:
- getPartition in interface Partitioner<K2,V2>
- Parameters:
- key - the key to be partitioned.
- value - the entry value.
- numReduceTasks - the total number of partitions.
- Returns:
- the partition number for the key.
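As an illustration of "a hash function on all or a subset of the key", the following sketch hashes a byte range and maps the result to a partition. It is not taken from the Hadoop source: the class PartitionSketch and its method names are hypothetical, the multiplier 31 and the inclusive end index are assumptions, and the two helpers only mirror the shape of the protected hashCode and getPartition(int, int) methods listed on this page.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of hash-based partitioning; not the Hadoop source. */
class PartitionSketch {

    /** Folds bytes b[start..end] into a running hash (end inclusive, by assumption). */
    static int hashBytes(byte[] b, int start, int end, int currentHash) {
        for (int i = start; i <= end; i++) {
            currentHash = 31 * currentHash + b[i];
        }
        return currentHash;
    }

    /** Maps a hash to a partition number in the range [0, numReduceTasks). */
    static int partitionFor(int hash, int numReduceTasks) {
        // Clearing the sign bit keeps the remainder non-negative for negative hashes.
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    /** Hashes the selected part of a key and returns its partition. */
    static int partitionForKey(String selectedKeyPart, int numReduceTasks) {
        byte[] bytes = selectedKeyPart.getBytes(StandardCharsets.UTF_8);
        int hash = hashBytes(bytes, 0, bytes.length - 1, 0);
        return partitionFor(hash, numReduceTasks);
    }
}
```

Because only the selected key fields feed the hash, two records whose keys agree on those fields land in the same partition even when the rest of the key differs.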
hashCode
protected int hashCode(byte[] b,
int start,
int end,
int currentHash)
getPartition
protected int getPartition(int hash,
int numReduceTasks)
Copyright © 2009 The Apache Software Foundation