org.apache.hadoop.mapreduce.lib.partition
Class KeyFieldBasedComparator<K,V>

java.lang.Object
  extended by org.apache.hadoop.io.WritableComparator
      extended by org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedComparator<K,V>
All Implemented Interfaces:
Comparator, Configurable, RawComparator

@InterfaceAudience.Public
@InterfaceStability.Stable
public class KeyFieldBasedComparator<K,V>
extends WritableComparator
implements Configurable

This comparator implementation provides a subset of the features of the Unix/GNU sort utility. The supported features are:

-n (sort numerically)
-r (reverse the result of comparison)
-k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field). opts are ordering options (any of 'nr' as described above).

The fields in the key are assumed to be separated by the value of mapreduce.map.output.key.field.separator.
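The position rules above can be illustrated with a small plain-Java sketch. It is not the Hadoop implementation, which compares serialized bytes; it only mimics the documented f[.c] semantics on strings. The class KeySpecSketch and all of its methods are hypothetical names introduced for illustration, and the numeric comparison assumes the selected field parses as a number.

```java
// Illustrative sketch only: plain Java strings, not the Hadoop byte-level comparator.
final class KeySpecSketch {

    /**
     * Returns the part of key selected by "-k f1.c1,f2.c2".
     * Fields and characters are 1-based; c2 == 0 means "up to the end of field f2".
     */
    static String extract(String key, char sep, int f1, int c1, int f2, int c2) {
        int start = fieldStart(key, sep, f1) + (c1 - 1);
        int end = (c2 == 0)
                ? fieldEnd(key, sep, f2)
                : fieldStart(key, sep, f2) + c2; // inclusive char c2 becomes an exclusive index
        start = Math.min(start, key.length());
        end = Math.min(end, key.length());
        return start < end ? key.substring(start, end) : "";
    }

    // Index of the first character of 1-based field f (key.length() if the field is absent).
    static int fieldStart(String key, char sep, int f) {
        int pos = 0;
        for (int i = 1; i < f; i++) {
            int next = key.indexOf(sep, pos);
            if (next < 0) {
                return key.length();
            }
            pos = next + 1;
        }
        return pos;
    }

    // Index just past the last character of 1-based field f.
    static int fieldEnd(String key, char sep, int f) {
        int start = fieldStart(key, sep, f);
        int next = key.indexOf(sep, start);
        return next < 0 ? key.length() : next;
    }

    /** Compares two keys on a single whole field, like "-k field,field" plus optional n and r. */
    static int compare(String a, String b, char sep, int field, boolean numeric, boolean reverse) {
        String x = extract(a, sep, field, 1, field, 0);
        String y = extract(b, sep, field, 1, field, 0);
        // Assumption: with the n option the selected fields are valid numbers.
        int c = numeric
                ? Double.compare(Double.parseDouble(x), Double.parseDouble(y))
                : x.compareTo(y);
        return reverse ? -c : c;
    }

    public static void main(String[] args) {
        char tab = '\t';
        System.out.println(extract("us\t2009\t42", tab, 2, 1, 2, 0));            // 2009
        System.out.println(extract("us\t2009\t42", tab, 2, 2, 2, 3));            // 00
        System.out.println(compare("a\t10", "b\t9", tab, 2, true, false) > 0);   // true
        System.out.println(compare("a\t10", "b\t9", tab, 2, false, false) < 0);  // true
    }
}
```

The last two lines of main show why the n option matters: compared as text, "10" sorts before "9", while compared numerically it sorts after.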


Field Summary
static String COMPARATOR_OPTIONS
           
 
Constructor Summary
KeyFieldBasedComparator()
           
 
Method Summary
 int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)
          Optimization hook.
 Configuration getConf()
          Return the configuration used by this object.
static String getKeyFieldComparatorOption(JobContext job)
          Get the KeyFieldBasedComparator options.
 void setConf(Configuration conf)
          Set the configuration to be used by this object.
static void setKeyFieldComparatorOptions(Job job, String keySpec)
          Set the KeyFieldBasedComparator options used to compare keys.
 
Methods inherited from class org.apache.hadoop.io.WritableComparator
compare, compare, compareBytes, define, get, getKeyClass, hashBytes, hashBytes, newKey, readDouble, readFloat, readInt, readLong, readUnsignedShort, readVInt, readVLong
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.util.Comparator
equals
 

Field Detail

COMPARATOR_OPTIONS

public static String COMPARATOR_OPTIONS
Constructor Detail

KeyFieldBasedComparator

public KeyFieldBasedComparator()
Method Detail

setConf

public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.

Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.

Specified by:
getConf in interface Configurable

compare

public int compare(byte[] b1,
                   int s1,
                   int l1,
                   byte[] b2,
                   int s2,
                   int l2)
Description copied from class: WritableComparator
Optimization hook. Override this to make SequenceFile.Sorter run fast.

The default implementation reads the data into two WritableComparables (using Writable.readFields(DataInput)), then calls WritableComparator.compare(WritableComparable,WritableComparable).

Specified by:
compare in interface RawComparator
Overrides:
compare in class WritableComparator

setKeyFieldComparatorOptions

public static void setKeyFieldComparatorOptions(Job job,
                                                String keySpec)
Set the KeyFieldBasedComparator options used to compare keys.

Parameters:
keySpec - the key specification of the form -k pos1[,pos2], where pos is of the form f[.c][opts], f is the number of the key field to use, and c is the number of the first character from the beginning of the field. Fields and character positions are numbered starting with 1; a character position of zero in pos2 indicates the field's last character. If '.c' is omitted from pos1, it defaults to 1 (the beginning of the field); if omitted from pos2, it defaults to 0 (the end of the field). opts are ordering options. The supported options are: n (sort numerically) and r (reverse the result of comparison).
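
A minimal usage sketch follows, assuming the map output keys are Text values whose fields are separated by tabs. The class SortSetupSketch and its method are hypothetical helpers, and the key specification "-k2,2nr" is only an example value. The Job is passed in by the caller because how it is constructed differs between Hadoop releases.

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedComparator;

// Hypothetical helper class, shown only to illustrate the calls involved.
final class SortSetupSketch {

    /** Configures the job to sort keys on their second field, numerically, in reverse order. */
    static void sortBySecondFieldNumericDescending(Job job) {
        // Assumption: key fields are tab-separated.
        job.getConfiguration().set("mapreduce.map.output.key.field.separator", "\t");

        // Use this comparator to order keys during the sort phase.
        job.setSortComparatorClass(KeyFieldBasedComparator.class);

        // -k2,2 restricts the comparison to field 2; n = numeric, r = reversed.
        KeyFieldBasedComparator.setKeyFieldComparatorOptions(job, "-k2,2nr");
    }
}
```

This needs the Hadoop MapReduce libraries on the classpath. Registering the comparator with setSortComparatorClass is what makes the options take effect; setting the options alone does not.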

getKeyFieldComparatorOption

public static String getKeyFieldComparatorOption(JobContext job)
Get the KeyFieldBasedComparator options.



Copyright © 2009 The Apache Software Foundation