org.apache.hadoop.mapreduce.lib.fieldsel
Class FieldSelectionReducer<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Reducer<Text,Text,Text,Text>
      extended by org.apache.hadoop.mapreduce.lib.fieldsel.FieldSelectionReducer<K,V>

@InterfaceAudience.Public
@InterfaceStability.Stable
public class FieldSelectionReducer<K,V>
extends Reducer<Text,Text,Text,Text>

This class implements a reducer that performs field selection in a manner similar to the Unix cut command. The input data is treated as fields separated by a user-specified separator (the default is "\t"). The user can specify a list of fields that form the reduce output keys and a list of fields that form the reduce output values. The available fields are the union of those from the key and those from the value.

The field separator is set through the attribute "mapreduce.fieldsel.data.field.separator". The reduce output field list spec is set through the attribute "mapreduce.fieldsel.reduce.output.key.value.fields.spec". Its value is expected to have the form "keyFieldsSpec:valueFieldsSpec", where keyFieldsSpec and valueFieldsSpec are each a comma-separated list of field specs: fieldSpec,fieldSpec,fieldSpec ...

Each field spec can be a single number (e.g. 5) selecting one field, a range (e.g. 2-5) selecting a run of fields, or an open range (e.g. 3-) selecting all fields starting from field 3. An open range applies to value fields only; it has no effect on the key fields. Here is an example: "4,3,0,1:6,5,1-3,7-". It selects fields 4, 3, 0 and 1 for the keys, and fields 6, 5, 1, 2, 3, 7 and above for the values.
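The spec format described above can be illustrated with a small standalone sketch. It is plain Java with no Hadoop dependency, and it is not the class's actual implementation: the class name FieldSpecSketch and its methods are hypothetical, and the sketch assumes that the key and value are joined with the separator into one record and that field numbers past the end of the record are silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the "keyFieldsSpec:valueFieldsSpec" format.
final class FieldSpecSketch {

    // Expands one comma-separated spec into explicit field numbers.
    // Returns the start of an open range ("7-"), or -1 if there is none.
    static int parse(String spec, List<Integer> out) {
        int openFrom = -1;
        for (String part : spec.split(",")) {
            part = part.trim();
            if (part.isEmpty()) {
                continue;
            }
            int dash = part.indexOf('-');
            if (dash < 0) {                        // single field, e.g. "5"
                out.add(Integer.parseInt(part));
            } else if (dash == part.length() - 1) { // open range, e.g. "3-"
                openFrom = Integer.parseInt(part.substring(0, dash));
            } else {                               // closed range, e.g. "2-5"
                int start = Integer.parseInt(part.substring(0, dash));
                int end = Integer.parseInt(part.substring(dash + 1));
                for (int i = start; i <= end; i++) {
                    out.add(i);
                }
            }
        }
        return openFrom;
    }

    // Picks the requested fields in spec order and joins them with sep.
    // Assumption: fields that do not exist in the record are skipped.
    static String select(String[] fields, String spec, boolean allowOpenRange, String sep) {
        List<Integer> indices = new ArrayList<>();
        int openFrom = parse(spec, indices);
        List<String> picked = new ArrayList<>();
        for (int i : indices) {
            if (i < fields.length) {
                picked.add(fields[i]);
            }
        }
        // Open ranges are honored for value fields only.
        if (allowOpenRange && openFrom >= 0) {
            for (int i = openFrom; i < fields.length; i++) {
                picked.add(fields[i]);
            }
        }
        return String.join(sep, picked);
    }

    // Treats key + sep + value as one record and returns {outputKey, outputValue}.
    static String[] apply(String key, String value, String sep, String keyValueSpec) {
        String[] fields = (key + sep + value).split(Pattern.quote(sep), -1);
        String[] specs = keyValueSpec.split(":", -1);
        String outKey = select(fields, specs[0], false, sep);
        String outValue = select(fields, specs.length > 1 ? specs[1] : "", true, sep);
        return new String[] {outKey, outValue};
    }
}
```

Under these assumptions, calling FieldSpecSketch.apply with key "f0", a tab-separated value holding f1 through f8, the separator "\t" and the spec "4,3,0,1:6,5,1-3,7-" yields an output key made of f4, f3, f0, f1 and an output value made of f6, f5, f1, f2, f3, f7, f8.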


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Reducer
Reducer.Context
 
Field Summary
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
FieldSelectionReducer()
           
 
Method Summary
 void reduce(Text key, Iterable<Text> values, Reducer.Context context)
          This method is called once for each key.
 void setup(Reducer.Context context)
          Called once at the start of the task.
 
Methods inherited from class org.apache.hadoop.mapreduce.Reducer
cleanup, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

FieldSelectionReducer

public FieldSelectionReducer()
Method Detail

setup

public void setup(Reducer.Context context)
           throws IOException,
                  InterruptedException
Description copied from class: Reducer
Called once at the start of the task.

Overrides:
setup in class Reducer<Text,Text,Text,Text>
Throws:
IOException
InterruptedException

reduce

public void reduce(Text key,
                   Iterable<Text> values,
                   Reducer.Context context)
            throws IOException,
                   InterruptedException
Description copied from class: Reducer
This method is called once for each key. Most applications will define their reduce class by overriding this method. The default implementation is an identity function.

Overrides:
reduce in class Reducer<Text,Text,Text,Text>
Throws:
IOException
InterruptedException


Copyright © 2009 The Apache Software Foundation