Class FieldSelectionMapper<K,V>

  extended by org.apache.hadoop.mapreduce.Mapper<K,V,Text,Text>
      extended by org.apache.hadoop.mapreduce.lib.fieldsel.FieldSelectionMapper<K,V>

public class FieldSelectionMapper<K,V>
extends Mapper<K,V,Text,Text>

This class implements a mapper that performs field selection in a manner similar to Unix cut. The input data is treated as fields separated by a user-specified separator (the default value is "\t"). The user can specify a list of fields that form the map output keys, and a list of fields that form the map output values.

If the input format is TextInputFormat, the mapper ignores the key passed to the map function, and the fields are taken from the value only. Otherwise, the fields are the union of those from the key and those from the value.

The field separator is under attribute "". The map output field list spec is under attribute "". The value is expected to be of the form "keyFieldsSpec:valueFieldsSpec", where keyFieldsSpec and valueFieldsSpec are each a comma (,) separated list of field specs: fieldSpec,fieldSpec,fieldSpec ...

Each field spec can be a simple number (e.g. 5) specifying a single field, a range (like 2-5) specifying a range of fields, or an open range (like 3-) specifying all the fields starting from field 3. Field numbers are zero-based. An open range field spec applies to value fields only; it has no effect on the key fields.

Here is an example: "4,3,0,1:6,5,1-3,7-". It specifies fields 4, 3, 0 and 1 for keys, and fields 6, 5, 1, 2, 3, 7 and above for values.
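The spec format described above can be illustrated with a small standalone sketch. It does not use Hadoop and is not the library's implementation: the class name FieldSpecSketch, its methods, and the choice to silently skip field numbers beyond the end of the record are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the "keyFieldsSpec:valueFieldsSpec" format.
// Not part of Hadoop; all names here are assumptions.
public class FieldSpecSketch {

    /** Parsed form of one keyFieldsSpec or valueFieldsSpec. */
    public static final class ParsedSpec {
        final List<Integer> fields = new ArrayList<>();
        int openRangeStart = -1; // -1 means no open range such as "7-"
    }

    /** Parses a comma-separated list such as "6,5,1-3,7-". */
    public static ParsedSpec parse(String spec) {
        ParsedSpec parsed = new ParsedSpec();
        if (spec.isEmpty()) {
            return parsed;
        }
        for (String item : spec.split(",")) {
            String s = item.trim();
            int dash = s.indexOf('-');
            if (dash < 0) {
                // Simple number: a single field.
                parsed.fields.add(Integer.parseInt(s));
            } else if (dash == s.length() - 1) {
                // Open range such as "7-": remember where it starts.
                parsed.openRangeStart = Integer.parseInt(s.substring(0, dash));
            } else {
                // Closed range such as "1-3": expand to 1, 2, 3.
                int start = Integer.parseInt(s.substring(0, dash));
                int end = Integer.parseInt(s.substring(dash + 1));
                for (int i = start; i <= end; i++) {
                    parsed.fields.add(i);
                }
            }
        }
        return parsed;
    }

    /** Joins the selected fields; out-of-range numbers are skipped (an assumption). */
    public static String select(String[] fields, ParsedSpec spec,
                                boolean honorOpenRange, String sep) {
        List<String> out = new ArrayList<>();
        for (int idx : spec.fields) {
            if (idx < fields.length) {
                out.add(fields[idx]);
            }
        }
        if (honorOpenRange && spec.openRangeStart >= 0) {
            for (int i = spec.openRangeStart; i < fields.length; i++) {
                out.add(fields[i]);
            }
        }
        return String.join(sep, out);
    }

    /** Returns {key, value} for one record; open ranges are honored for values only. */
    public static String[] apply(String line, String sep, String keyValueSpec) {
        String[] fields = line.split(Pattern.quote(sep), -1);
        String[] parts = keyValueSpec.split(":", -1);
        ParsedSpec keySpec = parse(parts[0]);
        ParsedSpec valueSpec = parse(parts.length > 1 ? parts[1] : "");
        return new String[] {
            select(fields, keySpec, false, sep),
            select(fields, valueSpec, true, sep)
        };
    }

    public static void main(String[] args) {
        String line = "f0\tf1\tf2\tf3\tf4\tf5\tf6\tf7\tf8";
        String[] kv = apply(line, "\t", "4,3,0,1:6,5,1-3,7-");
        System.out.println("key   = " + kv[0].replace("\t", ","));   // key   = f4,f3,f0,f1
        System.out.println("value = " + kv[1].replace("\t", ","));   // value = f6,f5,f1,f2,f3,f7,f8
    }
}
```

With a nine-field record, the example spec yields fields 4, 3, 0, 1 as the key and fields 6, 5, 1, 2, 3, 7, 8 as the value. An open range placed in the key spec (for instance "3-:0") is parsed but ignored, mirroring the rule that open ranges affect value fields only.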

Field Summary
static org.apache.commons.logging.Log LOG
Constructor Summary
Method Summary
 void map(K key, V val, org.apache.hadoop.mapreduce.Mapper.Context context)
          The identity function.
 void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
          Called once at the beginning of the task.
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
cleanup, run
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail


public static final org.apache.commons.logging.Log LOG
Constructor Detail


public FieldSelectionMapper()
Method Detail


public void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
           throws IOException,
                  InterruptedException
Description copied from class: Mapper
Called once at the beginning of the task.

Overrides:
setup in class Mapper<K,V,Text,Text>


public void map(K key,
                V val,
                org.apache.hadoop.mapreduce.Mapper.Context context)
         throws IOException,
                InterruptedException
The identity function. The input key/value pair is written directly to the output.

Overrides:
map in class Mapper<K,V,Text,Text>

Copyright © 2014 Apache Software Foundation. All Rights Reserved.