Class FieldSelectionMapper<K,V>
java.lang.Object
org.apache.hadoop.mapreduce.Mapper<K,V,Text,Text>
org.apache.hadoop.mapreduce.lib.fieldsel.FieldSelectionMapper<K,V>
This class implements a mapper class that can be used to perform
field selections in a manner similar to unix cut. The input data is treated
as fields separated by a user specified separator (the default value is
"\t"). The user can specify a list of fields that form the map output keys,
and a list of fields that form the map output values. If the input format is
TextInputFormat, the mapper will ignore the key to the map function, and the
fields are from the value only. Otherwise, the fields are the union of those
from the key and those from the value.
The field separator is set under the attribute "mapreduce.fieldsel.data.field.separator".
The map output field list spec is set under the attribute
"mapreduce.fieldsel.map.output.key.value.fields.spec".
The value is expected to have the form "keyFieldsSpec:valueFieldsSpec", where
keyFieldsSpec and valueFieldsSpec are each a comma-separated list of field specs:
fieldSpec,fieldSpec,fieldSpec ... Each field spec can be a simple number (e.g. 5)
specifying a single field, a range (like 2-5) specifying a range of fields, or an
open range (like 3-) specifying all the fields starting from field 3. An open range
applies to value fields only; it has no effect on the key fields.
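As a rough illustration of how the two attributes above might be set, here is a minimal map-only job sketch. The class name FieldSelectionJobSketch, the job name, the comma separator, the spec "0:1-" and the use of args[0]/args[1] as input and output paths are all assumptions made for this example; only the two attribute names come from the description above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.fieldsel.FieldSelectionMapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class, not part of Hadoop.
public class FieldSelectionJobSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumption: the input is comma-separated rather than tab-separated.
        conf.set("mapreduce.fieldsel.data.field.separator", ",");
        // Assumption: field 0 becomes the key; fields 1 and onward become the value.
        conf.set("mapreduce.fieldsel.map.output.key.value.fields.spec", "0:1-");

        Job job = Job.getInstance(conf, "field-selection-sketch");
        job.setJarByClass(FieldSelectionJobSketch.class);
        // With TextInputFormat the mapper takes fields from the value (the line) only.
        job.setInputFormatClass(TextInputFormat.class);
        job.setMapperClass(FieldSelectionMapper.class);
        job.setNumReduceTasks(0); // map-only: mapper output is written directly
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

This needs the Hadoop client libraries on the classpath to compile and run.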
Here is an example: "4,3,0,1:6,5,1-3,7-". It selects fields 4, 3, 0 and 1
for the keys, and fields 6, 5, 1, 2, 3, and 7 onward for the values.
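To make the spec semantics concrete, the following self-contained sketch parses a "keyFieldsSpec:valueFieldsSpec" string and applies it to one separated record. It is an illustration written for this page, not Hadoop's implementation: the class FieldSelectionSketch and its methods are hypothetical, and treating an out-of-range field index as an empty string is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Hadoop.
public class FieldSelectionSketch {

    /** One side of the spec: explicit field indices plus an optional open-range start. */
    static final class FieldList {
        final List<Integer> indices = new ArrayList<>();
        int openFrom = -1; // -1 means no open range such as "7-" was given
    }

    /** Parses a comma-separated list of field specs: "5", "2-5", or "3-". */
    static FieldList parse(String spec, boolean allowOpenRange) {
        FieldList result = new FieldList();
        if (spec.isEmpty()) {
            return result;
        }
        for (String part : spec.split(",")) {
            String p = part.trim();
            int dash = p.indexOf('-');
            if (dash < 0) {
                result.indices.add(Integer.parseInt(p));          // single field
            } else if (dash == p.length() - 1) {
                if (allowOpenRange) {                             // open range, values only
                    result.openFrom = Integer.parseInt(p.substring(0, dash));
                }
            } else {
                int start = Integer.parseInt(p.substring(0, dash));
                int end = Integer.parseInt(p.substring(dash + 1));
                for (int i = start; i <= end; i++) {              // closed range, inclusive
                    result.indices.add(i);
                }
            }
        }
        return result;
    }

    /** Joins the selected fields with the separator, in spec order. */
    static String join(String[] fields, FieldList list, String sep) {
        List<String> out = new ArrayList<>();
        for (int i : list.indices) {
            // Assumption: a missing field contributes an empty string.
            out.add(i < fields.length ? fields[i] : "");
        }
        if (list.openFrom >= 0) {
            for (int i = list.openFrom; i < fields.length; i++) {
                out.add(fields[i]);
            }
        }
        return String.join(sep, out);
    }

    /** Returns {key, value} selected from the record according to the spec. */
    public static String[] select(String line, String sep, String spec) {
        String[] sides = spec.split(":", -1);
        String[] fields = line.split(Pattern.quote(sep), -1);
        String key = join(fields, parse(sides[0], false), sep);
        String value = join(fields, parse(sides.length > 1 ? sides[1] : "", true), sep);
        return new String[] {key, value};
    }

    public static void main(String[] args) {
        String line = "f0\tf1\tf2\tf3\tf4\tf5\tf6\tf7\tf8";
        String[] kv = select(line, "\t", "4,3,0,1:6,5,1-3,7-");
        System.out.println("key   = " + kv[0].replace("\t", "|")); // key   = f4|f3|f0|f1
        System.out.println("value = " + kv[1].replace("\t", "|")); // value = f6|f5|f1|f2|f3|f7|f8
    }
}
```

Field numbering in the sketch is zero-based, matching the use of field 0 in the example spec, and an open range on the key side is silently ignored.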
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
Field Details
LOG
public static final org.slf4j.Logger LOG
Constructor Details
FieldSelectionMapper
public FieldSelectionMapper()
Method Details
setup
public void setup(Mapper<K,V,Text,Text>.Context context) throws IOException, InterruptedException
Description copied from class: Mapper
Called once at the beginning of the task.
Overrides:
setup in class Mapper<K,V,Text,Text>
Throws:
IOException
InterruptedException
map
public void map(K key, V val, Mapper<K,V,Text,Text>.Context context) throws IOException, InterruptedException
The identity function. Input key/value pair is written directly to output.
Overrides:
map in class Mapper<K,V,Text,Text>
Throws:
IOException
InterruptedException