Package org.apache.hadoop.mapred.lib
Class RegexMapper<K>
java.lang.Object
org.apache.hadoop.mapred.MapReduceBase
org.apache.hadoop.mapred.lib.RegexMapper<K>
- All Implemented Interfaces:
java.io.Closeable, AutoCloseable, org.apache.hadoop.io.Closeable, JobConfigurable, Mapper<K,Text,Text,LongWritable>
@Public
@Stable
public class RegexMapper<K>
extends MapReduceBase
implements Mapper<K,Text,Text,LongWritable>
A Mapper that extracts text matching a regular expression.
Constructor Summary
Constructors
RegexMapper()
Method Summary
Modifier and Type    Method    Description
void    configure(JobConf job)    Default implementation that does nothing.
void    map(K key, Text value, OutputCollector<Text, LongWritable> output, Reporter reporter)    Maps a single input key/value pair into an intermediate key/value pair.
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
Constructor Details
RegexMapper
public RegexMapper()
Method Details
configure
public void configure(JobConf job)
Description copied from class: MapReduceBase
Default implementation that does nothing.
Specified by:
configure in interface JobConfigurable
Overrides:
configure in class MapReduceBase
Parameters:
job - the configuration
map
public void map(K key, Text value, OutputCollector<Text, LongWritable> output, Reporter reporter) throws IOException
Description copied from interface: Mapper
Maps a single input key/value pair into an intermediate key/value pair.
Output pairs need not be of the same types as input pairs. A given input pair may map to zero or many output pairs. Output pairs are collected with calls to OutputCollector.collect(Object,Object).
Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes a significant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed out and kill that task. The other way of avoiding this is to set mapreduce.task.timeout to a high-enough value (or even zero for no time-outs).
Specified by:
map in interface Mapper<K,Text,Text,LongWritable>
Parameters:
key - the input key.
value - the input value.
output - collects mapped keys and values.
reporter - facility to report progress.
Throws:
IOException
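The "zero or many output pairs" behavior of map can be illustrated without a Hadoop dependency. The following is a minimal sketch, not the Hadoop source: RegexMapSketch, Collector, and run are hypothetical names, plain String and long stand in for Text and LongWritable, and emitting a count of 1 per match with a selectable capture group is an assumption suggested by the Text/LongWritable output types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.AbstractMap.SimpleEntry;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hadoop-free sketch of a regex-extracting map step.
 * All names here are hypothetical; none of them are Hadoop classes.
 */
final class RegexMapSketch {

    /** Hypothetical stand-in for OutputCollector<Text, LongWritable>. */
    interface Collector {
        void collect(String key, long value);
    }

    private final Pattern pattern;
    private final int group;

    RegexMapSketch(String regex, int group) {
        this.pattern = Pattern.compile(regex);
        this.group = group;
    }

    /** Emits one (matched text, 1) pair per match; emits nothing when there is no match. */
    void map(Object key, String value, Collector output) {
        Matcher matcher = pattern.matcher(value);
        while (matcher.find()) {
            output.collect(matcher.group(group), 1L);
        }
    }

    /** Convenience helper: maps a single line and returns the collected pairs in order. */
    static List<Map.Entry<String, Long>> run(String regex, int group, String line) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        new RegexMapSketch(regex, group)
                .map(null, line, (k, v) -> out.add(new SimpleEntry<>(k, v)));
        return out;
    }

    public static void main(String[] args) {
        // Capture group 1 selects the domain part of each address.
        for (Map.Entry<String, Long> e
                : run("[a-z]+@([a-z.]+)", 1, "alice@example.org, bob@test.net")) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In this sketch a line with two matches yields two pairs and a line with none yields an empty list, so a downstream summing reducer would turn the emitted 1s into per-match counts.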