@InterfaceAudience.Public
@InterfaceStability.Stable
public class RegexMapper<K>
extends MapReduceBase
implements Mapper<K,Text,Text,LongWritable>

A Mapper that extracts text matching a regular expression.

Constructor and Description |
---|
RegexMapper() |
Modifier and Type | Method and Description |
---|---|
void | configure(JobConf job): Default implementation that does nothing. |
void | map(K key, Text value, OutputCollector<Text,LongWritable> output, Reporter reporter): Maps a single input key/value pair into an intermediate key/value pair. |

Methods inherited from class MapReduceBase: close

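public RegexMapper()

public void configure(JobConf job)

Description copied from class: MapReduceBase
Default implementation that does nothing.

Specified by: configure in interface JobConfigurable
Overrides: configure in class MapReduceBase
Parameters:
job - the configuration

public void map(K key, Text value, OutputCollector<Text,LongWritable> output, Reporter reporter) throws IOException

Description copied from interface: Mapper
Maps a single input key/value pair into an intermediate key/value pair.
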
Output pairs need not be of the same types as input pairs. A given
input pair may map to zero or many output pairs. Output pairs are
collected with calls to OutputCollector.collect(Object,Object).

Applications can use the Reporter provided to report progress
or just indicate that they are alive. In scenarios where the application
takes a significant amount of time to process individual key/value
pairs, this is crucial, since the framework might otherwise assume that
the task has timed out and kill it. The other way of avoiding this is to
set mapreduce.task.timeout (in milliseconds) to a high enough value, or
to zero for no timeouts.

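The "zero or many output pairs" behaviour can be sketched without Hadoop on the classpath. This is a minimal illustration, not the Hadoop source: `RegexMapSketch`, its `Collector` interface, and the `run` helper are hypothetical stand-ins for `Text`, `LongWritable`, and `OutputCollector`, and emitting a count of 1 per match is an assumption suggested by the `LongWritable` output type.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch, not the Hadoop source: plain Java types replace
// Text, LongWritable, and OutputCollector so it runs on its own.
class RegexMapSketch {
    /** Hypothetical stand-in for OutputCollector<Text,LongWritable>. */
    interface Collector {
        void collect(String key, long value);
    }

    /** Emits (matched text, 1) for every match of pattern found in value. */
    static void map(Pattern pattern, int group, String value, Collector output) {
        Matcher matcher = pattern.matcher(value);
        while (matcher.find()) {
            // Assumption: each match is emitted with a count of 1.
            output.collect(matcher.group(group), 1L);
        }
    }

    /** Convenience helper that gathers the emitted pairs into a list. */
    static List<Map.Entry<String, Long>> run(String regex, int group, String line) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        map(Pattern.compile(regex), group, line,
            (k, v) -> out.add(new SimpleEntry<String, Long>(k, v)));
        return out;
    }

    public static void main(String[] args) {
        // One input value with two matches yields two output pairs.
        System.out.println(run("[a-z]+@[a-z.]+", 0,
            "mail alice@example.org and bob@example.org"));
        // An input value with no match yields no output pairs.
        System.out.println(run("[a-z]+@[a-z.]+", 0, "no addresses here"));
    }
}
```

A downstream reducer that sums the values per key would then give the number of occurrences of each matched string.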
Specified by: map in interface Mapper<K,Text,Text,LongWritable>
Parameters:
key - the input key.
value - the input value.
output - collects mapped keys and values.
reporter - facility to report progress.
Throws: IOException
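
The reporter parameter is what a slow map call would use to signal that it is still alive. The sketch below shows one possible heartbeat pattern under stated assumptions: `ProgressSketch` and `ProgressSink` are hypothetical names, with `ProgressSink.progress()` standing in for the `progress()` method that Hadoop's `Reporter` provides, and the reporting interval is an arbitrary choice.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in sketch: ProgressSink is a hypothetical substitute for Hadoop's
// Reporter so the example runs without Hadoop on the classpath.
class ProgressSketch {
    /** Hypothetical stand-in exposing only a progress() heartbeat. */
    interface ProgressSink {
        void progress();
    }

    /**
     * Performs `steps` units of slow work for a single record and signals
     * liveness every `interval` steps. Returns the number of steps done.
     */
    static int processSlowRecord(int steps, int interval, ProgressSink reporter) {
        int done = 0;
        for (int i = 1; i <= steps; i++) {
            done++;                    // placeholder for the expensive work
            if (i % interval == 0) {
                reporter.progress();   // tell the framework the task is alive
            }
        }
        return done;
    }

    public static void main(String[] args) {
        AtomicInteger heartbeats = new AtomicInteger();
        int done = processSlowRecord(1000, 100, heartbeats::incrementAndGet);
        System.out.println(done + " steps, " + heartbeats.get() + " heartbeats");
    }
}
```

Reporting at a fixed interval keeps the heartbeat overhead small while still arriving well within the task timeout; the right interval depends on how long each unit of work takes.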
Copyright © 2018 Apache Software Foundation. All rights reserved.