@InterfaceAudience.Public @InterfaceStability.Stable public class TokenCountMapper<K> extends MapReduceBase implements Mapper<K,Text,Text,LongWritable>
A Mapper that maps text values into <token,freq> pairs. Uses StringTokenizer to break text into tokens.

Constructor and Description |
---|
TokenCountMapper() |
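
The tokenizing behaviour described above can be sketched in plain Java without any Hadoop classes. This is only an illustration: `TokenCountSketch` and `tokenCounts` are hypothetical names, the sketch assumes each token is paired with a frequency of 1, and it assumes `StringTokenizer` is used with its default delimiters (whitespace), so punctuation stays attached to the token.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for the tokenizing step; not the Hadoop class itself.
final class TokenCountSketch {

    // Splits text on StringTokenizer's default delimiters (space, tab,
    // newline, carriage return, form feed) and pairs every token with a
    // count of 1 (an assumption, mirroring a <token,freq> output).
    static List<Map.Entry<String, Long>> tokenCounts(String text) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokenizer.nextToken(), 1L));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Prints one tab-separated token/count pair per line.
        for (Map.Entry<String, Long> pair : tokenCounts("to be, or not to be")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Under these assumptions the input `"to be, or not to be"` yields six pairs, and the second token is `be,` with the comma included. Repeated tokens are not merged at this stage; summing the counts would be left to a later step such as a combiner or reducer.
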
Modifier and Type | Method and Description |
---|---|
void | map(K key, Text value, OutputCollector<Text,LongWritable> output, Reporter reporter) Maps a single input key/value pair into an intermediate key/value pair. |
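Methods inherited from class MapReduceBase: close, configure
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface JobConfigurable: configure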
public void map(K key, Text value, OutputCollector<Text,LongWritable> output, Reporter reporter) throws IOException
Description copied from interface: Mapper
Output pairs need not be of the same types as input pairs. A given
input pair may map to zero or many output pairs. Output pairs are
collected with calls to OutputCollector.collect(Object,Object).
Applications can use the Reporter provided to report progress
or just indicate that they are alive. In scenarios where the application
takes a significant amount of time to process individual key/value
pairs, this is crucial, since the framework might otherwise assume that
the task has timed out and kill it. The other way of avoiding this is to
set mapreduce.task.timeout to a high enough value (or even zero for no
time-outs).
Specified by:
map in interface Mapper<K,Text,Text,LongWritable>
Parameters:
key - the input key.
value - the input value.
output - collects mapped keys and values.
reporter - facility to report progress.
Throws:
IOException
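
The contract described for map (zero or many output pairs per input, plus optional liveness reporting) can be illustrated with a minimal sketch that uses plain Java stand-ins instead of the Hadoop types. Everything here is an assumption made for illustration: `MapContractSketch` and `ProgressSink` are hypothetical names, a `BiConsumer<String, Long>` plays the role of `OutputCollector<Text,LongWritable>`, and reporting progress once per token is simply one possible policy, not something the source states this class does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.function.BiConsumer;

// Plain-Java illustration of the map contract; no Hadoop types involved.
final class MapContractSketch {

    // Hypothetical stand-in for Reporter: only signals that work continues.
    interface ProgressSink {
        void progress();
    }

    // Mirrors the shape of map(key, value, output, reporter).
    // The key is accepted but ignored in this sketch.
    static <K> void map(K key, String value,
                        BiConsumer<String, Long> output, ProgressSink reporter) {
        StringTokenizer tokenizer = new StringTokenizer(value);
        while (tokenizer.hasMoreTokens()) {
            output.accept(tokenizer.nextToken(), 1L); // one output pair per token
            reporter.progress();                      // signal liveness while working
        }
    }

    public static void main(String[] args) {
        List<String> collected = new ArrayList<>();
        int[] progressCalls = {0};
        BiConsumer<String, Long> output = (token, count) -> collected.add(token + "=" + count);
        ProgressSink reporter = () -> progressCalls[0]++;

        map(0L, "", output, reporter);      // empty value: zero output pairs
        map(1L, "a b a", output, reporter); // three tokens: three output pairs

        System.out.println(collected);
        System.out.println("progress calls: " + progressCalls[0]);
    }
}
```

In this sketch the empty input contributes nothing to the collector, while `"a b a"` contributes three pairs, which shows that the number of outputs is independent of the number of inputs. In a real job, a long-running map call would use the framework's Reporter in the same spirit, or rely on a larger mapreduce.task.timeout value.
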
Copyright © 2022 Apache Software Foundation. All rights reserved.