@InterfaceAudience.Public @InterfaceStability.Stable public class MultithreadedMapper<K1,V1,K2,V2> extends Mapper<K1,V1,K2,V2>
It can be used instead of the default implementation,
MapRunner
, when the Map operation is not CPU
bound in order to improve throughput.
Mapper implementations using this MapRunnable must be thread-safe.
The Map-Reduce job has to be configured with the mapper to use via
setMapperClass(Job, Class)
and
the number of thread the thread-pool can use with the
getNumberOfThreads(JobContext)
method. The default
value is 10 threads.
Modifier and Type | Field and Description |
---|---|
static String |
MAP_CLASS |
static String |
NUM_THREADS |
Constructor and Description |
---|
MultithreadedMapper() |
Modifier and Type | Method and Description |
---|---|
static <K1,V1,K2,V2> |
getMapperClass(JobContext job)
Get the application's mapper class.
|
static int |
getNumberOfThreads(JobContext job)
The number of threads in the thread pool that will run the map function.
|
void |
run(org.apache.hadoop.mapreduce.Mapper.Context context)
Run the application's maps using a thread pool.
|
static <K1,V1,K2,V2> |
setMapperClass(Job job,
Class<? extends Mapper<K1,V1,K2,V2>> cls)
Set the application's mapper class.
|
static void |
setNumberOfThreads(Job job,
int threads)
Set the number of threads in the pool for running maps.
|
public static String NUM_THREADS
public MultithreadedMapper()
public static int getNumberOfThreads(JobContext job)
job
- the jobpublic static void setNumberOfThreads(Job job, int threads)
job
- the job to modifythreads
- the new number of threadspublic static <K1,V1,K2,V2> Class<Mapper<K1,V1,K2,V2>> getMapperClass(JobContext job)
K1
- the map's input key typeV1
- the map's input value typeK2
- the map's output key typeV2
- the map's output value typejob
- the jobpublic static <K1,V1,K2,V2> void setMapperClass(Job job, Class<? extends Mapper<K1,V1,K2,V2>> cls)
K1
- the map input key typeV1
- the map input value typeK2
- the map output key typeV2
- the map output value typejob
- the job to modifycls
- the class to use as the mapperpublic void run(org.apache.hadoop.mapreduce.Mapper.Context context) throws IOException, InterruptedException
run
in class Mapper<K1,V1,K2,V2>
IOException
InterruptedException
Copyright © 2018 Apache Software Foundation. All rights reserved.