@InterfaceAudience.Public @InterfaceStability.Stable public class MultithreadedMapper<K1,V1,K2,V2> extends Mapper<K1,V1,K2,V2>
Multithreaded implementation of Mapper.
It can be used instead of the default implementation,
MapRunner, to improve throughput when the map operation is not
CPU bound.
Mapper implementations used with this class must be thread-safe.
The Map-Reduce job has to be configured with the mapper to use via
setMapperClass(Job, Class) and
the number of threads the thread pool can use via
setNumberOfThreads(Job, int); the current setting can be read back with
getNumberOfThreads(JobContext). The default
value is 10 threads.
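
The thread-safety requirement above can be illustrated with a small, Hadoop-free sketch. This is not Hadoop's implementation; it only models the idea of a fixed pool of threads pulling records from a shared input and calling the same map logic concurrently. The names `ThreadPoolMapSketch`, `map`, and `runMaps` are hypothetical and exist only for this illustration, and the code assumes nothing beyond `java.util.concurrent`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Conceptual sketch only: models a pool of threads sharing one map function.
public class ThreadPoolMapSketch {

    // Hypothetical thread-safe "map" step: counts words into a concurrent map.
    static void map(String line, Map<String, LongAdder> out) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // computeIfAbsent and LongAdder are both safe under concurrent use.
                out.computeIfAbsent(word, k -> new LongAdder()).increment();
            }
        }
    }

    // Runs map() over all records using a fixed pool of worker threads.
    static Map<String, Long> runMaps(List<String> records, int threads) {
        Iterator<String> reader = records.iterator(); // shared input
        Map<String, LongAdder> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                while (true) {
                    String record;
                    // Reading the next record is serialized; mapping is not.
                    synchronized (reader) {
                        if (!reader.hasNext()) {
                            return;
                        }
                        record = reader.next();
                    }
                    map(record, counts);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("map threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for maps", e);
        }
        // Copy into a sorted map of plain totals for stable output.
        Map<String, Long> result = new TreeMap<>();
        counts.forEach((word, adder) -> result.put(word, adder.sum()));
        return result;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add("a b a");
        }
        // 10 threads mirrors the documented default pool size.
        System.out.println(runMaps(records, 10)); // prints {a=2000, b=1000}
    }
}
```

If `map` wrote to an ordinary `HashMap` instead, concurrent updates could be lost or corrupt the map, which is the kind of failure the thread-safety requirement guards against. In a real job the pool size and the mapper class would be supplied through the `setNumberOfThreads(Job, int)` and `setMapperClass(Job, Class)` methods listed on this page rather than passed as arguments.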
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `MAP_CLASS` |
| `static String` | `NUM_THREADS` |
| Constructor and Description |
|---|
| `MultithreadedMapper()` |
| Modifier and Type | Method and Description |
|---|---|
| `static <K1,V1,K2,V2> Class<Mapper<K1,V1,K2,V2>>` | `getMapperClass(JobContext job)` Get the application's mapper class. |
| `static int` | `getNumberOfThreads(JobContext job)` The number of threads in the thread pool that will run the map function. |
| `void` | `run(org.apache.hadoop.mapreduce.Mapper.Context context)` Run the application's maps using a thread pool. |
| `static <K1,V1,K2,V2> void` | `setMapperClass(Job job, Class<? extends Mapper<K1,V1,K2,V2>> cls)` Set the application's mapper class. |
| `static void` | `setNumberOfThreads(Job job, int threads)` Set the number of threads in the pool for running maps. |
public static String NUM_THREADS

public MultithreadedMapper()

public static int getNumberOfThreads(JobContext job)
Parameters:
job - the job

public static void setNumberOfThreads(Job job, int threads)
Parameters:
job - the job to modify
threads - the new number of threads

public static <K1,V1,K2,V2> Class<Mapper<K1,V1,K2,V2>> getMapperClass(JobContext job)
Type Parameters:
K1 - the map's input key type
V1 - the map's input value type
K2 - the map's output key type
V2 - the map's output value type
Parameters:
job - the job

public static <K1,V1,K2,V2> void setMapperClass(Job job, Class<? extends Mapper<K1,V1,K2,V2>> cls)
Type Parameters:
K1 - the map input key type
V1 - the map input value type
K2 - the map output key type
V2 - the map output value type
Parameters:
job - the job to modify
cls - the class to use as the mapper

public void run(org.apache.hadoop.mapreduce.Mapper.Context context) throws IOException, InterruptedException
Overrides:
run in class Mapper<K1,V1,K2,V2>
Throws:
IOException
InterruptedException

Copyright © 2016 Apache Software Foundation. All rights reserved.