org.apache.hadoop.examples
Class SleepJob

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.examples.SleepJob
All Implemented Interfaces:
Closeable, Configurable, JobConfigurable, Mapper<IntWritable,IntWritable,IntWritable,NullWritable>, Partitioner<IntWritable,NullWritable>, Reducer<IntWritable,NullWritable,NullWritable,NullWritable>, Tool

public class SleepJob
extends Configured
implements Tool, Mapper<IntWritable,IntWritable,IntWritable,NullWritable>, Reducer<IntWritable,NullWritable,NullWritable,NullWritable>, Partitioner<IntWritable,NullWritable>

Dummy class for testing MR framefork. Sleeps for a defined period of time in mapper and reducer. Generates fake input for map / reduce jobs. Note that generated number of input pairs is in the order of numMappers * mapSleepTime / 100, so the job uses some disk space.


Nested Class Summary
static class SleepJob.EmptySplit
           
static class SleepJob.SleepInputFormat
           
 
Constructor Summary
SleepJob()
           
 
Method Summary
 void close()
           
 void configure(JobConf job)
          Initializes a new instance from a JobConf.
 int getPartition(IntWritable k, NullWritable v, int numPartitions)
          Get the paritition number for a given key (hence record) given the total number of partitions i.e.
static void main(String[] args)
           
 void map(IntWritable key, IntWritable value, OutputCollector<IntWritable,NullWritable> output, Reporter reporter)
          Maps a single input key/value pair into an intermediate key/value pair.
 void reduce(IntWritable key, Iterator<NullWritable> values, OutputCollector<NullWritable,NullWritable> output, Reporter reporter)
          Reduces values for a given key.
 int run(int numMapper, int numReducer, long mapSleepTime, int mapSleepCount, long reduceSleepTime, int reduceSleepCount)
           
 int run(String[] args)
          Execute the command with the given arguments.
 JobConf setupJobConf(int numMapper, int numReducer, long mapSleepTime, int mapSleepCount, long reduceSleepTime, int reduceSleepCount)
           
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Constructor Detail

SleepJob

public SleepJob()
Method Detail

getPartition

public int getPartition(IntWritable k,
                        NullWritable v,
                        int numPartitions)
Description copied from interface: Partitioner
Get the paritition number for a given key (hence record) given the total number of partitions i.e. number of reduce-tasks for the job.

Typically a hash function on a all or a subset of the key.

Specified by:
getPartition in interface Partitioner<IntWritable,NullWritable>
Parameters:
k - the key to be paritioned.
v - the entry value.
numPartitions - the total number of partitions.
Returns:
the partition number for the key.

map

public void map(IntWritable key,
                IntWritable value,
                OutputCollector<IntWritable,NullWritable> output,
                Reporter reporter)
         throws IOException
Description copied from interface: Mapper
Maps a single input key/value pair into an intermediate key/value pair.

Output pairs need not be of the same types as input pairs. A given input pair may map to zero or many output pairs. Output pairs are collected with calls to OutputCollector.collect(Object,Object).

Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. The other way of avoiding this is to set mapred.task.timeout to a high-enough value (or even zero for no time-outs).

Specified by:
map in interface Mapper<IntWritable,IntWritable,IntWritable,NullWritable>
Parameters:
key - the input key.
value - the input value.
output - collects mapped keys and values.
reporter - facility to report progress.
Throws:
IOException

reduce

public void reduce(IntWritable key,
                   Iterator<NullWritable> values,
                   OutputCollector<NullWritable,NullWritable> output,
                   Reporter reporter)
            throws IOException
Description copied from interface: Reducer
Reduces values for a given key.

The framework calls this method for each <key, (list of values)> pair in the grouped inputs. Output values must be of the same type as input values. Input keys must not be altered. The framework will reuse the key and value objects that are passed into the reduce, therefore the application should clone the objects they want to keep a copy of. In many cases, all values are combined into zero or one value.

Output pairs are collected with calls to OutputCollector.collect(Object,Object).

Applications can use the Reporter provided to report progress or just indicate that they are alive. In scenarios where the application takes an insignificant amount of time to process individual key/value pairs, this is crucial since the framework might assume that the task has timed-out and kill that task. The other way of avoiding this is to set mapred.task.timeout to a high-enough value (or even zero for no time-outs).

Specified by:
reduce in interface Reducer<IntWritable,NullWritable,NullWritable,NullWritable>
Parameters:
key - the key.
values - the list of values to reduce.
output - to collect keys and combined values.
reporter - facility to report progress.
Throws:
IOException

configure

public void configure(JobConf job)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.

Specified by:
configure in interface JobConfigurable
Parameters:
job - the configuration

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Throws:
IOException

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception

run

public int run(int numMapper,
               int numReducer,
               long mapSleepTime,
               int mapSleepCount,
               long reduceSleepTime,
               int reduceSleepCount)
        throws IOException
Throws:
IOException

setupJobConf

public JobConf setupJobConf(int numMapper,
                            int numReducer,
                            long mapSleepTime,
                            int mapSleepCount,
                            long reduceSleepTime,
                            int reduceSleepCount)

run

public int run(String[] args)
        throws Exception
Description copied from interface: Tool
Execute the command with the given arguments.

Specified by:
run in interface Tool
Parameters:
args - command specific arguments.
Returns:
exit code.
Throws:
Exception


Copyright © 2009 The Apache Software Foundation