org.apache.hadoop.mapreduce
Class Job

java.lang.Object
  extended by org.apache.hadoop.mapreduce.JobContext
      extended by org.apache.hadoop.mapreduce.Job

public class Job
extends JobContext

The job submitter's view of the Job. It allows the user to configure the job, submit it, control its execution, and query its state. The set methods only work until the job is submitted; afterwards they throw an IllegalStateException.
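
A minimal driver sketch in the spirit of the description above. The mapper and reducer classes (MyMapper, MyReducer), the Writable types, and the input/output paths are illustrative assumptions, not part of this API:

     import org.apache.hadoop.conf.Configuration;
     import org.apache.hadoop.fs.Path;
     import org.apache.hadoop.io.IntWritable;
     import org.apache.hadoop.io.Text;
     import org.apache.hadoop.mapreduce.Job;
     import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
     import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

     public class WordCountDriver {
       public static void main(String[] args) throws Exception {
         Configuration conf = new Configuration();
         Job job = Job.getInstance(conf, "word count");

         // Ship the jar that contains the driver (and, typically, the mapper/reducer).
         job.setJarByClass(WordCountDriver.class);

         // Hypothetical user-supplied classes written against the new (mapreduce) API.
         job.setMapperClass(MyMapper.class);
         job.setCombinerClass(MyReducer.class);
         job.setReducerClass(MyReducer.class);

         // Final output types; see setOutputKeyClass/setOutputValueClass below.
         job.setOutputKeyClass(Text.class);
         job.setOutputValueClass(IntWritable.class);

         FileInputFormat.addInputPath(job, new Path(args[0]));
         FileOutputFormat.setOutputPath(job, new Path(args[1]));

         // Submit and block until the job finishes, printing progress as it runs.
         System.exit(job.waitForCompletion(true) ? 0 : 1);
       }
     }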


Nested Class Summary
static class Job.JobState
           
 
Field Summary
 
Fields inherited from class org.apache.hadoop.mapreduce.JobContext
CACHE_ARCHIVES_VISIBILITIES, CACHE_FILE_VISIBILITIES, COMBINE_CLASS_ATTR, conf, credentials, INPUT_FORMAT_CLASS_ATTR, JOB_ACL_MODIFY_JOB, JOB_ACL_VIEW_JOB, JOB_CANCEL_DELEGATION_TOKEN, JOB_NAMENODES, MAP_CLASS_ATTR, OUTPUT_FORMAT_CLASS_ATTR, PARTITIONER_CLASS_ATTR, REDUCE_CLASS_ATTR, ugi, USER_LOG_RETAIN_HOURS
 
Constructor Summary
Job()
           
Job(Configuration conf)
           
Job(Configuration conf, String jobName)
           
 
Method Summary
 void failTask(TaskAttemptID taskId)
          Fail the indicated task attempt.
 Counters getCounters()
          Gets the counters for this job.
static Job getInstance()
          Creates a new Job with a generic Configuration.
static Job getInstance(Configuration conf)
          Creates a new Job with a given Configuration.
static Job getInstance(Configuration conf, String jobName)
          Creates a new Job with a given Configuration and a given jobName.
 String getJar()
          Get the pathname of the job's jar.
 TaskCompletionEvent[] getTaskCompletionEvents(int startFrom)
          Get events indicating completion (success/failure) of component tasks.
 String getTrackingURL()
          Get the URL where some job progress information will be displayed.
 boolean isComplete()
          Check whether the job has finished.
 boolean isSuccessful()
          Check if the job completed successfully.
 void killJob()
          Kill the running job.
 void killTask(TaskAttemptID taskId)
          Kill the indicated task attempt.
 float mapProgress()
          Get the progress of the job's map-tasks, as a float between 0.0 and 1.0.
 float reduceProgress()
          Get the progress of the job's reduce-tasks, as a float between 0.0 and 1.0.
 void setCancelDelegationTokenUponJobCompletion(boolean value)
          Sets the flag that will allow the JobTracker to cancel the HDFS delegation tokens upon job completion.
 void setCombinerClass(Class<? extends Reducer> cls)
          Set the combiner class for the job.
 void setGroupingComparatorClass(Class<? extends RawComparator> cls)
          Define the comparator that controls which keys are grouped together for a single call to Reducer.reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context).
 void setInputFormatClass(Class<? extends InputFormat> cls)
          Set the InputFormat for the job.
 void setJarByClass(Class<?> cls)
          Set the Jar by finding where a given class came from.
 void setJobName(String name)
          Set the user-specified job name.
 void setMapOutputKeyClass(Class<?> theClass)
          Set the key class for the map output data.
 void setMapOutputValueClass(Class<?> theClass)
          Set the value class for the map output data.
 void setMapperClass(Class<? extends Mapper> cls)
          Set the Mapper for the job.
 void setMapSpeculativeExecution(boolean speculativeExecution)
          Turn speculative execution on or off for this job for map tasks.
 void setNumReduceTasks(int tasks)
          Set the number of reduce tasks for the job.
 void setOutputFormatClass(Class<? extends OutputFormat> cls)
          Set the OutputFormat for the job.
 void setOutputKeyClass(Class<?> theClass)
          Set the key class for the job output data.
 void setOutputValueClass(Class<?> theClass)
          Set the value class for job outputs.
 void setPartitionerClass(Class<? extends Partitioner> cls)
          Set the Partitioner for the job.
 void setReducerClass(Class<? extends Reducer> cls)
          Set the Reducer for the job.
 void setReduceSpeculativeExecution(boolean speculativeExecution)
          Turn speculative execution on or off for this job for reduce tasks.
 void setSortComparatorClass(Class<? extends RawComparator> cls)
          Define the comparator that controls how the keys are sorted before they are passed to the Reducer.
 void setSpeculativeExecution(boolean speculativeExecution)
          Turn speculative execution on or off for this job.
 float setupProgress()
          Get the progress of the job's setup, as a float between 0.0 and 1.0.
 void setWorkingDirectory(Path dir)
          Set the current working directory for the default file system.
 void submit()
          Submit the job to the cluster and return immediately.
 boolean waitForCompletion(boolean verbose)
          Submit the job to the cluster and wait for it to finish.
 
Methods inherited from class org.apache.hadoop.mapreduce.JobContext
getCombinerClass, getConfiguration, getCredentials, getGroupingComparator, getInputFormatClass, getJobID, getJobName, getMapOutputKeyClass, getMapOutputValueClass, getMapperClass, getNumReduceTasks, getOutputFormatClass, getOutputKeyClass, getOutputValueClass, getPartitionerClass, getReducerClass, getSortComparator, getWorkingDirectory
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Job

public Job()
    throws IOException
Throws:
IOException

Job

public Job(Configuration conf)
    throws IOException
Throws:
IOException

Job

public Job(Configuration conf,
           String jobName)
    throws IOException
Throws:
IOException
Method Detail

getInstance

public static Job getInstance()
                       throws IOException
Creates a new Job with a generic Configuration.

Returns:
the Job
Throws:
IOException

getInstance

public static Job getInstance(Configuration conf)
                       throws IOException
Creates a new Job with a given Configuration. The Job makes a copy of the Configuration so that any necessary internal modifications do not reflect on the incoming parameter.

Parameters:
conf - the Configuration
Returns:
the Job
Throws:
IOException

getInstance

public static Job getInstance(Configuration conf,
                              String jobName)
                       throws IOException
Creates a new Job with a given Configuration and a given jobName. The Job makes a copy of the Configuration so that any necessary internal modifications do not reflect on the incoming parameter.

Parameters:
conf - the Configuration
jobName - the job instance's name
Returns:
the Job
Throws:
IOException
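
Because the Job copies the Configuration, later changes to the original object are not visible through the job. A small sketch of that behavior (the property name is made up for illustration):

     Configuration conf = new Configuration();
     conf.set("example.setting", "original");        // hypothetical property
     Job job = Job.getInstance(conf, "copy-demo");

     conf.set("example.setting", "changed-later");   // modifies only the caller's object
     // The job still sees "original" because it holds its own copy of the Configuration.
     System.out.println(job.getConfiguration().get("example.setting"));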

setNumReduceTasks

public void setNumReduceTasks(int tasks)
                       throws IllegalStateException
Set the number of reduce tasks for the job.

Parameters:
tasks - the number of reduce tasks
Throws:
IllegalStateException - if the job is submitted

setWorkingDirectory

public void setWorkingDirectory(Path dir)
                         throws IOException
Set the current working directory for the default file system.

Parameters:
dir - the new current working directory.
Throws:
IllegalStateException - if the job is submitted
IOException

setInputFormatClass

public void setInputFormatClass(Class<? extends InputFormat> cls)
                         throws IllegalStateException
Set the InputFormat for the job.

Parameters:
cls - the InputFormat to use
Throws:
IllegalStateException - if the job is submitted

setOutputFormatClass

public void setOutputFormatClass(Class<? extends OutputFormat> cls)
                          throws IllegalStateException
Set the OutputFormat for the job.

Parameters:
cls - the OutputFormat to use
Throws:
IllegalStateException - if the job is submitted

setMapperClass

public void setMapperClass(Class<? extends Mapper> cls)
                    throws IllegalStateException
Set the Mapper for the job.

Parameters:
cls - the Mapper to use
Throws:
IllegalStateException - if the job is submitted

setJarByClass

public void setJarByClass(Class<?> cls)
Set the Jar by finding where a given class came from.

Parameters:
cls - the example class

getJar

public String getJar()
Get the pathname of the job's jar.

Overrides:
getJar in class JobContext
Returns:
the pathname

setCombinerClass

public void setCombinerClass(Class<? extends Reducer> cls)
                      throws IllegalStateException
Set the combiner class for the job.

Parameters:
cls - the combiner to use
Throws:
IllegalStateException - if the job is submitted

setReducerClass

public void setReducerClass(Class<? extends Reducer> cls)
                     throws IllegalStateException
Set the Reducer for the job.

Parameters:
cls - the Reducer to use
Throws:
IllegalStateException - if the job is submitted

setPartitionerClass

public void setPartitionerClass(Class<? extends Partitioner> cls)
                         throws IllegalStateException
Set the Partitioner for the job.

Parameters:
cls - the Partitioner to use
Throws:
IllegalStateException - if the job is submitted
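
As an illustration, a custom Partitioner might route keys by their first character so that related keys land in the same reduce partition. The class below is a hypothetical example, not part of Hadoop:

     import org.apache.hadoop.io.IntWritable;
     import org.apache.hadoop.io.Text;
     import org.apache.hadoop.mapreduce.Partitioner;

     // Hypothetical partitioner: all keys starting with the same character
     // are sent to the same reducer.
     public class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
       @Override
       public int getPartition(Text key, IntWritable value, int numPartitions) {
         if (key.getLength() == 0) {
           return 0;
         }
         return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
       }
     }

     // Then, in the driver:
     //   job.setPartitionerClass(FirstCharPartitioner.class);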

setMapOutputKeyClass

public void setMapOutputKeyClass(Class<?> theClass)
                          throws IllegalStateException
Set the key class for the map output data. This allows the user to specify the map output key class to be different from the final output key class.

Parameters:
theClass - the map output key class.
Throws:
IllegalStateException - if the job is submitted

setMapOutputValueClass

public void setMapOutputValueClass(Class<?> theClass)
                            throws IllegalStateException
Set the value class for the map output data. This allows the user to specify the map output value class to be different from the final output value class.

Parameters:
theClass - the map output value class.
Throws:
IllegalStateException - if the job is submitted
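
For example, a job whose mapper emits (Text, IntWritable) pairs but whose reducer writes (Text, Text) records must declare the intermediate types separately from the final ones. A sketch, assuming a configured job and mapper/reducer signatures that match these types:

     // Intermediate (map output) types.
     job.setMapOutputKeyClass(Text.class);
     job.setMapOutputValueClass(IntWritable.class);

     // Final (reduce output) types.
     job.setOutputKeyClass(Text.class);
     job.setOutputValueClass(Text.class);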

setOutputKeyClass

public void setOutputKeyClass(Class<?> theClass)
                       throws IllegalStateException
Set the key class for the job output data.

Parameters:
theClass - the key class for the job output data.
Throws:
IllegalStateException - if the job is submitted

setOutputValueClass

public void setOutputValueClass(Class<?> theClass)
                         throws IllegalStateException
Set the value class for job outputs.

Parameters:
theClass - the value class for job outputs.
Throws:
IllegalStateException - if the job is submitted

setSortComparatorClass

public void setSortComparatorClass(Class<? extends RawComparator> cls)
                            throws IllegalStateException
Define the comparator that controls how the keys are sorted before they are passed to the Reducer.

Parameters:
cls - the raw comparator
Throws:
IllegalStateException - if the job is submitted

setGroupingComparatorClass

public void setGroupingComparatorClass(Class<? extends RawComparator> cls)
                                throws IllegalStateException
Define the comparator that controls which keys are grouped together for a single call to Reducer.reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context).

Parameters:
cls - the raw comparator to use
Throws:
IllegalStateException - if the job is submitted
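
Together with setSortComparatorClass(Class), this is the usual hook for a secondary sort: sort on a composite key, but group reduce() calls by the natural key only. A sketch with hypothetical comparator class names:

     // Order values within each reduce call by sorting on the full composite key...
     job.setSortComparatorClass(CompositeKeyComparator.class);
     // ...but start a new reduce() call only when the natural key changes.
     job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);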

setJobName

public void setJobName(String name)
                throws IllegalStateException
Set the user-specified job name.

Parameters:
name - the job's new name.
Throws:
IllegalStateException - if the job is submitted

setSpeculativeExecution

public void setSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job.

Parameters:
speculativeExecution - true if speculative execution should be turned on, else false.

setMapSpeculativeExecution

public void setMapSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for map tasks.

Parameters:
speculativeExecution - true if speculative execution should be turned on for map tasks, else false.

setReduceSpeculativeExecution

public void setReduceSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job for reduce tasks.

Parameters:
speculativeExecution - true if speculative execution should be turned on for reduce tasks, else false.
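
A common pattern, sketched below, is to leave speculation on for maps but disable it for reduces whose side effects (for example, writes to an external store) must not be duplicated:

     job.setMapSpeculativeExecution(true);      // duplicate map attempts are harmless
     job.setReduceSpeculativeExecution(false);  // avoid duplicate external writes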

getTrackingURL

public String getTrackingURL()
Get the URL where some job progress information will be displayed.

Returns:
the URL where some job progress information will be displayed.

setupProgress

public float setupProgress()
                    throws IOException
Get the progress of the job's setup, as a float between 0.0 and 1.0. When the job setup is completed, the function returns 1.0.

Returns:
the progress of the job's setup.
Throws:
IOException

mapProgress

public float mapProgress()
                  throws IOException
Get the progress of the job's map-tasks, as a float between 0.0 and 1.0. When all map tasks have completed, the function returns 1.0.

Returns:
the progress of the job's map-tasks.
Throws:
IOException

reduceProgress

public float reduceProgress()
                     throws IOException
Get the progress of the job's reduce-tasks, as a float between 0.0 and 1.0. When all reduce tasks have completed, the function returns 1.0.

Returns:
the progress of the job's reduce-tasks.
Throws:
IOException

isComplete

public boolean isComplete()
                   throws IOException
Check whether the job has finished. This is a non-blocking call.

Returns:
true if the job is complete, else false.
Throws:
IOException

isSuccessful

public boolean isSuccessful()
                     throws IOException
Check if the job completed successfully.

Returns:
true if the job succeeded, else false.
Throws:
IOException

killJob

public void killJob()
             throws IOException
Kill the running job. Blocks until all job tasks have been killed as well. If the job is no longer running, it simply returns.

Throws:
IOException

getTaskCompletionEvents

public TaskCompletionEvent[] getTaskCompletionEvents(int startFrom)
                                              throws IOException
Get events indicating completion (success/failure) of component tasks.

Parameters:
startFrom - index to start fetching events from
Returns:
an array of TaskCompletionEvents
Throws:
IOException
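
The events come back in batches starting at startFrom, so callers typically page through them as in the sketch below (assumes a submitted job; note that the concrete TaskCompletionEvent type returned here has differed between Hadoop releases):

     int from = 0;
     TaskCompletionEvent[] events;
     do {
       events = job.getTaskCompletionEvents(from);
       for (TaskCompletionEvent event : events) {
         System.out.println(event);   // log each completion event
       }
       from += events.length;
     } while (events.length > 0);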

killTask

public void killTask(TaskAttemptID taskId)
              throws IOException
Kill the indicated task attempt.

Parameters:
taskId - the id of the task attempt to be killed.
Throws:
IOException

failTask

public void failTask(TaskAttemptID taskId)
              throws IOException
Fail the indicated task attempt.

Parameters:
taskId - the id of the task attempt to be failed.
Throws:
IOException

getCounters

public Counters getCounters()
                     throws IOException
Gets the counters for this job.

Returns:
the counters for this job.
Throws:
IOException
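
For example, a job that tracks a custom counter (a hypothetical enum, incremented in the tasks via Context.getCounter(...)) can read it back once the job has completed:

     // Hypothetical application counter, incremented from the tasks with
     // context.getCounter(AppCounter.BAD_RECORDS).increment(1):
     public enum AppCounter { BAD_RECORDS }

     // In the driver, after the job has completed:
     Counters counters = job.getCounters();
     long badRecords = counters.findCounter(AppCounter.BAD_RECORDS).getValue();
     System.out.println("Bad records: " + badRecords);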

setCancelDelegationTokenUponJobCompletion

public void setCancelDelegationTokenUponJobCompletion(boolean value)
Sets the flag that will allow the JobTracker to cancel the HDFS delegation tokens upon job completion. Defaults to true.


submit

public void submit()
            throws IOException,
                   InterruptedException,
                   ClassNotFoundException
Submit the job to the cluster and return immediately.

Throws:
IOException
InterruptedException
ClassNotFoundException
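
A sketch of asynchronous submission followed by a simple progress poll (assumes a fully configured job and an enclosing method that declares the checked exceptions; the five-second interval is arbitrary):

     job.submit();                      // returns as soon as the job is handed to the cluster

     while (!job.isComplete()) {        // non-blocking status check
       System.out.printf("map %.0f%%  reduce %.0f%%%n",
           job.mapProgress() * 100, job.reduceProgress() * 100);
       Thread.sleep(5000);
     }
     System.out.println(job.isSuccessful() ? "Job succeeded" : "Job failed");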

waitForCompletion

public boolean waitForCompletion(boolean verbose)
                          throws IOException,
                                 InterruptedException,
                                 ClassNotFoundException
Submit the job to the cluster and wait for it to finish.

Parameters:
verbose - whether to print progress to the user
Returns:
true if the job succeeded
Throws:
IOException - thrown if the communication with the JobTracker is lost
InterruptedException
ClassNotFoundException


Copyright © 2009 The Apache Software Foundation