org.apache.hadoop.mapreduce
Class OutputCommitter

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputCommitter
Direct Known Subclasses:
FileOutputCommitter, org.apache.hadoop.mapred.OutputCommitter

public abstract class OutputCommitter
extends Object

OutputCommitter describes the commit of task output for a Map-Reduce job.

The Map-Reduce framework relies on the OutputCommitter of the job to:

  1. Set up the job during initialization. For example, create the temporary output directory for the job.
  2. Clean up the job after it completes. For example, remove the temporary output directory.
  3. Set up the task's temporary output.
  4. Check whether a task needs a commit, so that the commit procedure can be skipped for tasks that do not need one.
  5. Commit the task output.
  6. Discard the task output.
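
The order in which these responsibilities are exercised can be sketched as follows. This is only an illustration: RecordingCommitter and LifecycleSketch are hypothetical stand-ins rather than Hadoop classes, contexts are reduced to plain string ids so the sketch runs without Hadoop on the classpath, and the driver is sequential, whereas the real framework schedules task attempts itself. The job as a whole is assumed to succeed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an OutputCommitter: contexts are plain string ids,
// and every callback is recorded so the call order can be inspected.
class RecordingCommitter {
    final List<String> calls = new ArrayList<>();

    void setupJob(String jobId) { calls.add("setupJob " + jobId); }
    void setupTask(String attemptId) { calls.add("setupTask " + attemptId); }
    boolean needsTaskCommit(String attemptId) {
        calls.add("needsTaskCommit " + attemptId);
        return true; // assumption: every attempt that did not fail produced output
    }
    void commitTask(String attemptId) { calls.add("commitTask " + attemptId); }
    void abortTask(String attemptId) { calls.add("abortTask " + attemptId); }
    void commitJob(String jobId) { calls.add("commitJob " + jobId); }
}

class LifecycleSketch {
    // Simplified sequential driver (an assumption, not the framework's scheduler).
    static List<String> runJob(RecordingCommitter committer, String jobId,
                               List<String> attempts, Set<String> failedAttempts) {
        committer.setupJob(jobId);                           // set up the job
        for (String attempt : attempts) {
            committer.setupTask(attempt);                    // set up task output
            if (failedAttempts.contains(attempt)) {
                committer.abortTask(attempt);                // discard task output
            } else if (committer.needsTaskCommit(attempt)) { // check for commit
                committer.commitTask(attempt);               // commit task output
            }
        }
        committer.commitJob(jobId);                          // finish the job
        return committer.calls;
    }
}
```

Calling LifecycleSketch.runJob with two attempts, one of them listed as failed, returns the recorded callbacks in order, which makes it easy to see that the failed attempt is aborted without ever being asked whether it needs a commit.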

See Also:
FileOutputCommitter, JobContext, TaskAttemptContext

Constructor Summary
OutputCommitter()
           
 
Method Summary
 void abortJob(JobContext jobContext, JobStatus.State state)
          For aborting an unsuccessful job's output.
abstract  void abortTask(TaskAttemptContext taskContext)
          Discard the task output.
 void cleanupJob(JobContext context)
          Deprecated. Use commitJob(JobContext) or abortJob(JobContext, JobStatus.State) instead.
 void commitJob(JobContext jobContext)
          For cleaning up the job's output after job completion.
abstract  void commitTask(TaskAttemptContext taskContext)
          Promotes the task's temporary output to the final output location. The task's output is moved to the job's output directory.
abstract  boolean needsTaskCommit(TaskAttemptContext taskContext)
          Checks whether the task needs a commit.
abstract  void setupJob(JobContext jobContext)
          For the framework to set up the job output during initialization.
abstract  void setupTask(TaskAttemptContext taskContext)
          Sets up output for the task.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OutputCommitter

public OutputCommitter()
Method Detail

setupJob

public abstract void setupJob(JobContext jobContext)
                       throws IOException
For the framework to set up the job output during initialization.

Parameters:
jobContext - Context of the job whose output is being written.
Throws:
IOException - if temporary output could not be created

commitJob

public void commitJob(JobContext jobContext)
               throws IOException
For cleaning up the job's output after job completion. Note that this is invoked for jobs whose final run state is JobStatus.State.SUCCEEDED.

Parameters:
jobContext - Context of the job whose output is being written.
Throws:
IOException

cleanupJob

@Deprecated
public void cleanupJob(JobContext context)
                throws IOException
Deprecated. Use commitJob(JobContext) or abortJob(JobContext, JobStatus.State) instead.

For cleaning up the job's output after job completion.

Throws:
IOException

abortJob

public void abortJob(JobContext jobContext,
                     JobStatus.State state)
              throws IOException
For aborting an unsuccessful job's output. Note that this is invoked for jobs with final run state as JobStatus.State.FAILED or JobStatus.State.KILLED.

Parameters:
jobContext - Context of the job whose output is being written.
state - Final run state of the job; should be either JobStatus.State.KILLED or JobStatus.State.FAILED.
Throws:
IOException
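
Taken together with commitJob, the job-level callback depends on the job's final run state. The following is a minimal sketch of that dispatch, assuming a hypothetical FinalState enum that mirrors only the three states named in this page (it is not the real JobStatus.State) and returning the name of the callback rather than invoking a real committer.

```java
// Hypothetical sketch: FinalState stands in for the three final run states
// mentioned in this page; it is not Hadoop's JobStatus.State.
class JobEndSketch {
    enum FinalState { SUCCEEDED, FAILED, KILLED }

    // Returns a description of the job-level callback for the given final state.
    static String jobEndCallback(FinalState state) {
        switch (state) {
            case SUCCEEDED:
                return "commitJob(jobContext)";
            case FAILED:
            case KILLED:
                // abortJob receives the final state so it can tell the two apart.
                return "abortJob(jobContext, " + state + ")";
            default:
                throw new IllegalArgumentException("not a final state: " + state);
        }
    }
}
```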

setupTask

public abstract void setupTask(TaskAttemptContext taskContext)
                        throws IOException
Sets up output for the task.

Parameters:
taskContext - Context of the task whose output is being written.
Throws:
IOException

needsTaskCommit

public abstract boolean needsTaskCommit(TaskAttemptContext taskContext)
                                 throws IOException
Checks whether the task needs a commit.

Parameters:
taskContext - Context of the task whose output is being written.
Returns:
true if the task needs a commit, false otherwise
Throws:
IOException

commitTask

public abstract void commitTask(TaskAttemptContext taskContext)
                         throws IOException
Promotes the task's temporary output to the final output location. The task's output is moved to the job's output directory.

Parameters:
taskContext - Context of the task whose output is being written.
Throws:
IOException - if commit is not successful
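
As an illustration of how setupTask, needsTaskCommit and commitTask can fit together for file-based output, here is a minimal sketch built on java.nio.file. It is not Hadoop's FileOutputCommitter: the class TaskCommitSketch, the "_temporary" directory layout, and the use of plain string attempt ids in place of TaskAttemptContext are all assumptions made to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a file-based task commit; not Hadoop's implementation.
class TaskCommitSketch {
    private final Path outputDir;

    TaskCommitSketch(Path outputDir) { this.outputDir = outputDir; }

    // Assumed layout: <outputDir>/_temporary/<attemptId>
    Path taskTempDir(String attemptId) {
        return outputDir.resolve("_temporary").resolve(attemptId);
    }

    // Creates the attempt's private temporary directory.
    void setupTask(String attemptId) throws IOException {
        Files.createDirectories(taskTempDir(attemptId));
    }

    // Lists the files the attempt has written so far (empty if none).
    private List<Path> listOutput(String attemptId) throws IOException {
        List<Path> files = new ArrayList<>();
        Path tmp = taskTempDir(attemptId);
        if (Files.isDirectory(tmp)) {
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(tmp)) {
                for (Path p : stream) files.add(p);
            }
        }
        return files;
    }

    // A commit is needed only if the attempt actually produced output.
    boolean needsTaskCommit(String attemptId) throws IOException {
        return !listOutput(attemptId).isEmpty();
    }

    // Moves each output file into the job's output directory, then removes
    // the now-empty temporary directory.
    void commitTask(String attemptId) throws IOException {
        for (Path file : listOutput(attemptId)) {
            Files.move(file, outputDir.resolve(file.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
        Files.deleteIfExists(taskTempDir(attemptId));
    }
}
```

Writing into a per-attempt directory and moving files only at commit time means that an attempt which never commits leaves nothing in the final output location.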

abortTask

public abstract void abortTask(TaskAttemptContext taskContext)
                        throws IOException
Discard the task output.

Parameters:
taskContext - Context of the task whose output is being written.
Throws:
IOException
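
For file-based output, discarding a task's output can amount to deleting the attempt's temporary directory. The sketch below shows one way to do that with java.nio.file; TaskAbortSketch is a hypothetical helper, and passing the directory as a Path (rather than deriving it from a TaskAttemptContext) is an assumption made so the example runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of abortTask for file-based output; not Hadoop's code.
class TaskAbortSketch {
    // Recursively deletes the attempt's temporary directory, if it exists.
    static void abortTask(Path taskTempDir) throws IOException {
        if (!Files.exists(taskTempDir)) {
            return; // nothing was written, so there is nothing to discard
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(taskTempDir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

The early return makes the method safe to call more than once for the same attempt.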


Copyright © 2009 The Apache Software Foundation