java.lang.Object
  org.apache.hadoop.mapreduce.OutputCommitter
    org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
@InterfaceAudience.Public
@InterfaceStability.Stable
public class FileOutputCommitter extends OutputCommitter
An OutputCommitter that commits files to the job output directory, i.e. ${mapreduce.output.fileoutputformat.outputdir}.
Field Summary | |
---|---|
static String | PENDING_DIR_NAME: Name of directory where pending data is placed. |
static String | SUCCEEDED_FILE_NAME |
static String | SUCCESSFUL_JOB_OUTPUT_DIR_MARKER |
protected static String | TEMP_DIR_NAME: Deprecated. |
Constructor Summary | |
---|---|
FileOutputCommitter(Path outputPath, JobContext context) | Create a file output committer. |
FileOutputCommitter(Path outputPath, TaskAttemptContext context) | Create a file output committer. |
Method Summary | |
---|---|
void | abortJob(JobContext context, org.apache.hadoop.mapreduce.JobStatus.State state): Delete the temporary directory, including all of the work directories. |
void | abortTask(TaskAttemptContext context): Delete the work directory. |
void | cleanupJob(JobContext context): Deprecated. |
void | commitJob(JobContext context): The job has completed, so move all committed tasks to the final output dir. |
void | commitTask(TaskAttemptContext context): Move the files from the work directory to the job output directory. |
protected Path | getCommittedTaskPath(int appAttemptId, TaskAttemptContext context): Compute the path where the output of a committed task is stored until the entire job is committed for a specific application attempt. |
Path | getCommittedTaskPath(TaskAttemptContext context): Compute the path where the output of a committed task is stored until the entire job is committed. |
static Path | getCommittedTaskPath(TaskAttemptContext context, Path out) |
protected Path | getJobAttemptPath(int appAttemptId): Compute the path where the output of a given job attempt will be placed. |
Path | getJobAttemptPath(JobContext context): Compute the path where the output of a given job attempt will be placed. |
static Path | getJobAttemptPath(JobContext context, Path out): Compute the path where the output of a given job attempt will be placed. |
Path | getTaskAttemptPath(TaskAttemptContext context): Compute the path where the output of a task attempt is stored until that task is committed. |
static Path | getTaskAttemptPath(TaskAttemptContext context, Path out): Compute the path where the output of a task attempt is stored until that task is committed. |
Path | getWorkPath(): Get the directory that the task should write results into. |
boolean | isRecoverySupported(): Is task output recovery supported for restarting jobs? If task output recovery is supported, job restart can be done more efficiently. |
boolean | needsTaskCommit(TaskAttemptContext context): Did this task write any files in the work directory? |
void | recoverTask(TaskAttemptContext context): Recover the task output. |
void | setupJob(JobContext context): Create the temporary directory that is the root of all of the task work directories. |
void | setupTask(TaskAttemptContext context): No task setup required. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
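The lifecycle implied by the method summary above (setupJob, task writes, needsTaskCommit, commitTask or abortTask, commitJob) can be illustrated with a small local-filesystem model. This is only a sketch under stated assumptions, not the Hadoop implementation: it uses java.nio.file instead of Hadoop classes, the class CommitLifecycleSketch and its runDemo helper are hypothetical, and the directory name "_temporary" and marker name "_SUCCESS" are assumed values that this page does not state. In the sketch, commitTask promotes the work directory to a per-task committed location and commitJob later moves those files to the output directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem model of the commit lifecycle; no Hadoop classes are used.
class CommitLifecycleSketch {
    static final String PENDING = "_temporary"; // assumed pending directory name
    static final String SUCCESS = "_SUCCESS";   // assumed success marker name

    final Path out;        // final job output directory
    final Path jobAttempt; // staging root for one application attempt

    CommitLifecycleSketch(Path out, int appAttemptId) {
        this.out = out;
        this.jobAttempt = out.resolve(PENDING).resolve(Integer.toString(appAttemptId));
    }

    Path taskAttemptPath(String attemptId) { return jobAttempt.resolve(PENDING).resolve(attemptId); }
    Path committedTaskPath(String taskId) { return jobAttempt.resolve(taskId); }

    // setupJob: create the temporary root of all task work directories.
    void setupJob() throws IOException { Files.createDirectories(jobAttempt.resolve(PENDING)); }

    // needsTaskCommit: did the attempt create its work directory?
    boolean needsTaskCommit(String attemptId) { return Files.exists(taskAttemptPath(attemptId)); }

    // commitTask: promote the work directory to the committed-task location.
    void commitTask(String taskId, String attemptId) throws IOException {
        Files.move(taskAttemptPath(attemptId), committedTaskPath(taskId));
    }

    // abortTask: delete the work directory.
    void abortTask(String attemptId) throws IOException { deleteTree(taskAttemptPath(attemptId)); }

    // commitJob: move committed files to the output directory, clean up, write the marker.
    void commitJob() throws IOException {
        for (Path taskDir : list(jobAttempt)) {
            if (taskDir.getFileName().toString().equals(PENDING)) continue;
            for (Path f : list(taskDir)) Files.move(f, out.resolve(f.getFileName()));
        }
        deleteTree(out.resolve(PENDING));
        Files.createFile(out.resolve(SUCCESS));
    }

    static List<Path> list(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) { return s.collect(Collectors.toList()); }
    }

    // Deletes children before parents by visiting paths in reverse order.
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) return;
        List<Path> paths;
        try (Stream<Path> s = Files.walk(root)) {
            paths = s.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) Files.delete(p);
    }

    // Runs one job with an aborted attempt and a committed attempt; returns sorted output names.
    static List<String> runDemo() {
        try {
            Path out = Files.createTempDirectory("commit-sketch");
            CommitLifecycleSketch c = new CommitLifecycleSketch(out, 0);
            c.setupJob();
            Path failed = Files.createDirectories(c.taskAttemptPath("attempt_m_000000_0"));
            Files.write(failed.resolve("part-00000"), "partial".getBytes(StandardCharsets.UTF_8));
            c.abortTask("attempt_m_000000_0");
            Path work = Files.createDirectories(c.taskAttemptPath("attempt_m_000000_1"));
            Files.write(work.resolve("part-00000"), "done".getBytes(StandardCharsets.UTF_8));
            if (c.needsTaskCommit("attempt_m_000000_1")) {
                c.commitTask("task_m_000000", "attempt_m_000000_1");
            }
            c.commitJob();
            List<String> names = new ArrayList<>();
            for (Path p : list(out)) names.add(p.getFileName().toString());
            Collections.sort(names);
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) { System.out.println(runDemo()); }
}
```

The point of the staging step is that output from the aborted attempt never reaches the output directory; only files promoted by commitTask are visible after commitJob.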
Field Detail |
---|
public static final String PENDING_DIR_NAME
@Deprecated protected static final String TEMP_DIR_NAME
public static final String SUCCEEDED_FILE_NAME
public static final String SUCCESSFUL_JOB_OUTPUT_DIR_MARKER
Constructor Detail |
---|
public FileOutputCommitter(Path outputPath, TaskAttemptContext context) throws IOException
outputPath
- the job's output path, or null if you want the output
committer to act as a no-op.
context
- the task's context
IOException
@InterfaceAudience.Private public FileOutputCommitter(Path outputPath, JobContext context) throws IOException
outputPath
- the job's output path, or null if you want the output
committer to act as a no-op.
context
- the job's context
IOException
Method Detail |
---|
public Path getJobAttemptPath(JobContext context)
context
- the context of the job. This is used to get the
application attempt id.
public static Path getJobAttemptPath(JobContext context, Path out)
context
- the context of the job. This is used to get the
application attempt id.
out
- the output path to place these in.
protected Path getJobAttemptPath(int appAttemptId)
appAttemptId
- the ID of the application attempt for this job.
public Path getTaskAttemptPath(TaskAttemptContext context)
context
- the context of the task attempt.
public static Path getTaskAttemptPath(TaskAttemptContext context, Path out)
context
- the context of the task attempt.
out
- the output path to put things in.
public Path getCommittedTaskPath(TaskAttemptContext context)
context
- the context of the task attempt
public static Path getCommittedTaskPath(TaskAttemptContext context, Path out)
protected Path getCommittedTaskPath(int appAttemptId, TaskAttemptContext context)
appAttemptId
- the id of the application attempt to use
context
- the context of any task.
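The three path getters above describe a nested staging layout: a job-attempt directory, a work directory per task attempt, and a committed location per task. The following is a minimal sketch of such a layout using plain strings. It is an assumption-laden illustration, not the library's code: the class CommitPathLayoutSketch is hypothetical, the value "_temporary" for PENDING_DIR_NAME is assumed (this page does not state it), and the exact nesting is an assumption as well.

```java
// Hypothetical model of the pending-output layout; not the Hadoop implementation.
class CommitPathLayoutSketch {
    // Assumed value of PENDING_DIR_NAME; this page does not state it.
    static final String PENDING_DIR_NAME = "_temporary";

    // Where all output of one application attempt is staged.
    static String jobAttemptPath(String out, int appAttemptId) {
        return out + "/" + PENDING_DIR_NAME + "/" + appAttemptId;
    }

    // Where a running task attempt writes until that task is committed.
    static String taskAttemptPath(String out, int appAttemptId, String taskAttemptId) {
        return jobAttemptPath(out, appAttemptId) + "/" + PENDING_DIR_NAME + "/" + taskAttemptId;
    }

    // Where a committed task's output waits until the entire job is committed.
    static String committedTaskPath(String out, int appAttemptId, String taskId) {
        return jobAttemptPath(out, appAttemptId) + "/" + taskId;
    }

    public static void main(String[] args) {
        String out = "/data/out"; // illustrative output directory
        System.out.println(jobAttemptPath(out, 0));
        System.out.println(taskAttemptPath(out, 0, "attempt_0001_m_000000_0"));
        System.out.println(committedTaskPath(out, 0, "task_0001_m_000000"));
    }
}
```

Keying the committed location by task rather than by task attempt means at most one attempt's output per task survives to job commit.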
public Path getWorkPath() throws IOException
IOException
public void setupJob(JobContext context) throws IOException
setupJob
in class OutputCommitter
context
- the job's context
IOException
- if temporary output could not be created
public void commitJob(JobContext context) throws IOException
commitJob
in class OutputCommitter
context
- the job's context
IOException
@Deprecated public void cleanupJob(JobContext context) throws IOException
OutputCommitter
cleanupJob
in class OutputCommitter
context
- Context of the job whose output is being written.
IOException
public void abortJob(JobContext context, org.apache.hadoop.mapreduce.JobStatus.State state) throws IOException
abortJob
in class OutputCommitter
context
- the job's context
state
- final runstate of the job
IOException
public void setupTask(TaskAttemptContext context) throws IOException
setupTask
in class OutputCommitter
context
- Context of the task whose output is being written.
IOException
public void commitTask(TaskAttemptContext context) throws IOException
commitTask
in class OutputCommitter
context
- the task context
IOException
- if commit is not successful.
public void abortTask(TaskAttemptContext context) throws IOException
abortTask
in class OutputCommitter
IOException
public boolean needsTaskCommit(TaskAttemptContext context) throws IOException
needsTaskCommit
in class OutputCommitter
context
- the task's context
IOException
public boolean isRecoverySupported()
OutputCommitter
isRecoverySupported
in class OutputCommitter
true if task output recovery is supported, false otherwise
See also: OutputCommitter.recoverTask(TaskAttemptContext)
public void recoverTask(TaskAttemptContext context) throws IOException
Recover the task output. The retry count for the job is passed via the MRJobConfig.APPLICATION_ATTEMPT_ID key in JobContext.getConfiguration() for the OutputCommitter. This is called from the application master process, but it is called individually for each task.
If an exception is thrown, the task will be attempted again.
This may be called multiple times for the same task, but from different application attempts.
recoverTask
in class OutputCommitter
context
- Context of the task whose output is being recovered
IOException
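One way to picture task recovery is as carrying a task's already-committed output from the previous application attempt over to the current one, so the task does not have to run again. The sketch below models that idea on the local filesystem. It is an assumption, not a description of what FileOutputCommitter does internally: the class RecoverTaskSketch, its helpers, the "_temporary" directory name, and the choice to return false when nothing can be recovered are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of task recovery across application attempts; no Hadoop classes are used.
class RecoverTaskSketch {
    static final String PENDING = "_temporary"; // assumed pending directory name

    static Path committedTaskPath(Path out, int appAttemptId, String taskId) {
        return out.resolve(PENDING).resolve(Integer.toString(appAttemptId)).resolve(taskId);
    }

    // Moves the task's committed output from the previous attempt to the current attempt.
    // Returns true if output was carried over, false if there was nothing to recover.
    static boolean recoverTask(Path out, int appAttemptId, String taskId) throws IOException {
        if (appAttemptId <= 0) return false; // assumes ids start at 0, so no earlier attempt exists
        Path previous = committedTaskPath(out, appAttemptId - 1, taskId);
        if (!Files.isDirectory(previous)) return false;
        Path current = committedTaskPath(out, appAttemptId, taskId);
        Files.createDirectories(current.getParent());
        Files.move(previous, current);
        return true;
    }

    // Commits a task under attempt 1, then recovers it under attempt 2.
    static boolean runDemo() {
        try {
            Path out = Files.createTempDirectory("recover-sketch");
            Path previous = Files.createDirectories(committedTaskPath(out, 1, "task_m_000000"));
            Files.write(previous.resolve("part-00000"), "done".getBytes(StandardCharsets.UTF_8));
            boolean recovered = recoverTask(out, 2, "task_m_000000");
            Path current = committedTaskPath(out, 2, "task_m_000000");
            return recovered
                && Files.exists(current.resolve("part-00000"))
                && !Files.exists(previous);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) { System.out.println(runDemo()); }
}
```

Because the move only looks one attempt back, a task recovered in attempt 2 can be recovered again in attempt 3, which fits the note that recovery may be called multiple times for the same task from different application attempts.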