Package org.apache.hadoop.mapreduce
Class OutputFormat<K,V>
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
- Direct Known Subclasses:
DBOutputFormat, FileOutputFormat, FilterOutputFormat, NullOutputFormat
OutputFormat describes the output-specification for a
Map-Reduce job.
The Map-Reduce framework relies on the OutputFormat of the
job to:
- Validate the output-specification of the job, for example by checking that the output directory does not already exist.
- Provide the RecordWriter implementation to be used to write out the output files of the job. Output files are stored in a FileSystem.
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
abstract void | checkOutputSpecs(JobContext context) | Check for validity of the output-specification for the job.
abstract OutputCommitter | getOutputCommitter(TaskAttemptContext context) | Get the output committer for this output format.
abstract RecordWriter<K,V> | getRecordWriter(TaskAttemptContext context) | Get the RecordWriter for the given task.
-
Constructor Details
-
OutputFormat
public OutputFormat()
-
-
Method Details
-
getRecordWriter
public abstract RecordWriter<K,V> getRecordWriter(TaskAttemptContext context) throws IOException, InterruptedException
Get the RecordWriter for the given task.
- Parameters:
context - the information about the current task.
- Returns:
a RecordWriter to write the output for the job.
- Throws:
IOException
InterruptedException
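The role of getRecordWriter, handing each task an object that writes its key/value pairs, can be sketched without Hadoop on the classpath. Everything in the block below (SimpleRecordWriter, TabSeparatedWriter, SimpleFormat, recordWriterFor) is a hypothetical, simplified stand-in and not part of the Hadoop API. The real RecordWriter also declares InterruptedException on write and takes a TaskAttemptContext on close. The tab-separated layout is an assumption chosen only for illustration.

```java
import java.io.IOException;
import java.io.Writer;

/** Hypothetical, simplified stand-in for the RecordWriter role; not the Hadoop class. */
interface SimpleRecordWriter<K, V> {
    void write(K key, V value) throws IOException;

    void close() throws IOException;
}

/** Writes each pair as key, tab, value, newline to the supplied Writer. */
final class TabSeparatedWriter<K, V> implements SimpleRecordWriter<K, V> {
    private final Writer out;

    TabSeparatedWriter(Writer out) {
        this.out = out;
    }

    @Override
    public void write(K key, V value) throws IOException {
        out.write(String.valueOf(key));
        out.write('\t');
        out.write(String.valueOf(value));
        out.write('\n');
    }

    @Override
    public void close() throws IOException {
        // Closing the writer releases the underlying task output.
        out.close();
    }
}

/** Hypothetical stand-in for the factory role of getRecordWriter: one writer per task. */
final class SimpleFormat {
    private SimpleFormat() {}

    static <K, V> SimpleRecordWriter<K, V> recordWriterFor(Writer taskOutput) {
        return new TabSeparatedWriter<>(taskOutput);
    }
}
```

A caller would obtain one writer per task, call write for each output pair, and call close once when the task finishes.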
-
checkOutputSpecs
public abstract void checkOutputSpecs(JobContext context) throws IOException, InterruptedException
Check for validity of the output-specification for the job.
This validates the output specification when the job is submitted. Implementations typically check that the output does not already exist and throw an exception if it does, so that existing output is not overwritten.
Implementations which write to filesystems which support delegation tokens usually collect the tokens for the destination path(s) and attach them to the job context's JobConf.
- Parameters:
context - information about the job
- Throws:
IOException - when output should not be attempted
InterruptedException
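The existence check described for checkOutputSpecs can be illustrated with a minimal sketch. It uses java.nio.file as a stand-in for the Hadoop FileSystem so that it runs on the JDK alone. The class OutputSpecCheck and the method checkOutputDoesNotExist are hypothetical names, and this is not how any particular Hadoop subclass is implemented. Delegation-token collection is not modeled.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for an output-specification check; not part of Hadoop. */
final class OutputSpecCheck {
    private OutputSpecCheck() {}

    /**
     * Fails fast at submission time if the output location is unset or
     * already present, so that existing output is never overwritten.
     */
    static void checkOutputDoesNotExist(Path outputDir) throws IOException {
        if (outputDir == null) {
            throw new IOException("Output directory not set");
        }
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory " + outputDir + " already exists");
        }
    }
}
```

Running the check before any task starts means a misconfigured job is rejected at submission rather than after work has been done.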
-
getOutputCommitter
public abstract OutputCommitter getOutputCommitter(TaskAttemptContext context) throws IOException, InterruptedException
Get the output committer for this output format. This is responsible for ensuring the output is committed correctly.
- Parameters:
context - the task context
- Returns:
an output committer
- Throws:
IOException
InterruptedException
-