Class LazyOutputFormat<K,V>
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.FilterOutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat<K,V>
A convenience class that creates output lazily, that is, only once a record is actually written.
Use in conjunction with org.apache.hadoop.mapreduce.lib.output.MultipleOutputs to recreate the
behaviour of org.apache.hadoop.mapred.lib.MultipleTextOutputFormat (etc.) of the old Hadoop API.
See the MultipleOutputs documentation for more information.
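The pairing with MultipleOutputs described above can be set up in a job driver roughly as follows. This is a minimal configuration sketch, assuming the Hadoop MapReduce client libraries are on the classpath. The class name LazyOutputDriverSketch, the job name, the named output "text", and the Text/IntWritable types are illustrative assumptions and do not come from this page.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/** Hypothetical driver helper; only the Hadoop calls themselves are real API. */
public class LazyOutputDriverSketch {

    public static Job configure(Configuration conf, Path outputDir) throws IOException {
        Job job = Job.getInstance(conf, "lazy-output-sketch"); // job name is an assumption

        // Wrap the real output format so the default output is created lazily.
        // Use this in place of job.setOutputFormatClass(TextOutputFormat.class).
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

        // Register a named output for MultipleOutputs; name and types are illustrative.
        MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
                Text.class, IntWritable.class);

        FileOutputFormat.setOutputPath(job, outputDir);
        return job;
    }
}
```

With a setup along these lines, tasks that write only through MultipleOutputs should not leave empty default part files behind, since the default output is created only on the first write.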
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FilterOutputFormat
org.apache.hadoop.mapreduce.lib.output.FilterOutputFormat.FilterRecordWriter<K,V>
Field Summary
Fields
OUTPUT_FORMAT
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FilterOutputFormat
baseOut
Constructor Summary
Constructors
LazyOutputFormat()
Method Summary
Modifier and Type, Method, Description
void checkOutputSpecs(JobContext context): Check for validity of the output-specification for the job.
OutputCommitter getOutputCommitter(TaskAttemptContext context): Get the output committer for this output format.
RecordWriter<K,V> getRecordWriter(TaskAttemptContext context): Get the RecordWriter for the given task.
static void setOutputFormatClass(Job job, Class<? extends OutputFormat> theClass): Set the underlying output format for LazyOutputFormat.
Field Details
OUTPUT_FORMAT
Constructor Details
LazyOutputFormat
public LazyOutputFormat()
Method Details
setOutputFormatClass
public static void setOutputFormatClass(Job job, Class<? extends OutputFormat> theClass)
Set the underlying output format for LazyOutputFormat.
Parameters:
job - the Job to modify
theClass - the underlying class
getRecordWriter
public RecordWriter<K,V> getRecordWriter(TaskAttemptContext context) throws IOException, InterruptedException
Description copied from class: OutputFormat
Get the RecordWriter for the given task.
Overrides:
getRecordWriter in class FilterOutputFormat<K,V>
Parameters:
context - the information about the current task.
Returns:
a RecordWriter to write the output for the job.
Throws:
IOException
InterruptedException
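The idea of handing out a writer that creates its real output lazily can be illustrated without Hadoop. The following is a simplified, self-contained model of the pattern, not the Hadoop implementation: every name in it (LazyWriterSketch, SimpleWriter, InMemoryWriter, LazyWriter) is hypothetical, and an in-memory list stands in for a real output file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Simplified, Hadoop-free model of lazy output creation; all names are hypothetical. */
public class LazyWriterSketch {

    /** Minimal stand-in for a record writer. */
    interface SimpleWriter<K, V> {
        void write(K key, V value);
        void close();
    }

    /** Stand-in for a real writer; constructing one models "creating the output". */
    static class InMemoryWriter<K, V> implements SimpleWriter<K, V> {
        final List<String> lines = new ArrayList<>();
        boolean closed = false;

        @Override
        public void write(K key, V value) {
            lines.add(key + "\t" + value);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Wrapper that defers creating the underlying writer until the first write. */
    static class LazyWriter<K, V> implements SimpleWriter<K, V> {
        private final Supplier<? extends SimpleWriter<K, V>> factory;
        private SimpleWriter<K, V> delegate; // stays null until the first write

        LazyWriter(Supplier<? extends SimpleWriter<K, V>> factory) {
            this.factory = factory;
        }

        boolean isOpened() {
            return delegate != null;
        }

        @Override
        public void write(K key, V value) {
            if (delegate == null) {
                delegate = factory.get(); // the output is created here, on demand
            }
            delegate.write(key, value);
        }

        @Override
        public void close() {
            // Nothing to close if no record was ever written.
            if (delegate != null) {
                delegate.close();
            }
        }
    }

    public static void main(String[] args) {
        List<InMemoryWriter<String, Integer>> created = new ArrayList<>();
        Supplier<InMemoryWriter<String, Integer>> factory = () -> {
            InMemoryWriter<String, Integer> w = new InMemoryWriter<>();
            created.add(w);
            return w;
        };

        // A "task" that writes nothing never creates an output.
        LazyWriter<String, Integer> idle = new LazyWriter<>(factory);
        idle.close();
        System.out.println("after idle task: " + created.size() + " output(s) created");

        // A "task" that writes records creates exactly one output.
        LazyWriter<String, Integer> busy = new LazyWriter<>(factory);
        busy.write("a", 1);
        busy.write("b", 2);
        busy.close();
        System.out.println("after busy task: " + created.size() + " output(s) created");
    }
}
```

In this model the idle writer never invokes the factory, so no output exists for it, while the busy writer triggers exactly one creation on its first write.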
checkOutputSpecs
public void checkOutputSpecs(JobContext context) throws IOException, InterruptedException
Description copied from class: OutputFormat
Check for validity of the output-specification for the job.
This validates the output specification when the job is submitted. Typically it checks that the output does not already exist and throws an exception if it does, so that existing output is not overwritten.
Implementations which write to filesystems which support delegation tokens usually collect the tokens for the destination path(s) and attach them to the job context's JobConf.
Overrides:
checkOutputSpecs in class FilterOutputFormat<K,V>
Parameters:
context - information about the job
Throws:
IOException - when output should not be attempted
InterruptedException
getOutputCommitter
public OutputCommitter getOutputCommitter(TaskAttemptContext context) throws IOException, InterruptedException
Description copied from class: OutputFormat
Get the output committer for this output format. This is responsible for ensuring the output is committed correctly.
Overrides:
getOutputCommitter in class FilterOutputFormat<K,V>
Parameters:
context - the task context
Returns:
an output committer
Throws:
IOException
InterruptedException