Class DBOutputFormat<K extends DBWritable,V>
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.db.DBOutputFormat<K,V>
- Direct Known Subclasses:
org.apache.hadoop.mapred.lib.db.DBOutputFormat
An OutputFormat that sends the reduce output to a SQL table.
DBOutputFormat accepts <key,value> pairs, where the
key has a type extending DBWritable. The returned RecordWriter
writes only the key to the database, using a batch SQL query.
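The key-only, batched write described above can be pictured with a small standalone sketch. It does not use Hadoop: StatementWritable is a hypothetical local stand-in for the write side of DBWritable, EmployeeRow is a made-up key type, and the PreparedStatement is a recording proxy rather than a real database connection. It is an illustration of the pattern under those assumptions, not the actual DBRecordWriter code.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeyOnlyBatchSketch {
    /** Hypothetical local stand-in for the write side of DBWritable. */
    interface StatementWritable {
        void write(PreparedStatement statement) throws SQLException;
    }

    /** Hypothetical key type: one row of a two-column table. */
    static final class EmployeeRow implements StatementWritable {
        private final int id;
        private final String name;

        EmployeeRow(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public void write(PreparedStatement statement) throws SQLException {
            // Bind this row's fields to the placeholders of the insert statement.
            statement.setInt(1, id);
            statement.setString(2, name);
        }
    }

    /** Returns a PreparedStatement that only records the calls made on it. */
    static PreparedStatement recordingStatement(List<String> calls) {
        return (PreparedStatement) Proxy.newProxyInstance(
                KeyOnlyBatchSketch.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, args) -> {
                    calls.add(method.getName() + (args == null ? "[]" : Arrays.toString(args)));
                    // Only void methods and executeBatch() are used by this sketch.
                    return method.getReturnType() == int[].class ? new int[0] : null;
                });
    }

    /** Binds each key, queues it as a batch entry, then runs the batch once. */
    static <K extends StatementWritable> void writeAll(
            PreparedStatement statement, List<K> keys) throws SQLException {
        for (K key : keys) {
            key.write(statement); // the value of the <key,value> pair is never consulted
            statement.addBatch();
        }
        statement.executeBatch();
    }

    public static void main(String[] args) throws SQLException {
        List<String> calls = new ArrayList<>();
        writeAll(recordingStatement(calls),
                Arrays.asList(new EmployeeRow(1, "Ada"), new EmployeeRow(2, "Grace")));
        calls.forEach(System.out::println);
    }
}
```

Running main prints the recorded calls in order: the two bind calls and addBatch for each key, followed by a single executeBatch. In a real job the statement would come from a JDBC connection and the key class would implement DBWritable itself.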
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
class | org.apache.hadoop.mapreduce.lib.db.DBOutputFormat.DBRecordWriter | A RecordWriter that writes the reduce output to a SQL table
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
void | checkOutputSpecs(JobContext context) | Check for validity of the output-specification for the job.
String | constructQuery(String table, String[] fieldNames) | Constructs the query used as the prepared statement to insert data.
OutputCommitter | getOutputCommitter(TaskAttemptContext context) | Get the output committer for this output format.
RecordWriter<K,V> | getRecordWriter(TaskAttemptContext context) | Get the RecordWriter for the given task.
static void | setOutput(Job job, String tableName, String... fieldNames) | Initializes the reduce-part of the job with the appropriate output settings.
static void | setOutput(Job job, String tableName, int fieldCount) | Initializes the reduce-part of the job with the appropriate output settings.
-
Field Details
-
dbProductName
-
-
Constructor Details
-
DBOutputFormat
public DBOutputFormat()
-
-
Method Details
-
checkOutputSpecs
Description copied from class: OutputFormat
Check for validity of the output-specification for the job.
This validates the output specification when the job is submitted. Typically it checks that the output does not already exist and throws an exception if it does, so that output is not overwritten.
Implementations which write to filesystems that support delegation tokens usually collect the tokens for the destination path(s) and attach them to the job context's JobConf.
- Specified by:
checkOutputSpecs in class OutputFormat<K extends DBWritable,V>
- Parameters:
context - information about the job
- Throws:
IOException - when output should not be attempted
InterruptedException
-
getOutputCommitter
public OutputCommitter getOutputCommitter(TaskAttemptContext context) throws IOException, InterruptedException
Description copied from class: OutputFormat
Get the output committer for this output format. This is responsible for ensuring the output is committed correctly.
- Specified by:
getOutputCommitter in class OutputFormat<K extends DBWritable,V>
- Parameters:
context - the task context
- Returns:
an output committer
- Throws:
IOException
InterruptedException
-
constructQuery
Constructs the query used as the prepared statement to insert data.
- Parameters:
table - the table to insert into
fieldNames - the fields to insert into. If field names are unknown, supply an array of nulls.
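As a rough illustration of the behaviour described for constructQuery, the following standalone sketch builds an INSERT statement with one placeholder per field and leaves out the column list when the field names are null. The class InsertQuerySketch and the method buildInsert are hypothetical names; the exact string the real method returns (for example any statement terminator) may differ.

```java
import java.util.StringJoiner;

class InsertQuerySketch {
    /**
     * Builds "INSERT INTO table (f1,f2) VALUES (?,?)".
     * When the first field name is null the names are treated as unknown
     * and the column list is left out; the placeholder count still
     * follows the array length.
     */
    static String buildInsert(String table, String[] fieldNames) {
        if (fieldNames == null) {
            throw new IllegalArgumentException("fieldNames may not be null");
        }
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table);

        // Emit the column list only when the names are known.
        if (fieldNames.length > 0 && fieldNames[0] != null) {
            StringJoiner columns = new StringJoiner(",", " (", ")");
            for (String name : fieldNames) {
                columns.add(name);
            }
            query.append(columns);
        }

        // One "?" placeholder per field, bound later through the prepared statement.
        StringJoiner placeholders = new StringJoiner(",", " VALUES (", ")");
        for (int i = 0; i < fieldNames.length; i++) {
            placeholders.add("?");
        }
        query.append(placeholders);
        return query.toString();
    }

    public static void main(String[] args) {
        // prints: INSERT INTO employees (id,name) VALUES (?,?)
        System.out.println(buildInsert("employees", new String[] {"id", "name"}));
        // prints: INSERT INTO employees VALUES (?,?)
        System.out.println(buildInsert("employees", new String[2]));
    }
}
```

The second call shows the "array of nulls" case: only the array length is used, which is also what a field-count style of configuration would supply.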
-
getRecordWriter
Get the RecordWriter for the given task.
- Specified by:
getRecordWriter in class OutputFormat<K extends DBWritable,V>
- Parameters:
context - the information about the current task.
- Returns:
a RecordWriter to write the output for the job.
- Throws:
IOException
-
setOutput
Initializes the reduce-part of the job with the appropriate output settings.
- Parameters:
job - The job
tableName - The table to insert data into
fieldNames - The field names in the table.
- Throws:
IOException
-
setOutput
Initializes the reduce-part of the job with the appropriate output settings.
- Parameters:
job - The job
tableName - The table to insert data into
fieldCount - the number of fields in the table.
- Throws:
IOException
-