DBInputFormat (Apache Hadoop Main 3.3.1 API)

java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<LongWritable,T>
- - org.apache.hadoop.mapreduce.lib.db.DBInputFormat<T>

All Implemented Interfaces:

Configurable

Direct Known Subclasses:

DataDrivenDBInputFormat, org.apache.hadoop.mapred.lib.db.DBInputFormat
```
@InterfaceAudience.Public
@InterfaceStability.Stable
public class DBInputFormat<T extends DBWritable>
extends InputFormat<LongWritable,T>
implements Configurable
```
An InputFormat that reads input data from an SQL table.
DBInputFormat emits LongWritables containing the record number as key and DBWritables as value. The SQL query and input class can be specified using one of the two setInput methods.
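
The value type T must implement DBWritable. As a rough sketch of what such a value class looks like, the two instance methods below mirror the `readFields(ResultSet)` and `write(PreparedStatement)` methods of DBWritable. `EmployeeRecord`, its two columns, and the `singleRow` and `read` helpers are hypothetical names chosen for illustration. The class deliberately avoids Hadoop types so that it compiles with only the JDK; in a real job it would declare `implements DBWritable`.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical value class; a real one would implement
// org.apache.hadoop.mapreduce.lib.db.DBWritable.
final class EmployeeRecord {
    long id;
    String name;

    // Same shape as DBWritable.readFields: copy the current row into fields.
    public void readFields(ResultSet rs) throws SQLException {
        id = rs.getLong(1);
        name = rs.getString(2);
    }

    // Same shape as DBWritable.write: bind fields to statement parameters.
    public void write(PreparedStatement ps) throws SQLException {
        ps.setLong(1, id);
        ps.setString(2, name);
    }

    // Stub ResultSet exposing one fixed row, so the sketch runs without a database.
    static ResultSet singleRow(long id, String name) {
        return (ResultSet) Proxy.newProxyInstance(
            EmployeeRecord.class.getClassLoader(),
            new Class<?>[] {ResultSet.class},
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "getLong": return id;
                    case "getString": return name;
                    default: throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    // Demo convenience: read one record, converting the checked exception.
    static EmployeeRecord read(ResultSet rs) {
        EmployeeRecord r = new EmployeeRecord();
        try {
            r.readFields(rs);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        EmployeeRecord r = read(singleRow(7L, "Ada"));
        System.out.println(r.id + " " + r.name);
    }
}
```

The column order read in `readFields` has to match the order of the selected fields, whichever setInput variant supplies them.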

Field Summary

Fields
Modifier and Type	Field and Description
`protected String`	`conditions`
`protected Connection`	`connection`
`protected DBConfiguration`	`dbConf`
`protected String`	`dbProductName`
`protected String[]`	`fieldNames`
`protected String`	`tableName`

Constructor Summary

Constructors
Constructor and Description

`DBInputFormat()`

Method Summary

Modifier and Type	Method and Description
`protected void`	`closeConnection()`
`Connection`	`createConnection()`
`protected RecordReader<LongWritable,T>`	`createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split, Configuration conf)`
`RecordReader<LongWritable,T>`	`createRecordReader(InputSplit split, TaskAttemptContext context)` Create a record reader for a given split.
`Configuration`	`getConf()` Return the configuration used by this object.
`Connection`	`getConnection()`
`protected String`	`getCountQuery()` Returns the query for getting the total number of rows. Subclasses can override this for custom behaviour.
`DBConfiguration`	`getDBConf()`
`String`	`getDBProductName()`
`List<InputSplit>`	`getSplits(JobContext job)` Logically split the set of input files for the job.
`void`	`setConf(Configuration conf)` Set the configuration to be used by this object.
`static void`	`setInput(Job job, Class<? extends DBWritable> inputClass, String inputQuery, String inputCountQuery)` Initializes the map-part of the job with the appropriate input settings.
`static void`	`setInput(Job job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String orderBy, String... fieldNames)` Initializes the map-part of the job with the appropriate input settings.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - dbProductName
```
protected String dbProductName
```
  - conditions
```
protected String conditions
```
  - connection
```
protected Connection connection
```
  - tableName
```
protected String tableName
```
  - fieldNames
```
protected String[] fieldNames
```
  - dbConf
```
protected DBConfiguration dbConf
```
- Constructor Detail
  - DBInputFormat
```
public DBInputFormat()
```
- Method Detail
  - setConf
```
public void setConf(Configuration conf)
```
    Set the configuration to be used by this object.
    
    Specified by:
    
    setConf in interface Configurable
    
    Parameters:
    
    conf - configuration to be used
  - getConf
```
public Configuration getConf()
```
    Description copied from interface: Configurable
    
    Return the configuration used by this object.
    
    Specified by:
    
    getConf in interface Configurable
    
    Returns:
    
    Configuration
  - getDBConf
```
public DBConfiguration getDBConf()
```
  - getConnection
```
public Connection getConnection()
```
  - createConnection
```
public Connection createConnection()
```
  - getDBProductName
```
public String getDBProductName()
```
  - createDBRecordReader
```
protected RecordReader<LongWritable,T> createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split,
                                                            Configuration conf)
                                                     throws IOException
```
    Throws:
    
    IOException
  - createRecordReader
```
public RecordReader<LongWritable,T> createRecordReader(InputSplit split,
                                                       TaskAttemptContext context)
                                                throws IOException,
                                                       InterruptedException
```
    Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.
    
    Specified by:
    
    createRecordReader in class InputFormat<LongWritable,T extends DBWritable>
    
    Parameters:
    
    split - the split to be read
    
    context - the information about the task
    
    Returns:
    
    a new record reader
    
    Throws:
    
    IOException
    
    InterruptedException
  - getSplits
```
public List<InputSplit> getSplits(JobContext job)
                           throws IOException
```
    Logically split the set of input files for the job.
    Each InputSplit is then assigned to an individual Mapper for processing.
    
    Note: The split is a logical split of the inputs; the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple. The InputFormat also creates the RecordReader to read the InputSplit.
    
    Specified by:
    
    getSplits in class InputFormat<LongWritable,T extends DBWritable>
    
    Parameters:
    
    job - job configuration.
    
    Returns:
    
    a list of InputSplits for the job.
    
    Throws:
    
    IOException
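
    For a database-backed input the natural unit of a split is a range of rows rather than a byte range of a file. The following is a minimal sketch, assuming the total row count (as returned by the count query) is divided evenly among a requested number of chunks and the last chunk absorbs the remainder. `RowRangeSplits` and `split` are hypothetical names, not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating row-range splitting; not a Hadoop class.
final class RowRangeSplits {
    // Returns {start, end} pairs (end exclusive) that together cover rows [0, count).
    static List<long[]> split(long count, int chunks) {
        if (count < 0 || chunks < 1) {
            throw new IllegalArgumentException("count must be >= 0 and chunks >= 1");
        }
        long chunkSize = count / chunks;
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            long start = i * chunkSize;
            // The last chunk runs to the end so that no rows are dropped.
            long end = (i + 1 == chunks) ? count : start + chunkSize;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(10, 3)) {
            System.out.println("rows " + r[0] + " to " + r[1] + " (length " + (r[1] - r[0]) + ")");
        }
    }
}
```

    Under this scheme an uneven row count makes the final split slightly larger than the others.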
  - getCountQuery
```
protected String getCountQuery()
```
    Returns the query for getting the total number of rows. Subclasses can override this for custom behaviour.
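
    As an illustration of the kind of query such a method might produce from the `tableName` and `conditions` fields, here is a small sketch. The exact SQL that Hadoop generates is not stated on this page, so the query shape is an assumption, and `CountQuerySketch` and `countQuery` are hypothetical names.

```java
// Hypothetical helper; shows one plausible count query, not Hadoop's own code.
final class CountQuerySketch {
    // Builds a row-count query, adding a WHERE clause only when conditions are given.
    static String countQuery(String tableName, String conditions) {
        StringBuilder query = new StringBuilder("SELECT COUNT(*) FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE ").append(conditions);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(countQuery("Mytable", null));
        System.out.println(countQuery("Mytable", "length > 0"));
    }
}
```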
  - setInput
```
public static void setInput(Job job,
                            Class<? extends DBWritable> inputClass,
                            String tableName,
                            String conditions,
                            String orderBy,
                            String... fieldNames)
```
    Initializes the map-part of the job with the appropriate input settings.
    
    Parameters:
    
    job - The map-reduce job
    
    inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
    
    tableName - The table to read data from
    
    conditions - The condition to select data with, e.g. '(updated > 20070101 AND length > 0)'
    
    orderBy - The field names to use in the ORDER BY clause.
    
    fieldNames - The field names in the table
    
    See Also:
    
    setInput(Job, Class, String, String)
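
    To show how the `tableName`, `conditions`, `orderBy` and `fieldNames` parameters relate to one another, the sketch below assembles them into a single SELECT statement. This is an assumption about the general shape of the query, not the SQL that Hadoop emits; `SelectQuerySketch` and `selectQuery` are hypothetical names, and the fallback to `*` for an empty field list is a choice made for this sketch only.

```java
// Hypothetical helper; combines the table-based setInput parameters into one query.
final class SelectQuerySketch {
    static String selectQuery(String tableName, String conditions,
                              String orderBy, String... fieldNames) {
        StringBuilder query = new StringBuilder("SELECT ");
        // Sketch-only choice: select every column when no field names are given.
        query.append(fieldNames.length == 0 ? "*" : String.join(", ", fieldNames));
        query.append(" FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE (").append(conditions).append(")");
        }
        if (orderBy != null && !orderBy.isEmpty()) {
            query.append(" ORDER BY ").append(orderBy);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectQuery(
            "Mytable", "updated > 20070101 AND length > 0", "f1", "f1", "f2", "f3"));
    }
}
```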
  - setInput
```
public static void setInput(Job job,
                            Class<? extends DBWritable> inputClass,
                            String inputQuery,
                            String inputCountQuery)
```
    Initializes the map-part of the job with the appropriate input settings.
    
    Parameters:
    
    job - The map-reduce job
    
    inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
    
    inputQuery - the input query to select fields. Example : "SELECT f1, f2, f3 FROM Mytable ORDER BY f1"
    
    inputCountQuery - the input query that returns the number of records in the table. Example : "SELECT COUNT(f1) FROM Mytable"
    
    See Also:
    
    setInput(Job, Class, String, String, String, String...)
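
    One way a per-split query could be derived from `inputQuery` is to append a row window to it. The sketch below assumes a database that supports `LIMIT ... OFFSET ...` (for example MySQL or PostgreSQL); `PagedQuerySketch` and `pagedQuery` are hypothetical names. Under this assumption the ORDER BY in the example query matters, because paging through an unordered result can return overlapping or missing rows.

```java
// Hypothetical helper; restricts a user-supplied query to one split's row window.
final class PagedQuerySketch {
    // start is the zero-based first row of the split; length is its row count.
    static String pagedQuery(String inputQuery, long start, long length) {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        return inputQuery + " LIMIT " + length + " OFFSET " + start;
    }

    public static void main(String[] args) {
        System.out.println(pagedQuery("SELECT f1, f2, f3 FROM Mytable ORDER BY f1", 6, 4));
    }
}
```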
  - closeConnection
```
protected void closeConnection()
```
