@InterfaceAudience.Public
@InterfaceStability.Evolving
public class DataDrivenDBInputFormat<T extends DBWritable>
extends DBInputFormat<T>
implements Configurable
| Modifier and Type | Field and Description |
|---|---|
| `static String` | `SUBSTITUTE_TOKEN`: If users are providing their own query, this string is expected to appear in the WHERE clause, where it will be substituted with a pair of conditions on the input to allow input splits to parallelise the import. |
Fields inherited from class org.apache.hadoop.mapreduce.lib.db.DBInputFormat: `conditions`, `connection`, `dbConf`, `dbProductName`, `fieldNames`, `tableName`

| Constructor and Description |
|---|
| `DataDrivenDBInputFormat()` |
| Modifier and Type | Method and Description |
|---|---|
| `protected RecordReader<LongWritable,T>` | `createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split, Configuration conf)` |
| `protected String` | `getBoundingValsQuery()` |
| `List<InputSplit>` | `getSplits(JobContext job)`: Logically split the set of input files for the job. |
| `protected DBSplitter` | `getSplitter(int sqlDataType)` |
| `static void` | `setBoundingQuery(Configuration conf, String query)`: Set the user-defined bounding query to use with a user-defined query. |
| `static void` | `setInput(Job job, Class<? extends DBWritable> inputClass, String inputQuery, String inputBoundingQuery)`: setInput() takes a custom query and a separate "bounding query" to use instead of the custom "count query" used by DBInputFormat. |
| `static void` | `setInput(Job job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String splitBy, String... fieldNames)`: Note that the "orderBy" column is called the "splitBy" in this version. |
Methods inherited from class org.apache.hadoop.mapreduce.lib.db.DBInputFormat: `closeConnection`, `createConnection`, `createRecordReader`, `getConf`, `getConnection`, `getCountQuery`, `getDBConf`, `getDBProductName`, `setConf`

Methods inherited from class java.lang.Object: `clone`, `equals`, `finalize`, `getClass`, `hashCode`, `notify`, `notifyAll`, `toString`, `wait`, `wait`, `wait`

Methods inherited from interface org.apache.hadoop.conf.Configurable: `getConf`, `setConf`

public static final String SUBSTITUTE_TOKEN
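To illustrate the token's role, here is a minimal sketch of a user-supplied query that embeds it; the constant's value is the string "$CONDITIONS", and the table and column names below are hypothetical:

```java
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

// Hypothetical free-form query over an "employees" table. At run time the
// framework replaces the token with a per-split range predicate on the
// split column (e.g. "id >= 0 AND id < 1000"), so each map task reads a
// disjoint slice of the result set.
String inputQuery = "SELECT id, name, salary FROM employees WHERE "
    + DataDrivenDBInputFormat.SUBSTITUTE_TOKEN;
```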
protected DBSplitter getSplitter(int sqlDataType)
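As a sketch of how a subclass might extend the type-to-splitter mapping, assuming (as in the Hadoop source) that the default implementation returns null for SQL types it does not recognize, the override below falls back to the library's TextSplitter instead:

```java
import org.apache.hadoop.mapreduce.lib.db.DBSplitter;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.TextSplitter;

// Hypothetical subclass: when the parent mapping yields no splitter for a
// SQL type, fall back to splitting the column's values as text.
public class LenientDBInputFormat<T extends DBWritable>
    extends DataDrivenDBInputFormat<T> {
  @Override
  protected DBSplitter getSplitter(int sqlDataType) {
    DBSplitter splitter = super.getSplitter(sqlDataType);
    return (splitter != null) ? splitter : new TextSplitter();
  }
}
```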
public List<InputSplit> getSplits(JobContext job) throws IOException
Logically split the set of input files for the job. Each InputSplit is then assigned to an individual Mapper for processing.

Note: The split is a logical split of the inputs and the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple. The InputFormat also creates the RecordReader to read the InputSplit.
Overrides: `getSplits` in class `DBInputFormat<T extends DBWritable>`
Parameters: `job` - job configuration.
Returns: InputSplits for the job.
Throws: `IOException`

protected String getBoundingValsQuery()
public static void setBoundingQuery(Configuration conf, String query)
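A minimal sketch of setting a bounding query on a Configuration; the query is expected to return a single row whose two columns are the lowest and highest values of the split column, from which the split ranges are computed (table and column names are hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

Configuration conf = new Configuration();
// Hypothetical bounding query over the same table as the input query;
// MIN(id) and MAX(id) bound the ranges handed to each input split.
DataDrivenDBInputFormat.setBoundingQuery(conf,
    "SELECT MIN(id), MAX(id) FROM employees");
```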
protected RecordReader<LongWritable,T> createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split, Configuration conf) throws IOException
Overrides: `createDBRecordReader` in class `DBInputFormat<T extends DBWritable>`
Throws: `IOException`

public static void setInput(Job job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String splitBy, String... fieldNames)

Note that the "orderBy" column is called the "splitBy" in this version.
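A sketch of the table-based overload, assuming the connection has been configured via DBConfiguration.configureDB; MyRecord stands in for a user-supplied DBWritable implementation, and the table, condition, and column names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

Job job = Job.getInstance(new Configuration(), "employee-import");
// Hypothetical JDBC driver and connection string.
DBConfiguration.configureDB(job.getConfiguration(),
    "com.mysql.jdbc.Driver", "jdbc:mysql://dbhost/payroll");

DataDrivenDBInputFormat.setInput(job, MyRecord.class,
    "employees",              // tableName to read from
    "salary > 0",             // conditions added to the WHERE clause
    "id",                     // splitBy: column used to partition splits
    "id", "name", "salary");  // fieldNames to select
```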
public static void setInput(Job job, Class<? extends DBWritable> inputClass, String inputQuery, String inputBoundingQuery)
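And a sketch of the query-based overload, pairing a free-form input query (which must contain SUBSTITUTE_TOKEN) with the bounding query that stands in for DBInputFormat's count query; the hypothetical names and JDBC setup from the previous sketch are assumed:

```java
// Free-form query; the token marks where per-split predicates are injected.
String inputQuery = "SELECT id, name FROM employees WHERE "
    + DataDrivenDBInputFormat.SUBSTITUTE_TOKEN;

// Bounding query returning the low and high values of the split column.
String boundingQuery = "SELECT MIN(id), MAX(id) FROM employees";

DataDrivenDBInputFormat.setInput(job, MyRecord.class,
    inputQuery, boundingQuery);
```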
Copyright © 2017 Apache Software Foundation. All rights reserved.