java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<LongWritable,T>
    org.apache.hadoop.mapreduce.lib.db.DBInputFormat<T>
      org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat<T>
@InterfaceAudience.Public @InterfaceStability.Evolving public class DataDrivenDBInputFormat<T extends DBWritable>
An InputFormat that reads input data from an SQL table. It operates like DBInputFormat, but instead of using LIMIT and OFFSET to demarcate splits, it tries to generate WHERE clauses that separate the data into roughly equivalent shards.
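The idea of demarcating splits with WHERE clauses can be illustrated with a minimal, self-contained sketch. It is not the code used by this class: `RangeSplitSketch` and `whereClauses` are hypothetical names, the split column is assumed to be an integer column, and overflow near the limits of `long` is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Hadoop API.
class RangeSplitSketch {

    /**
     * Divides the inclusive range [min, max] into at most numSplits contiguous
     * ranges and returns one WHERE-clause fragment per range.
     */
    public static List<String> whereClauses(String column, long min, long max, int numSplits) {
        if (numSplits < 1 || max < min) {
            throw new IllegalArgumentException("need numSplits >= 1 and min <= max");
        }
        List<String> clauses = new ArrayList<>();
        long span = max - min + 1;                                   // number of values covered
        long step = Math.max(1, (span + numSplits - 1) / numSplits); // ceiling division
        for (long lo = min; lo <= max; lo += step) {
            long hi = lo + step;                                     // exclusive upper bound
            if (hi > max) {
                // Last range: close it so the maximum value is included.
                clauses.add(column + " >= " + lo + " AND " + column + " <= " + max);
            } else {
                clauses.add(column + " >= " + lo + " AND " + column + " < " + hi);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        for (String clause : whereClauses("id", 0, 99, 4)) {
            System.out.println(clause);
        }
    }
}
```

Each fragment selects a disjoint slice of the table, so every mapper can run its own query without the database having to skip rows as it would with LIMIT and OFFSET.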
| Field Summary | |
|---|---|
| static String | SUBSTITUTE_TOKEN: If users provide their own query, this string is expected to appear in the WHERE clause; it is replaced with a pair of conditions on the input so that input splits can parallelise the import. | 
| Fields inherited from class org.apache.hadoop.mapreduce.lib.db.DBInputFormat | 
|---|
| conditions, connection, dbConf, dbProductName, fieldNames, tableName | 
| Constructor Summary | |
|---|---|
| DataDrivenDBInputFormat() | |
| Method Summary | |
|---|---|
| protected RecordReader<LongWritable,T> | createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split, Configuration conf) | 
| protected String | getBoundingValsQuery() | 
| List<InputSplit> | getSplits(JobContext job): Logically split the set of input files for the job. | 
| protected DBSplitter | getSplitter(int sqlDataType) | 
| static void | setBoundingQuery(Configuration conf, String query): Set the user-defined bounding query to use with a user-defined query. | 
| static void | setInput(Job job, Class<? extends DBWritable> inputClass, String inputQuery, String inputBoundingQuery): setInput() takes a custom query and a separate "bounding query" to use instead of the custom "count query" used by DBInputFormat. | 
| static void | setInput(Job job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String splitBy, String... fieldNames): Note that the "orderBy" column is called the "splitBy" in this version. | 
| Methods inherited from class org.apache.hadoop.mapreduce.lib.db.DBInputFormat | 
|---|
| closeConnection, createRecordReader, getConf, getConnection, getCountQuery, getDBConf, getDBProductName, setConf | 
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Methods inherited from interface org.apache.hadoop.conf.Configurable | 
|---|
| getConf, setConf | 
| Field Detail | 
|---|
public static final String SUBSTITUTE_TOKEN
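How the token might be replaced for a single split can be sketched as follows. This is an illustration under stated assumptions, not the class's own code: the token value is assumed to be the literal `$CONDITIONS`, and `SubstituteTokenSketch` and `queryForSplit` are hypothetical names.

```java
// Hypothetical illustration only; not part of the Hadoop API.
class SubstituteTokenSketch {

    // Assumption: the value of SUBSTITUTE_TOKEN is the literal "$CONDITIONS".
    static final String SUBSTITUTE_TOKEN = "$CONDITIONS";

    /**
     * Replaces the token in a user-supplied query with the pair of conditions
     * (lower bound and upper bound) that describe one input split.
     */
    public static String queryForSplit(String inputQuery, String lowerClause, String upperClause) {
        if (!inputQuery.contains(SUBSTITUTE_TOKEN)) {
            throw new IllegalArgumentException("query must contain " + SUBSTITUTE_TOKEN);
        }
        String conditions = "( " + lowerClause + " ) AND ( " + upperClause + " )";
        // String.replace treats the token literally, so '$' needs no escaping.
        return inputQuery.replace(SUBSTITUTE_TOKEN, conditions);
    }

    public static void main(String[] args) {
        String userQuery = "SELECT id, name FROM employees WHERE $CONDITIONS";
        System.out.println(queryForSplit(userQuery, "id >= 0", "id < 25"));
    }
}
```

A user-supplied query that omits the token gives the splitter nowhere to insert the per-split bounds, which is why the sketch rejects it.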
| Constructor Detail | 
|---|
public DataDrivenDBInputFormat()
| Method Detail | 
|---|
protected DBSplitter getSplitter(int sqlDataType)
public List<InputSplit> getSplits(JobContext job)
                           throws IOException
Logically split the set of input files for the job.
Each InputSplit is then assigned to an individual Mapper for processing.
Note: The split is a logical split of the inputs; the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple. The InputFormat also creates the RecordReader to read the InputSplit.
Overrides: getSplits in class DBInputFormat<T extends DBWritable>
Parameters: job - job configuration.
Returns: InputSplits for the job.
Throws: IOException
protected String getBoundingValsQuery()
public static void setBoundingQuery(Configuration conf,
                                    String query)
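As a rough sketch of what a bounding query could look like, the helper below builds one that returns the minimum and maximum of the split column. That a bounding query yields these two limits is an assumption not stated on this page, and `BoundingQuerySketch` and `boundingQuery` are hypothetical names.

```java
// Hypothetical illustration only; not part of the Hadoop API.
class BoundingQuerySketch {

    /**
     * Builds a query returning the lower and upper limits of the split column.
     * Assumption: these two values are what the split computation needs.
     */
    public static String boundingQuery(String tableName, String splitBy, String conditions) {
        StringBuilder query = new StringBuilder();
        query.append("SELECT MIN(").append(splitBy).append("), ");
        query.append("MAX(").append(splitBy).append(") FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            // Restrict the bounds to the same rows the job will read.
            query.append(" WHERE ( ").append(conditions).append(" )");
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(boundingQuery("employees", "id", "active = 1"));
    }
}
```

A string of this shape is the kind of value one would pass as the `query` argument when the input itself comes from a user-defined query.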
protected RecordReader<LongWritable,T> createDBRecordReader(org.apache.hadoop.mapreduce.lib.db.DBInputFormat.DBInputSplit split,
                                                            Configuration conf)
                                                                        throws IOException
Overrides: createDBRecordReader in class DBInputFormat<T extends DBWritable>
Throws: IOException
public static void setInput(Job job,
                            Class<? extends DBWritable> inputClass,
                            String tableName,
                            String conditions,
                            String splitBy,
                            String... fieldNames)
public static void setInput(Job job,
                            Class<? extends DBWritable> inputClass,
                            String inputQuery,
                            String inputBoundingQuery)