java.lang.Object
  org.apache.hadoop.mapred.lib.db.DBInputFormat<T>
public class DBInputFormat<T extends DBWritable>
An InputFormat that reads input data from an SQL table.
DBInputFormat emits LongWritables containing the record number as key and DBWritables as value. The SQL query and the input class can be specified using one of the two setInput methods.
Nested Class Summary | |
---|---|
protected static class | DBInputFormat.DBInputSplit: An InputSplit that spans a set of rows. |
protected class | DBInputFormat.DBRecordReader: A RecordReader that reads records from a SQL table. |
static class | DBInputFormat.NullDBWritable: A class that does nothing, implementing DBWritable. |
Constructor Summary | |
---|---|
DBInputFormat() | |
Method Summary | |
---|---|
void | configure(JobConf job): Initializes a new instance from a JobConf. |
protected String | getCountQuery(): Returns the query for getting the total number of rows; subclasses can override this for custom behaviour. |
RecordReader<LongWritable,T> | getRecordReader(InputSplit split, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
InputSplit[] | getSplits(JobConf job, int chunks): Logically split the set of input files for the job. |
static void | setInput(JobConf job, Class<? extends DBWritable> inputClass, String inputQuery, String inputCountQuery): Initializes the map-part of the job with the appropriate input settings. |
static void | setInput(JobConf job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String orderBy, String... fieldNames): Initializes the map-part of the job with the appropriate input settings. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DBInputFormat()
Method Detail |
---|
public void configure(JobConf job)
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
job - the configuration
public RecordReader<LongWritable,T> getRecordReader(InputSplit split, JobConf job, Reporter reporter) throws IOException
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by: getRecordReader in interface InputFormat<LongWritable,T extends DBWritable>
Parameters:
split - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
public InputSplit[] getSplits(JobConf job, int chunks) throws IOException
Logically split the set of input files for the job.
Each InputSplit is then assigned to an individual Mapper for processing.
Note: The split is a logical split of the inputs; the input files are not physically split into chunks. For example, a split could be an <input-file-path, start, offset> tuple.
Specified by: getSplits in interface InputFormat<LongWritable,T extends DBWritable>
Parameters:
job - job configuration.
chunks - the desired number of splits, a hint.
Returns: an array of InputSplits for the job.
Throws: IOException
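To make the idea of a logical split concrete for a table of rows, the following is a minimal sketch of how a row count could be divided into contiguous row ranges. It is an illustration under stated assumptions, not DBInputFormat's actual code: the class RowSplitSketch, its RowRange type, and the rule that the last range absorbs the remainder are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: divides a row count into contiguous row ranges. */
final class RowSplitSketch {

    /** A half-open row range [start, end); a hypothetical stand-in for an InputSplit. */
    static final class RowRange {
        final long start;
        final long end;

        RowRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Splits rowCount rows into exactly `chunks` ranges. Every range but the
     * last holds rowCount / chunks rows; the last one also takes the remainder.
     * When rowCount is smaller than chunks, the leading ranges are empty.
     */
    static List<RowRange> split(long rowCount, int chunks) {
        if (rowCount < 0 || chunks < 1) {
            throw new IllegalArgumentException("rowCount must be >= 0 and chunks >= 1");
        }
        List<RowRange> ranges = new ArrayList<>();
        long chunkSize = rowCount / chunks;
        for (int i = 0; i < chunks; i++) {
            long start = i * chunkSize;
            long end = (i == chunks - 1) ? rowCount : start + chunkSize;
            ranges.add(new RowRange(start, end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Ten rows requested as three chunks.
        for (RowRange range : split(10, 3)) {
            System.out.println(range);
        }
    }
}
```

Nothing is copied or moved when the ranges are computed; each range only describes which rows a Mapper would later read, which is what "logical split" means in the note above.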
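protected String getCountQuery()
Returns the query for getting the total number of rows; subclasses can override this for custom behaviour.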
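As a rough illustration of what a row-count query can look like, the sketch below assembles one from a table name and an optional condition. The class CountQuerySketch and its countQuery helper are hypothetical and written for this example; the query that getCountQuery() actually returns may differ.

```java
/** Illustrative only: builds a row-count query from a table name and a condition. */
final class CountQuerySketch {

    /**
     * Returns "SELECT COUNT(*) FROM <table>", with a WHERE clause appended
     * when a non-empty condition is supplied. The inputs are concatenated
     * as-is, so they are assumed to come from trusted job configuration.
     */
    static String countQuery(String tableName, String conditions) {
        StringBuilder query = new StringBuilder("SELECT COUNT(*) FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE ").append(conditions);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(countQuery("Mytable", null));
        System.out.println(countQuery("Mytable", "(updated > 20070101 AND length > 0)"));
    }
}
```

A subclass that overrides getCountQuery() could return a string built along these lines, for example to count through a cheaper index or a summary table.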
public static void setInput(JobConf job, Class<? extends DBWritable> inputClass, String tableName, String conditions, String orderBy, String... fieldNames)
Initializes the map-part of the job with the appropriate input settings.
Parameters:
job - the job
inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
tableName - the table to read data from
conditions - the condition with which to select data, e.g. '(updated > 20070101 AND length > 0)'
orderBy - the field names in the ORDER BY clause.
fieldNames - the field names in the table
See Also: setInput(JobConf, Class, String, String)
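The parameters of this overload map naturally onto the clauses of a SELECT statement. The sketch below shows one plausible mapping; the class SelectQuerySketch and its selectQuery helper are hypothetical, and the exact statement that DBInputFormat generates from these settings is not specified on this page.

```java
/** Illustrative only: maps tableName, conditions, orderBy and fieldNames onto a SELECT. */
final class SelectQuerySketch {

    /**
     * Builds "SELECT <fields> FROM <table> [WHERE (<conditions>)] [ORDER BY <orderBy>]".
     * An empty field list selects every column. The inputs are concatenated
     * as-is, so they are assumed to come from trusted job configuration.
     */
    static String selectQuery(String tableName, String conditions, String orderBy,
                              String... fieldNames) {
        StringBuilder query = new StringBuilder("SELECT ");
        query.append(fieldNames.length == 0 ? "*" : String.join(", ", fieldNames));
        query.append(" FROM ").append(tableName);
        if (conditions != null && !conditions.isEmpty()) {
            query.append(" WHERE (").append(conditions).append(")");
        }
        if (orderBy != null && !orderBy.isEmpty()) {
            query.append(" ORDER BY ").append(orderBy);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(selectQuery("Mytable", "length > 0", "f1", "f1", "f2", "f3"));
    }
}
```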
public static void setInput(JobConf job, Class<? extends DBWritable> inputClass, String inputQuery, String inputCountQuery)
Initializes the map-part of the job with the appropriate input settings.
Parameters:
job - the job
inputClass - the class object implementing DBWritable, which is the Java object holding tuple fields.
inputQuery - the input query to select fields. Example: "SELECT f1, f2, f3 FROM Mytable ORDER BY f1"
inputCountQuery - the input query that returns the number of records in the table. Example: "SELECT COUNT(f1) FROM Mytable"
See Also: setInput(JobConf, Class, String, String, String, String...)
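The example input query ends in an ORDER BY clause, and a stable ordering matters when several splits each read a different slice of the same result set. The sketch below shows one way a slice could be requested. It is an assumption-laden illustration: the class PagedQuerySketch and its pageQuery helper are hypothetical, and the LIMIT/OFFSET syntax is the form accepted by databases such as MySQL and PostgreSQL, not something this page specifies.

```java
/** Illustrative only: restricts an input query to the rows of one split. */
final class PagedQuerySketch {

    /**
     * Appends LIMIT/OFFSET so the query returns `length` rows beginning at
     * row `start`. The input query is assumed to end in an ORDER BY clause;
     * without a stable ordering, separate slices could overlap or skip rows.
     */
    static String pageQuery(String inputQuery, long start, long length) {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        return inputQuery + " LIMIT " + length + " OFFSET " + start;
    }

    public static void main(String[] args) {
        System.out.println(pageQuery("SELECT f1, f2, f3 FROM Mytable ORDER BY f1", 6, 4));
    }
}
```

The count returned by the inputCountQuery should describe the same rows that the inputQuery selects, since the row count is what determines how the slices are sized.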