java.lang.Object
  org.apache.hadoop.mapred.FileInputFormat<LongWritable,Text>
    org.apache.hadoop.mapred.lib.NLineInputFormat
@InterfaceAudience.Public @InterfaceStability.Stable public class NLineInputFormat
NLineInputFormat splits N lines of input as one split. In many "pleasantly" parallel applications, each process/mapper processes the same input file(s), but the computations are controlled by different parameters (referred to as "parameter sweeps"). One way to achieve this is to specify a set of parameters (one set per line) as input in a control file. The control file is the input path to the map-reduce application, whereas the input dataset is specified via a config variable in JobConf. NLineInputFormat can be used in such applications: it splits the input file so that, by default, one line is fed as a value to one map task and the key is the offset, i.e. (k,v) is (LongWritable, Text). The location hints will span the whole mapred cluster.
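
To make the (k,v) contract concrete, here is a minimal sketch, written without any Hadoop dependency, of the records a mapper would see when each line of a control file becomes one value and its byte offset becomes the key. The class `ControlFileRecords`, its `Entry` type, and the sample parameter lines are hypothetical names chosen for illustration; they are not part of the Hadoop API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the (offset, line) records of a control file. */
class ControlFileRecords {
    /** One record: byte offset of the line (key) and the line text (value). */
    static final class Entry {
        final long offset;
        final String line;

        Entry(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Walks the control file and emits one (offset, line) pair per line. */
    static List<Entry> records(String controlFile) {
        List<Entry> out = new ArrayList<>();
        byte[] bytes = controlFile.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        for (int i = 0; i <= bytes.length; i++) {
            boolean atEnd = (i == bytes.length);
            // A line ends at a newline, or at end of input if text remains.
            if (atEnd ? start < i : bytes[i] == '\n') {
                out.add(new Entry(start,
                        new String(bytes, start, i - start, StandardCharsets.UTF_8)));
                start = i + 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical parameter sets, one set per line.
        String control = "alpha=0.1 beta=2\nalpha=0.5 beta=2\nalpha=0.9 beta=4\n";
        for (Entry e : records(control)) {
            System.out.println(e.offset + "\t" + e.line);
        }
    }
}
```

With the default of one line per split, each of these pairs would go to a different map task, which then runs the same computation with its own parameter set.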
Field Summary |
---|
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat |
---|
INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES |
Constructor Summary | |
---|---|
NLineInputFormat() |
Method Summary | |
---|---|
void | configure(JobConf conf) Initializes a new instance from a JobConf. |
protected static FileSplit | createFileSplit(Path fileName, long begin, long length) NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character out of its upper split boundary. |
RecordReader<LongWritable,Text> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) Get the RecordReader for the given InputSplit. |
InputSplit[] | getSplits(JobConf job, int numSplits) Logically splits the set of input files for the job, treating every N lines of the input as one split. |
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat |
---|
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public NLineInputFormat()
Method Detail |
---|
public RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by: getRecordReader in interface InputFormat<LongWritable,Text>
Specified by: getRecordReader in class FileInputFormat<LongWritable,Text>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
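
What "respecting record boundaries" can mean for a line-oriented reader is sketched below, again without Hadoop dependencies. The sketch assumes a common convention: a reader whose split does not start at byte 0 discards everything up to and including the first newline, and every reader keeps reading while a line starts at or before the end of its split. `BoundedLineReader` and `readSplit` are hypothetical names; this is not the LineRecordReader implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: reads whole lines from a byte range of a file. */
class BoundedLineReader {
    /** Returns the lines owned by the split [start, start + length). */
    static List<String> readSplit(byte[] data, long start, long length) {
        List<String> lines = new ArrayList<>();
        long end = start + length;
        int pos = (int) start;
        if (start != 0) {
            // Assumed convention: the first (possibly partial) line belongs
            // to the previous split, so skip past its terminating newline.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        // Keep reading while a line starts at or before the split end; the
        // last line read may therefore extend past the boundary.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart,
                    StandardCharsets.UTF_8));
            pos++; // step over the newline
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "one\ntwo\nthree\nfour\n".getBytes(StandardCharsets.UTF_8);
        // The boundary at byte 10 falls in the middle of "three".
        System.out.println(readSplit(data, 0, 10));
        System.out.println(readSplit(data, 10, 9));
    }
}
```

Under this convention, two adjacent splits hand out every line exactly once even when the boundary cuts a line in half, because the reader of the earlier split finishes the line and the reader of the later split skips it.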
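public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException
Logically splits the set of input files for the job, treating every N lines of the input as one split.
Specified by: getSplits in interface InputFormat<LongWritable,Text>
Overrides: getSplits in class FileInputFormat<LongWritable,Text>
Parameters:
job - job configuration.
numSplits - the desired number of splits, a hint.
Returns: an array of InputSplits for the job.
Throws: IOException
See Also: FileInputFormat.getSplits(JobConf, int)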
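
The idea of "N lines per split" can be sketched as a single pass over the file that accumulates byte lengths and closes a range every N lines. The code below is a simplified, Hadoop-free illustration; `NLineSplitSketch`, `Range`, and `splitEveryNLines` are hypothetical names, and the real method works on files in a FileSystem rather than on an in-memory byte array.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: groups the lines of a file into N-line byte ranges. */
class NLineSplitSketch {
    /** A raw byte range: first byte of the range and its length in bytes. */
    static final class Range {
        final long begin;
        final long length;

        Range(long begin, long length) {
            this.begin = begin;
            this.length = length;
        }
    }

    /** Emits one range per n lines; the final range may hold fewer lines. */
    static List<Range> splitEveryNLines(byte[] data, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<Range> out = new ArrayList<>();
        long begin = 0;
        long length = 0;
        int lines = 0;
        int pos = 0;
        while (pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            if (pos < data.length) {
                pos++; // count the newline as part of the line
            }
            length += pos - lineStart;
            lines++;
            if (lines == n) {
                out.add(new Range(begin, length));
                begin += length;
                length = 0;
                lines = 0;
            }
        }
        if (lines != 0) {
            out.add(new Range(begin, length)); // leftover lines
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "one\ntwo\nthree\nfour\nfive\n".getBytes();
        for (Range r : splitEveryNLines(data, 2)) {
            System.out.println(r.begin + " +" + r.length);
        }
    }
}
```

The ranges produced here are raw line-aligned ranges. Note that the numSplits argument plays no role in this sketch: the number of ranges is determined by the line count and N alone.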
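public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
conf - the configuration
protected static FileSplit createFileSplit(Path fileName, long begin, long length)
NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character out of its upper split boundary.
Parameters:
fileName - Path of file
begin - the position of the first byte in the file to process
length - number of bytes in InputSplit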
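
Because the reader consumes at least one character beyond its upper boundary, a line-aligned range would otherwise yield one line too many. One way to compensate, shown here as an assumption and not as the confirmed Hadoop implementation, is to move each upper boundary back by one byte: shorten the first range by one, and start every later range one byte earlier. `SplitBoundaryAdjust` and `adjust` are hypothetical names.

```java
/** Illustrative only: shifts a line-aligned range back by one byte. */
class SplitBoundaryAdjust {
    /**
     * Returns {begin, length} of the adjusted range.
     * Assumption: the first range is shortened by one byte, and every
     * other range is moved one byte earlier with its length unchanged.
     */
    static long[] adjust(long begin, long length) {
        if (begin == 0) {
            return new long[] {0, length - 1};
        }
        return new long[] {begin - 1, length};
    }

    public static void main(String[] args) {
        // Raw two-line ranges of a hypothetical 24-byte file.
        long[][] raw = {{0, 8}, {8, 11}, {19, 5}};
        for (long[] r : raw) {
            long[] a = adjust(r[0], r[1]);
            System.out.println(a[0] + " +" + a[1]);
        }
    }
}
```

After this shift, every upper boundary sits on the newline that ends the last line of the range, so a reader that reads one character past the boundary stops after exactly N lines.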