org.apache.hadoop.mapred.lib
Class NLineInputFormat

java.lang.Object
  org.apache.hadoop.mapred.FileInputFormat<LongWritable,Text>
    org.apache.hadoop.mapred.lib.NLineInputFormat
@InterfaceAudience.Public
@InterfaceStability.Stable
public class NLineInputFormat
extends FileInputFormat<LongWritable,Text>
implements JobConfigurable
NLineInputFormat splits N lines of input into one split. In many "pleasantly parallel" applications, each process/mapper processes the same input file(s), but the computation is controlled by different parameters (referred to as "parameter sweeps"). One way to achieve this is to specify a set of parameters, one set per line, in a control file, which is the input path to the map-reduce application, whereas the input dataset itself is specified via a config variable in JobConf. NLineInputFormat can be used in such applications: it splits the input file so that, by default, one line is fed as the value to one map task, and the key is the line's byte offset, i.e. (k,v) is (LongWritable, Text). The location hints will span the whole mapred cluster.
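As a concrete illustration, here is a minimal driver sketch for such a parameter-sweep job using the old mapred API. The job name, input/output paths, and the choice of one line per map are assumptions made for the example; mapred.line.input.format.linespermap is the configuration key this format reads for the number of lines per split.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class ParameterSweepDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(ParameterSweepDriver.class);
    conf.setJobName("parameter-sweep"); // assumed job name

    // Each map task receives N lines of the control file; keys are byte
    // offsets (LongWritable), values are the lines themselves (Text).
    conf.setInputFormat(NLineInputFormat.class);

    // One parameter set (one line) per map task; 1 is also the default.
    conf.setInt("mapred.line.input.format.linespermap", 1);

    // args[0]: control file of parameter sets; args[1]: output directory.
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // A real job would also set a Mapper that runs the sweep for its line;
    // without one, the identity mapper is used.
    JobClient.runJob(conf);
  }
}
```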
| Field Summary |
|---|
| Fields inherited from class org.apache.hadoop.mapred.FileInputFormat |
| INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES |
| Constructor Summary |
|---|
| NLineInputFormat() |
| Method Summary | |
|---|---|
| void | configure(JobConf conf): Initializes a new instance from a JobConf. |
| protected static FileSplit | createFileSplit(Path fileName, long begin, long length): NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character beyond its upper split boundary. |
| RecordReader<LongWritable,Text> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
| InputSplit[] | getSplits(JobConf job, int numSplits): Logically splits the set of input files for the job; N lines of the input form one split. |
| Methods inherited from class org.apache.hadoop.mapred.FileInputFormat |
|---|
| addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public NLineInputFormat()
| Method Detail |
|---|
public RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit,
JobConf job,
Reporter reporter)
throws IOException
Get the RecordReader for the given InputSplit. It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.

Specified by: getRecordReader in interface InputFormat<LongWritable,Text>
Overrides: getRecordReader in class FileInputFormat<LongWritable,Text>
Parameters:
  genericSplit - the InputSplit
  job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
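For context, a minimal standalone sketch of driving this method by hand, outside a running job; the input path and the use of a single split are assumptions made for the example:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class ReadOneSplit {
  public static void main(String[] args) throws Exception {
    JobConf job = new JobConf();
    FileInputFormat.setInputPaths(job, new Path(args[0])); // assumed control file

    NLineInputFormat format = new NLineInputFormat();
    format.configure(job); // picks up the lines-per-split setting from the JobConf

    // numSplits is only a hint; NLineInputFormat splits on N-line boundaries.
    InputSplit[] splits = format.getSplits(job, 1);

    RecordReader<LongWritable, Text> reader =
        format.getRecordReader(splits[0], job, Reporter.NULL);
    LongWritable key = reader.createKey(); // byte offset of the line
    Text value = reader.createValue();     // the line itself
    while (reader.next(key, value)) {
      System.out.println(key.get() + "\t" + value);
    }
    reader.close();
  }
}
```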
public InputSplit[] getSplits(JobConf job,
int numSplits)
throws IOException
Logically splits the set of input files for the job; N lines of the input form one split.

Specified by: getSplits in interface InputFormat<LongWritable,Text>
Overrides: getSplits in class FileInputFormat<LongWritable,Text>
Parameters:
  job - job configuration
  numSplits - the desired number of splits, a hint
Returns: an array of InputSplits for the job
Throws: IOException
See Also: FileInputFormat.getSplits(JobConf, int)

public void configure(JobConf conf)
Initializes a new instance from a JobConf.

Specified by: configure in interface JobConfigurable
Parameters:
  conf - the configuration
protected static FileSplit createFileSplit(Path fileName,
long begin,
long length)
NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character beyond its upper split boundary.

Parameters:
  fileName - Path of file
  begin - the position of the first byte in the file to process
  length - number of bytes in InputSplit
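To make that concrete, here is a hedged sketch of the kind of one-byte compensation this description implies; it is an illustration of the idea under the stated assumption about LineRecordReader, not the verified Hadoop source:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileSplit;

class NLineSplitSketch {
  // Sketch only: assumes LineRecordReader skips a partial first line when a
  // split does not start at byte 0, and reads one line past its end boundary.
  static FileSplit createFileSplit(Path fileName, long begin, long length) {
    return (begin == 0)
        // First split: start at 0; trim one byte so the reader stops at the
        // intended line boundary instead of consuming the next split's line.
        ? new FileSplit(fileName, begin, length - 1, new String[] {})
        // Later splits: start one byte early; the reader discards the partial
        // first "line" and then delivers exactly the N lines of this split.
        : new FileSplit(fileName, begin - 1, length, new String[] {});
  }
}
```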