@InterfaceAudience.Public @InterfaceStability.Stable public class TextInputFormat extends FileInputFormat<LongWritable,Text> implements JobConfigurable
InputFormat for plain text files. Files are broken into lines.
Either a linefeed or a carriage return is used to signal end of line. Keys are
the position (byte offset) in the file, and values are the line of text.

Fields inherited from class FileInputFormat: INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES

| Constructor and Description |
|---|
| TextInputFormat() |
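
The key/value contract in the class description (key = position in the file, value = the line of text) can be illustrated without Hadoop on the classpath. The following is a minimal plain-Java sketch, not the library's implementation: `LineOffsetSketch` and `records` are hypothetical names, and treating `\r\n` as a single terminator is an assumption made here for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class LineOffsetSketch {
    // Hypothetical helper (not a Hadoop API): maps the starting byte offset
    // of each line to the text of that line, without the terminator.
    public static Map<Long, String> records(byte[] data) {
        Map<Long, String> out = new LinkedHashMap<>();
        int start = 0;
        int i = 0;
        while (i < data.length) {
            byte b = data[i];
            if (b == '\n' || b == '\r') {
                out.put((long) start,
                        new String(data, start, i - start, StandardCharsets.UTF_8));
                // Assumption: "\r\n" counts as one terminator, not two.
                if (b == '\r' && i + 1 < data.length && data[i + 1] == '\n') {
                    i++;
                }
                start = i + 1;
            }
            i++;
        }
        // A final line with no terminator is still a record.
        if (start < data.length) {
            out.put((long) start,
                    new String(data, start, data.length - start, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8);
        // Each entry plays the role of one (LongWritable, Text) pair.
        records(data).forEach((offset, line) -> System.out.println(offset + "\t" + line));
    }
}
```

In this sketch the keys are offsets into the byte stream rather than line numbers, so a mapper that needs line numbers would have to derive them separately.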
| Modifier and Type | Method and Description |
|---|---|
| void | configure(JobConf conf): Initializes a new instance from a JobConf. |
| RecordReader<LongWritable,Text> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit. |
| protected boolean | isSplitable(FileSystem fs, Path file): Is the given filename splitable? Usually, true, but if the file is stream compressed, it will not be. |
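
The isSplitable rule in the summary (usually true, but false for stream-compressed files) can be sketched as a decision on the file name. This is a plain-Java illustration under stated assumptions, not the Hadoop code path: `SplitabilitySketch` is a hypothetical class, and the extension list (`.gz` treated as a non-splittable stream format, `.bz2` as a splittable one) is an assumption chosen for the example.

```java
public class SplitabilitySketch {
    // Hypothetical stand-in for a codec lookup: returns true when the file
    // can safely be divided into several splits.
    public static boolean isSplitable(String fileName) {
        if (fileName.endsWith(".gz")) {
            // Assumption: gzip is a stream format that must be read from the start.
            return false;
        }
        if (fileName.endsWith(".bz2")) {
            // Assumption: bzip2 is block-oriented and can be entered mid-file.
            return true;
        }
        // No known compression suffix: treat the file as plain text.
        return true;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"part-0000.txt", "logs.gz", "dump.bz2"}) {
            System.out.println(name + " splitable=" + isSplitable(name));
        }
    }
}
```

When a file is not splitable, the whole file goes to a single mapper, which limits parallelism for large compressed inputs.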
Methods inherited from class FileInputFormat: addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize

public TextInputFormat()

public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
conf - the configuration

protected boolean isSplitable(FileSystem fs, Path file)
Description copied from class: FileInputFormat
FileInputFormat implementations can override this and return
false to ensure that individual input files are never split up,
so that Mappers process entire files.
Overrides: isSplitable in class FileInputFormat<LongWritable,Text>
Parameters:
fs - the file system that the file is on
file - the file name to check

public RecordReader<LongWritable,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect
record boundaries while processing the logical split to present a
record-oriented view to the individual task.
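
Because splits are byte ranges that may begin or end in the middle of a line, a reader needs some rule for deciding which split owns each line. The sketch below shows one such rule in plain Java: a reader whose split does not start at offset 0 discards everything up to and including the first newline, and every reader keeps reading while a line starts at or before the end of its split. `SplitBoundarySketch` and `readSplit` are hypothetical names, `\n` is assumed to be the only terminator, and this is an illustration of the idea rather than Hadoop's record reader.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitBoundarySketch {
    // Returns the lines owned by the byte range [start, end) of data.
    public static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Skip the partial (or exactly aligned) first line; the reader of
            // the previous split is responsible for it.
            pos = nextLineStart(data, pos);
        }
        List<String> lines = new ArrayList<>();
        // A line is owned if it starts at or before 'end', so the last line
        // read may extend past the end of the split.
        while (pos < data.length && pos <= end) {
            int next = nextLineStart(data, pos);
            int textEnd = (next > pos && data[next - 1] == '\n') ? next - 1 : next;
            lines.add(new String(data, pos, textEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return lines;
    }

    // Index just past the next '\n' at or after 'from', or data.length if none.
    private static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbbb\ncc\n".getBytes(StandardCharsets.UTF_8);
        // Three adjacent splits that cut through the middle of "bbbb".
        System.out.println(readSplit(data, 0, 4));
        System.out.println(readSplit(data, 4, 8));
        System.out.println(readSplit(data, 8, 11));
    }
}
```

With this rule each line is emitted by exactly one split, even though the split boundaries themselves ignore line structure.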
Specified by: getRecordReader in interface InputFormat<LongWritable,Text>
Specified by: getRecordReader in class FileInputFormat<LongWritable,Text>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException

Copyright © 2018 Apache Software Foundation. All rights reserved.