Package org.apache.hadoop.mapred
Class KeyValueTextInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<Text,Text>
org.apache.hadoop.mapred.KeyValueTextInputFormat
All Implemented Interfaces:
InputFormat<Text,Text>, JobConfigurable
@Public
@Stable
public class KeyValueTextInputFormat
extends FileInputFormat<Text,Text>
implements JobConfigurable
An InputFormat for plain text files. Files are broken into lines.
Either a linefeed or a carriage return signals the end of a line. Each line
is divided into key and value parts by a separator byte. If no such byte
exists, the key is the entire line and the value is empty.
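The key/value rule described above can be sketched in plain Java without any Hadoop dependency. This is a minimal illustration, not the class's actual implementation: the class name KeyValueSplitSketch and its methods are hypothetical, the split is assumed to happen at the first occurrence of the separator byte, and the tab separator in the demo is chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; it is not part of org.apache.hadoop.mapred.
final class KeyValueSplitSketch {

    /** Returns the index of the first separator byte in the line, or -1 if absent. */
    static int findSeparator(byte[] line, byte separator) {
        for (int i = 0; i < line.length; i++) {
            if (line[i] == separator) {
                return i;
            }
        }
        return -1;
    }

    /** Splits one line into {key, value} at the first separator byte (an assumption). */
    static String[] split(String line, byte separator) {
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
        int pos = findSeparator(bytes, separator);
        if (pos < 0) {
            // No separator byte: the key is the entire line and the value is empty.
            return new String[] { line, "" };
        }
        String key = new String(bytes, 0, pos, StandardCharsets.UTF_8);
        String value = new String(bytes, pos + 1, bytes.length - pos - 1, StandardCharsets.UTF_8);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        byte tab = (byte) '\t'; // separator chosen for the demo only
        System.out.println(Arrays.toString(split("user42\tlogin\tok", tab)));
        System.out.println(Arrays.toString(split("no-separator-here", tab)));
    }
}
```

Under the first-occurrence assumption, any later separator bytes stay inside the value, so a line with several separators still yields exactly one key and one value.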
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter
Field Summary
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
Constructor Summary
Constructors
KeyValueTextInputFormat()
Method Summary
Modifier and Type | Method | Description
void | configure(JobConf conf) | Initializes a new instance from a JobConf.
RecordReader<Text,Text> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) | Get the RecordReader for the given InputSplit.
protected boolean | isSplitable(FileSystem fs, Path file) | Is the given filename splittable?
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
Constructor Details
KeyValueTextInputFormat
public KeyValueTextInputFormat()
Method Details
configure
public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
Specified by:
configure in interface JobConfigurable
Parameters:
conf - the configuration
isSplitable
protected boolean isSplitable(FileSystem fs, Path file)
Description copied from class: FileInputFormat
Is the given filename splittable? Usually true, but not if the file is stream compressed. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
Overrides:
isSplitable in class FileInputFormat<Text,Text>
Parameters:
fs - the file system that the file is on
file - the file name to check
Returns:
is this file splittable?
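The effect of a splittability check on how a file is divided can be sketched in plain Java. This is a toy model under stated assumptions, not Hadoop's split computation: SplitPolicySketch and both of its methods are hypothetical, the check looks only at a ".gz" filename suffix as a stand-in for "stream compressed", and splits are plain fixed-size byte ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; real implementations typically consult the configured codecs.
final class SplitPolicySketch {

    /** Toy check: treats a ".gz" suffix as stream compressed, hence not splittable. */
    static boolean isSplitable(String fileName) {
        return !fileName.endsWith(".gz");
    }

    /** Returns {offset, length} byte ranges covering the whole file. */
    static List<long[]> computeSplits(String fileName, long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        if (!isSplitable(fileName)) {
            // Non-splittable file: a single split, so one task reads the entire file.
            splits.add(new long[] { 0, fileLength });
            return splits;
        }
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            splits.add(new long[] { offset, Math.min(splitSize, fileLength - offset) });
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(computeSplits("data.txt", 250, 100).size());
        System.out.println(computeSplits("data.gz", 250, 100).size());
    }
}
```

Returning false from the check trades parallelism for correctness: the file is read by a single task, which is the only safe option when a reader cannot start decoding in the middle of the stream.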
getRecordReader
public RecordReader<Text,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by:
getRecordReader in interface InputFormat<Text,Text>
Specified by:
getRecordReader in class FileInputFormat<Text,Text>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
Returns:
a RecordReader
Throws:
IOException
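One way a line-oriented reader can respect record boundaries when a split starts or ends mid-line is sketched below in plain Java. It illustrates a common convention and is not necessarily what this class's RecordReader does: SplitBoundarySketch and its methods are hypothetical, only '\n' is treated as a line terminator, and the whole input is held in memory for simplicity. Every split except the first skips the line it starts in, and every split keeps reading while a line begins at or before its end offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of boundary handling for a byte-range split.
final class SplitBoundarySketch {

    /** Returns the offset just past the next '\n' at or after from, or data.length. */
    static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    /** Reads the lines that belong to the split covering bytes [start, end). */
    static List<String> readSplit(byte[] data, int start, int end) {
        List<String> records = new ArrayList<>();
        int pos = start;
        if (start != 0) {
            // Skip the line containing start; the previous split reads it in full.
            pos = nextLineStart(data, start);
        }
        // A line that begins at or before end belongs to this split,
        // even if its bytes extend past end.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            int lineEnd = (data[next - 1] == '\n') ? next - 1 : next;
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "a\tb\nc\td\ne\tf\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 5));           // first split ends mid-line
        System.out.println(readSplit(data, 5, data.length)); // second split starts mid-line
    }
}
```

With this convention each line is returned by exactly one split, regardless of where the byte boundaries fall, which is the record-oriented view the description calls for.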