Class KeyValueTextInputFormat
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<Text,Text>
org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat
An InputFormat for plain text files. Files are broken into lines.
Either a line feed or a carriage return signals the end of a line.
Each line is divided into key and value parts by a separator byte. If no
such byte exists, the key will be the entire line and the value will be empty.
The separator byte can be specified in the configuration file under the attribute name
mapreduce.input.keyvaluelinerecordreader.key.value.separator. The default
is the tab character ('\t').
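The key/value rule described above can be illustrated without a Hadoop dependency. The following is a minimal standalone sketch, not the library's implementation: the class name KeyValueSplitSketch and the method split are hypothetical, and the sketch assumes the line is divided at the first occurrence of the separator byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; it is not part of Hadoop.
public class KeyValueSplitSketch {

    /**
     * Returns {key, value} for a single line.
     * Assumption: the line is divided at the FIRST separator byte.
     * If the separator is absent, the key is the whole line and the value is empty.
     */
    static String[] split(String line, byte separator) {
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);

        // Scan for the first occurrence of the separator byte.
        int pos = -1;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == separator) {
                pos = i;
                break;
            }
        }

        if (pos == -1) {
            // No separator: entire line is the key, value is empty.
            return new String[] { line, "" };
        }

        // Bytes before the separator form the key; bytes after it form the value.
        String key = new String(bytes, 0, pos, StandardCharsets.UTF_8);
        String value = new String(bytes, pos + 1, bytes.length - pos - 1,
                StandardCharsets.UTF_8);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        // Default separator is the tab character.
        System.out.println(Arrays.toString(split("user42\tclicked\tbutton", (byte) '\t')));
        System.out.println(Arrays.toString(split("no-separator-here", (byte) '\t')));
    }
}
```

Under that assumption, a line containing more than one separator keeps the remaining separators inside the value, so "user42\tclicked\tbutton" yields the key "user42" and a value that still contains a tab.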
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
FileInputFormat.Counter
Field Summary
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
Constructor Summary
Constructors
KeyValueTextInputFormat()
Method Summary
Modifier and Type    Method    Description
RecordReader<Text,Text>    createRecordReader(InputSplit genericSplit, TaskAttemptContext context)    Create a record reader for a given split.
protected boolean    isSplitable(JobContext context, Path file)    Is the given filename splittable?
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize, shrinkStatus
Constructor Details
KeyValueTextInputFormat
public KeyValueTextInputFormat()
Method Details
isSplitable
protected boolean isSplitable(JobContext context, Path file)
Description copied from class: FileInputFormat
Is the given filename splittable? Usually, true, but if the file is stream compressed, it will not be. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can override this and return false to ensure that individual input files are never split-up so that Mappers process entire files.
Overrides:
isSplitable in class FileInputFormat<Text,Text>
Parameters:
context - the job context
file - the file name to check
Returns:
is this file splitable?
createRecordReader
public RecordReader<Text,Text> createRecordReader(InputSplit genericSplit, TaskAttemptContext context) throws IOException
Description copied from class: InputFormat
Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.
Specified by:
createRecordReader in class InputFormat<Text,Text>
Parameters:
genericSplit - the split to be read
context - the information about the task
Returns:
a new record reader
Throws:
IOException