org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat

@Public @Stable public class KeyValueTextInputFormat extends FileInputFormat<Text,Text>

An InputFormat for plain text files. Files are broken into lines; either a line feed or a carriage return signals the end of a line. Each line is divided into key and value parts by a separator byte. If no such byte exists, the key is the entire line and the value is empty. The separator byte can be specified in the configuration file under the property name mapreduce.input.keyvaluelinerecordreader.key.value.separator. The default is the tab character ('\t').
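The splitting rule described above can be illustrated with a small stand-alone sketch. It is not the Hadoop implementation: the class name KeyValueSplitSketch and its split method are hypothetical, and it works on Java strings and a char separator rather than on raw bytes. It assumes, as the description suggests, that only the first occurrence of the separator divides the line.

```java
// Illustrative sketch only; KeyValueSplitSketch is a hypothetical name,
// not part of Hadoop. It mimics the documented rule on plain strings.
class KeyValueSplitSketch {

    // Returns a two-element array: {key, value}.
    // The line is divided at the first occurrence of the separator.
    // If the separator is absent, the key is the whole line and the value is empty.
    static String[] split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        // Key is "user42"; the value keeps the second tab: "login<TAB>ok".
        String[] kv = split("user42\tlogin\tok", '\t');
        System.out.println("key=" + kv[0] + " value=" + kv[1]);

        // No separator: the key is the entire line, the value is empty.
        String[] noSep = split("no-separator-here", '\t');
        System.out.println("key=" + noSep[0] + " value=[" + noSep[1] + "]");
    }
}
```

Under the stated assumption, a line with several separators keeps everything after the first one in the value.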

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
FileInputFormat.Counter
Field Summary

Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
Constructor Summary

- KeyValueTextInputFormat()
Method Summary

- RecordReader<Text,Text> createRecordReader(InputSplit genericSplit, TaskAttemptContext context)
  Create a record reader for a given split.
- protected boolean isSplitable(JobContext context, Path file)
  Is the given filename splittable?

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize, shrinkStatus

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- KeyValueTextInputFormat
  
  public KeyValueTextInputFormat()
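Jobs usually do not call this constructor directly; a job typically names the class through Job.setInputFormatClass and the framework creates the instance. The following is a minimal driver sketch under stated assumptions: the class name KeyValueJobSketch, the job name, the comma separator, and the use of args[0] and args[1] as input and output paths are all placeholders chosen for illustration, and the Hadoop client libraries are assumed to be on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver; the class name, job name, and paths are placeholders.
public class KeyValueJobSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumption for this example: split on a comma instead of the default tab.
        // The property is set before the Job is created, because the Job copies the configuration.
        conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", ",");

        Job job = Job.getInstance(conf, "key-value-sketch");
        job.setJarByClass(KeyValueJobSketch.class);

        // The class is named here; the framework instantiates it.
        job.setInputFormatClass(KeyValueTextInputFormat.class);

        // No mapper or reducer is set, so the identity defaults pass Text keys and values through.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The property name is written as a literal string here because it is the one given in the class description.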
Method Details
- isSplitable
  
  protected boolean isSplitable(JobContext context, Path file)
  
  Description copied from class: FileInputFormat
  
  Is the given file splittable? Usually it is, but a stream-compressed file is not. The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
  
  Overrides:
  
  isSplitable in class FileInputFormat<Text,Text>
  
  Parameters:
  
  context - the job context
  
  file - the file name to check
  
  Returns:
  
  whether this file is splittable
- createRecordReader
  
  public RecordReader<Text,Text> createRecordReader(InputSplit genericSplit, TaskAttemptContext context) throws IOException
  
  Description copied from class: InputFormat
  
  Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.
  
  Specified by:
  
  createRecordReader in class InputFormat<Text,Text>
  
  Parameters:
  
  genericSplit - the split to be read
  
  context - the information about the task
  
  Returns:
  
  a new record reader
  
  Throws:
  
  IOException
