org.apache.hadoop.mapred.FileInputFormat<Text,Text>

org.apache.hadoop.mapred.KeyValueTextInputFormat

All Implemented Interfaces: InputFormat<Text,Text>, JobConfigurable

@Public @Stable public class KeyValueTextInputFormat extends FileInputFormat<Text,Text> implements JobConfigurable

An InputFormat for plain text files. Files are broken into lines; either a linefeed or a carriage return signals the end of a line. Each line is divided into a key part and a value part by a separator byte. If the line contains no separator byte, the key is the entire line and the value is empty.
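The splitting rule described above can be sketched in plain Java without any Hadoop dependency. This is a minimal illustration, not the library's implementation: the class name `KeyValueSplitSketch` and the method `split` are hypothetical, the sketch splits at the first occurrence of the separator byte, and the tab separator used in `main` is an assumption rather than something this page states.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the key/value rule; not part of Hadoop.
final class KeyValueSplitSketch {

    /** Returns {key, value} for one line, splitting at the first separator byte. */
    static String[] split(String line, byte separator) {
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
        int pos = -1;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == separator) {
                pos = i;
                break;
            }
        }
        if (pos < 0) {
            // No separator byte: the key is the entire line, the value is empty.
            return new String[] { line, "" };
        }
        String key = new String(bytes, 0, pos, StandardCharsets.UTF_8);
        String value = new String(bytes, pos + 1, bytes.length - pos - 1, StandardCharsets.UTF_8);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        // Assumption: tab is used as the separator for this demonstration.
        String[] withSeparator = split("user42\tlogged in", (byte) '\t');
        String[] withoutSeparator = split("no separator here", (byte) '\t');
        System.out.println("[" + withSeparator[0] + "] [" + withSeparator[1] + "]");
        System.out.println("[" + withoutSeparator[0] + "] [" + withoutSeparator[1] + "]");
    }
}
```

Working on bytes rather than characters mirrors the "separator byte" wording; an ASCII separator never occurs inside a multi-byte UTF-8 sequence, so the split cannot cut a character in half.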

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
FileInputFormat.Counter
Field Summary

Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
Constructor Summary

- KeyValueTextInputFormat()

Method Summary

- void configure(JobConf conf): Initializes a new instance from a JobConf.
- RecordReader<Text,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter): Get the RecordReader for the given InputSplit.
- protected boolean isSplitable(FileSystem fs, Path file): Is the given filename splittable?

Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- KeyValueTextInputFormat
  
  public KeyValueTextInputFormat()
Method Details
- configure
  
  public void configure(JobConf conf)
  
  Description copied from interface: JobConfigurable
  
  Initializes a new instance from a JobConf.
  
  Specified by:
  
  configure in interface JobConfigurable
  
  Parameters:
  
  conf - the configuration
- isSplitable
  
  protected boolean isSplitable(FileSystem fs, Path file)
  
  Description copied from class: FileInputFormat
  
  Reports whether the given file is splittable. The answer is usually true, but a stream-compressed file is not splittable. The default implementation in FileInputFormat always returns true, so implementations that may deal with non-splittable files must override this method. FileInputFormat implementations can also override it to return false, which ensures that individual input files are never split and that each Mapper processes an entire file.
  
  Overrides:
  
  isSplitable in class FileInputFormat<Text,Text>
  
  Parameters:
  
  fs - the file system that the file is on
  
  file - the file name to check
  
  Returns:
  
  true if the file can be split, false otherwise
- getRecordReader
  
  public RecordReader<Text,Text> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
  
  Description copied from interface: InputFormat
  
  Get the RecordReader for the given InputSplit.
  It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
  
  Specified by:
  
  getRecordReader in interface InputFormat<Text,Text>
  
  Specified by:
  
  getRecordReader in class FileInputFormat<Text,Text>
  
  Parameters:
  
  genericSplit - the InputSplit
  
  job - the job that this split belongs to
  
  reporter - the Reporter used to report progress for the task
  
  Returns:
  
  a RecordReader
  
  Throws:
  
  IOException
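The interaction between isSplitable and the record-boundary responsibility of the RecordReader can be sketched in plain Java. This is a simplified model under stated assumptions, not Hadoop's code: `SplitReadSketch`, `planSplits`, and `readSplit` are hypothetical names, splits are fixed-size byte ranges, only linefeed is treated as a line terminator, and the ownership rule (a split reads every line that begins at or before its end offset, and a split that does not start at offset 0 discards its first line) is one common convention chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of split planning and boundary-aware reading; not part of Hadoop.
final class SplitReadSketch {

    /** Plans {offset, length} byte ranges; a non-splittable file yields a single range. */
    static List<int[]> planSplits(int fileLength, int splitSize, boolean splitable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<int[]> splits = new ArrayList<>();
        if (!splitable) {
            splits.add(new int[] { 0, fileLength });
            return splits;
        }
        for (int offset = 0; offset < fileLength; offset += splitSize) {
            splits.add(new int[] { offset, Math.min(splitSize, fileLength - offset) });
        }
        return splits;
    }

    /** Reads the whole lines owned by one split, even when they cross its end offset. */
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        if (start != 0) {
            // Discard everything up to and including the first linefeed;
            // the previous split is responsible for that line.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> records = new ArrayList<>();
        // Read every line that begins at or before the end offset of this split.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the linefeed
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "k1\tv1\nk2\tv2\nk3\tv3\n".getBytes(StandardCharsets.UTF_8);
        for (boolean splitable : new boolean[] { true, false }) {
            for (int[] s : planSplits(data.length, 7, splitable)) {
                System.out.println("splitable=" + splitable + " offset=" + s[0]
                        + " length=" + s[1] + " records=" + readSplit(data, s[0], s[1]));
            }
        }
    }
}
```

With the 18-byte sample and a split size of 7, the splittable plan produces three ranges whose readers return two, one, and zero records, so every line is delivered exactly once even though the split boundaries fall mid-line. The non-splittable plan produces one range covering the whole file, which is the effect of returning false from isSplitable.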
