org.apache.hadoop.mapreduce.lib.input
Class TextInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<LongWritable,Text>
          extended by org.apache.hadoop.mapreduce.lib.input.TextInputFormat

@InterfaceAudience.Public
@InterfaceStability.Stable
public class TextInputFormat
extends FileInputFormat<LongWritable,Text>

An InputFormat for plain text files. Files are broken into lines. Either a linefeed or a carriage return signals the end of a line. Keys are the byte offset of the line within the file, and values are the text of the line.
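
The key/value layout described above can be illustrated without a cluster. The following is a minimal sketch using only the Java standard library; it is not Hadoop's implementation. The class name `OffsetLineSketch`, its `Record` type, and the `records` method are hypothetical names introduced for illustration, and the sketch assumes that a carriage return followed by a linefeed counts as a single line terminator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OffsetLineSketch {
    /** One record: byte offset of the line start (key) and the line text (value). */
    public static final class Record {
        public final long offset;
        public final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Splits raw bytes into (offset, line) records; terminators are not part of the value. */
    public static List<Record> records(byte[] data) {
        List<Record> out = new ArrayList<>();
        int start = 0; // byte offset where the current line begins
        int i = 0;
        while (i < data.length) {
            byte b = data[i];
            if (b == '\n' || b == '\r') {
                out.add(new Record(start,
                        new String(data, start, i - start, StandardCharsets.UTF_8)));
                // Assumption: CR immediately followed by LF is one terminator.
                if (b == '\r' && i + 1 < data.length && data[i + 1] == '\n') {
                    i++;
                }
                i++;
                start = i;
            } else {
                i++;
            }
        }
        // A final line without a terminator is still emitted.
        if (start < data.length) {
            out.add(new Record(start,
                    new String(data, start, data.length - start, StandardCharsets.UTF_8)));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8);
        for (Record r : records(data)) {
            System.out.println(r.offset + "\t" + r.line);
        }
    }
}
```

For the sample input, the keys are 0, 6, and 12. The keys are byte positions rather than line numbers, so they depend on the length of every preceding line and its terminator.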


Field Summary
 
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
 
Constructor Summary
TextInputFormat()
           
 
Method Summary
 RecordReader<LongWritable,Text> createRecordReader(InputSplit split, TaskAttemptContext context)
          Create a record reader for a given split.
protected  boolean isSplitable(JobContext context, Path file)
          Reports whether the given file can be split. Usually true, but false if the file is stream compressed.
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextInputFormat

public TextInputFormat()
Method Detail

createRecordReader

public RecordReader<LongWritable,Text> createRecordReader(InputSplit split,
                                                          TaskAttemptContext context)
Description copied from class: InputFormat
Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.

Specified by:
createRecordReader in class InputFormat<LongWritable,Text>
Parameters:
split - the split to be read
context - the information about the task
Returns:
a new record reader
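
The calling order described above (create the reader, initialize it, then iterate) can be sketched as follows. This is a standard-library stand-in, not Hadoop code: `MiniRecordReader` and `MiniLineReader` are hypothetical types that only mirror the method names of `RecordReader`, a list of strings stands in for the split, and the offset arithmetic assumes single-byte characters with a one-byte line terminator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReaderLifecycleSketch {
    /** Hypothetical stand-in that mirrors the method names of a record reader. */
    interface MiniRecordReader {
        void initialize(List<String> split);
        boolean nextKeyValue();
        long getCurrentKey();
        String getCurrentValue();
    }

    /** Hypothetical line reader over an in-memory "split". */
    static final class MiniLineReader implements MiniRecordReader {
        private Iterator<String> lines;
        private long nextOffset;
        private long key;
        private String value;

        @Override
        public void initialize(List<String> split) {
            lines = split.iterator();
            nextOffset = 0;
        }

        @Override
        public boolean nextKeyValue() {
            if (lines == null) {
                throw new IllegalStateException("initialize() must be called first");
            }
            if (!lines.hasNext()) {
                return false;
            }
            value = lines.next();
            key = nextOffset;
            // Assumption: single-byte characters and a one-byte terminator.
            nextOffset += value.length() + 1;
            return true;
        }

        @Override
        public long getCurrentKey() {
            return key;
        }

        @Override
        public String getCurrentValue() {
            return value;
        }
    }

    /** Plays the framework's role: create, initialize, then drain the reader. */
    public static String drive(List<String> split) {
        MiniRecordReader reader = new MiniLineReader(); // stands in for createRecordReader
        reader.initialize(split);                       // always before the first read
        StringBuilder sb = new StringBuilder();
        while (reader.nextKeyValue()) {
            sb.append(reader.getCurrentKey()).append('=')
              .append(reader.getCurrentValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(drive(Arrays.asList("ab", "cde")));
    }
}
```

In the sketch, the reader does no work in its constructor; all setup happens in `initialize`, which is why a reader returned by `createRecordReader` should not be read from until the framework has initialized it.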

isSplitable

protected boolean isSplitable(JobContext context,
                              Path file)
Description copied from class: FileInputFormat
Reports whether the given file can be split. Usually true, but false if the file is stream compressed. FileInputFormat implementations can override this method and return false to ensure that individual input files are never split up, so that each Mapper processes an entire file.

Overrides:
isSplitable in class FileInputFormat<LongWritable,Text>
Parameters:
context - the job context
file - the file name to check
Returns:
true if the file can be split, false otherwise
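
The decision described above can be sketched with the standard library alone. This is a simplified illustration, not the class's actual logic: the extension table, the class name `SplitDecisionSketch`, and the `forceWholeFiles` flag are hypothetical, and a real job would more plausibly decide based on the compression codecs configured for it than on a fixed table.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitDecisionSketch {
    // Hypothetical table: file extension -> whether that compression format can be split.
    private static final Map<String, Boolean> CODEC_SPLITTABLE = new HashMap<>();
    static {
        CODEC_SPLITTABLE.put(".gz", false);  // gzip is a stream format
        CODEC_SPLITTABLE.put(".bz2", true);  // bzip2 is block oriented
    }

    /**
     * Returns true if a file with this name may be divided into several splits.
     * forceWholeFiles models a subclass that overrides the check to always return false.
     */
    public static boolean isSplitable(String fileName, boolean forceWholeFiles) {
        if (forceWholeFiles) {
            return false; // each mapper then receives an entire file
        }
        for (Map.Entry<String, Boolean> e : CODEC_SPLITTABLE.entrySet()) {
            if (fileName.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return true; // no known compression extension: treat as plain text
    }

    public static void main(String[] args) {
        System.out.println(isSplitable("logs/part-0001.txt", false));
        System.out.println(isSplitable("logs/part-0001.gz", false));
        System.out.println(isSplitable("logs/part-0001.txt", true));
    }
}
```

Returning false trades parallelism for simplicity: a large unsplit file is processed by a single mapper, which matters most for stream-compressed inputs.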


Copyright © 2014 Apache Software Foundation. All Rights Reserved.