org.apache.hadoop.mapreduce.lib.input
Class FixedLengthInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<LongWritable,BytesWritable>
          extended by org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat

@InterfaceAudience.Public
@InterfaceStability.Stable
public class FixedLengthInputFormat
extends FileInputFormat<LongWritable,BytesWritable>

FixedLengthInputFormat is an input format for reading input files that contain fixed-length records. The content of a record need not be text; it can be arbitrary binary data. Users must configure the record length property by calling:

FixedLengthInputFormat.setRecordLength(conf, recordLength);

or

conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
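As an illustration of the configuration step described above, the following is a minimal driver sketch rather than a definitive setup. The class name FixedLengthJobDriver, the 128-byte record length, the job name, and the use of command-line arguments for the input and output paths are all assumptions made for the example. It requires the Hadoop MapReduce client libraries on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class, for illustration only.
public class FixedLengthJobDriver {

  // Assumed record size in bytes; use the real size of your records.
  private static final int RECORD_LENGTH = 128;

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Set the record length before creating the Job: Job.getInstance copies
    // the configuration, so later changes to 'conf' would not reach the job.
    FixedLengthInputFormat.setRecordLength(conf, RECORD_LENGTH);
    // Equivalent form:
    // conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, RECORD_LENGTH);

    Job job = Job.getInstance(conf, "fixed-length-example");
    job.setJarByClass(FixedLengthJobDriver.class);
    job.setInputFormatClass(FixedLengthInputFormat.class);

    // The base Mapper class passes each (key, value) pair through unchanged.
    job.setMapperClass(Mapper.class);
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(BytesWritable.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));   // input path (assumed argument)
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path (assumed argument)

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

If the job object already exists, the same setting can instead be applied to job.getConfiguration().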

See Also:
FixedLengthRecordReader

Field Summary
static String FIXED_RECORD_LENGTH
          Name of the configuration property that holds the record length.
 
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
 
Constructor Summary
FixedLengthInputFormat()
           
 
Method Summary
 RecordReader<LongWritable,BytesWritable> createRecordReader(InputSplit split, TaskAttemptContext context)
          Create a record reader for a given split.
static int getRecordLength(Configuration conf)
          Get the configured record length.
protected  boolean isSplitable(JobContext context, Path file)
          Is the given file splitable? Usually true, but not if the file is stream compressed.
static void setRecordLength(Configuration conf, int recordLength)
          Set the length of each record, in bytes.
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FIXED_RECORD_LENGTH

public static final String FIXED_RECORD_LENGTH
See Also:
Constant Field Values
Constructor Detail

FixedLengthInputFormat

public FixedLengthInputFormat()
Method Detail

setRecordLength

public static void setRecordLength(Configuration conf,
                                   int recordLength)
Set the length of each record, in bytes.

Parameters:
conf - the configuration to update
recordLength - the length of each record, in bytes

getRecordLength

public static int getRecordLength(Configuration conf)
Get the configured record length.

Parameters:
conf - the configuration to read
Returns:
the record length, or zero if none was set
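
Because a return value of zero means no record length was configured, callers may want to fail fast before submitting a job. The sketch below shows one way to do that; the class RecordLengthCheck and the helper requireRecordLength are hypothetical names introduced for this example and are not part of Hadoop.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat;

// Hypothetical helper class, for illustration only.
public class RecordLengthCheck {

  /**
   * Returns the configured record length, or throws if none was set.
   * Treating non-positive values as an error is a choice made for this sketch.
   */
  static int requireRecordLength(Configuration conf) {
    int recordLength = FixedLengthInputFormat.getRecordLength(conf);
    if (recordLength <= 0) {
      throw new IllegalStateException(
          "Record length is not set; call "
              + "FixedLengthInputFormat.setRecordLength(conf, n) first");
    }
    return recordLength;
  }
}
```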

createRecordReader

public RecordReader<LongWritable,BytesWritable> createRecordReader(InputSplit split,
                                                                   TaskAttemptContext context)
                                                            throws IOException,
                                                                   InterruptedException
Description copied from class: InputFormat
Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.

Specified by:
createRecordReader in class InputFormat<LongWritable,BytesWritable>
Parameters:
split - the split to be read
context - the information about the task
Returns:
a new record reader
Throws:
IOException
InterruptedException
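
To make the key and value types concrete, the following plain-Java sketch mimics what reading fixed-length records looks like, without using Hadoop at all. It is not the FixedLengthRecordReader implementation. The class FixedLengthSketch, the method readRecords, the use of the byte offset as the key, and the decision to reject a trailing partial record are all assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it does not depend on Hadoop.
public class FixedLengthSketch {

  /**
   * Cuts 'data' into consecutive records of exactly 'recordLength' bytes.
   * Each record is keyed by its starting byte offset (an assumption that
   * stands in for the LongWritable key); the byte[] stands in for the
   * BytesWritable value.
   */
  public static Map<Long, byte[]> readRecords(byte[] data, int recordLength) {
    if (recordLength <= 0) {
      throw new IllegalArgumentException("recordLength must be positive");
    }
    if (data.length % recordLength != 0) {
      // Sketch-level choice: refuse input that ends in a partial record.
      throw new IllegalArgumentException("input ends with a partial record");
    }
    Map<Long, byte[]> records = new LinkedHashMap<>();
    for (int offset = 0; offset < data.length; offset += recordLength) {
      records.put((long) offset,
          Arrays.copyOfRange(data, offset, offset + recordLength));
    }
    return records;
  }

  public static void main(String[] args) {
    byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    // Prints one line per 4-byte record: its offset, then its bytes.
    for (Map.Entry<Long, byte[]> e : readRecords(data, 4).entrySet()) {
      System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
    }
  }
}
```

The records are binary, so the sketch never decodes them as text; any interpretation of the bytes is left to the caller, as it would be in a Mapper.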

isSplitable

protected boolean isSplitable(JobContext context,
                              Path file)
Description copied from class: FileInputFormat
Is the given file splitable? Usually true, but not if the file is stream compressed. FileInputFormat implementations can override this method and return false to ensure that individual input files are never split up, so that each Mapper processes an entire file.

Overrides:
isSplitable in class FileInputFormat<LongWritable,BytesWritable>
Parameters:
context - the job context
file - the file name to check
Returns:
true if this file can be split, false otherwise
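
The override described for isSplitable can be sketched as a small subclass that always returns false, so that each input file is handled by a single Mapper. The class name WholeFileFixedLengthInputFormat is hypothetical and not part of Hadoop. Compiling it requires the Hadoop MapReduce client libraries on the classpath.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FixedLengthInputFormat;

// Hypothetical subclass, for illustration only.
public class WholeFileFixedLengthInputFormat extends FixedLengthInputFormat {

  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    // Never split: every file becomes a single input split.
    return false;
  }
}
```

A job would select it with job.setInputFormatClass(WholeFileFixedLengthInputFormat.class). The trade-off is reduced parallelism for large files, since each file is then read by only one task.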


Copyright © 2014 Apache Software Foundation. All Rights Reserved.