org.apache.hadoop.mapred
Class FixedLengthInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<LongWritable,BytesWritable>
      extended by org.apache.hadoop.mapred.FixedLengthInputFormat
All Implemented Interfaces:
InputFormat<LongWritable,BytesWritable>, JobConfigurable

@InterfaceAudience.Public
@InterfaceStability.Stable
public class FixedLengthInputFormat
extends FileInputFormat<LongWritable,BytesWritable>
implements JobConfigurable

FixedLengthInputFormat is an input format for reading input files that contain fixed-length records. The content of a record need not be text; it can be arbitrary binary data. Users must configure the record length by calling either:

FixedLengthInputFormat.setRecordLength(conf, recordLength);

or

conf.setInt(FixedLengthInputFormat.FIXED_RECORD_LENGTH, recordLength);
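
To make the record model concrete, here is a minimal, Hadoop-free sketch of what reading fixed-length records amounts to: the input is cut into consecutive chunks of recordLength bytes, and each chunk is kept as raw bytes. The class and method names (FixedLengthRecordsSketch, toRecords) are hypothetical and not part of Hadoop. Keying each record by its starting byte offset and rejecting a trailing partial record are assumptions of this sketch, not behavior stated on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Hadoop API.
final class FixedLengthRecordsSketch {

    /**
     * Cuts data into consecutive records of recordLength bytes,
     * keyed by the byte offset at which each record starts.
     */
    static Map<Long, byte[]> toRecords(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be > 0");
        }
        // Assumption of this sketch: a trailing partial record is an error.
        if (data.length % recordLength != 0) {
            throw new IllegalArgumentException("partial record at end of input");
        }
        Map<Long, byte[]> records = new LinkedHashMap<>();
        for (int offset = 0; offset < data.length; offset += recordLength) {
            // Bytes are copied as-is; no text decoding is applied.
            records.put((long) offset,
                    Arrays.copyOfRange(data, offset, offset + recordLength));
        }
        return records;
    }
}
```

For example, six bytes of input with a record length of 3 yield two records, one starting at offset 0 and one at offset 3.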

See Also:
FixedLengthRecordReader

Field Summary
static String FIXED_RECORD_LENGTH
           
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
 
Constructor Summary
FixedLengthInputFormat()
           
 
Method Summary
 void configure(JobConf conf)
          Initializes a new instance from a JobConf.
static int getRecordLength(Configuration conf)
          Get record length value
 RecordReader<LongWritable,BytesWritable> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter)
          Get the RecordReader for the given InputSplit.
protected  boolean isSplitable(FileSystem fs, Path file)
          Is the given file splitable? Usually true, but not if the file is stream compressed.
static void setRecordLength(Configuration conf, int recordLength)
          Set the length of each record
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FIXED_RECORD_LENGTH

public static final String FIXED_RECORD_LENGTH
See Also:
Constant Field Values
Constructor Detail

FixedLengthInputFormat

public FixedLengthInputFormat()
Method Detail

setRecordLength

public static void setRecordLength(Configuration conf,
                                   int recordLength)
Set the length of each record

Parameters:
conf - configuration
recordLength - the length of each record, in bytes

getRecordLength

public static int getRecordLength(Configuration conf)
Get record length value

Parameters:
conf - configuration
Returns:
the record length; zero means none was set
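
Because a return value of zero signals that no length was configured, callers will usually want to validate the value before using it. The sketch below shows such a check, with a plain java.util.Map standing in for Configuration so that it runs without Hadoop. The class name, the key string, and the choice of exception are hypothetical and are not Hadoop's own.

```java
import java.util.Map;

// Hypothetical illustration only; a Map stands in for Hadoop's Configuration.
final class RecordLengthCheckSketch {

    // Made-up key for this sketch; not the value of FIXED_RECORD_LENGTH.
    static final String KEY = "example.record.length";

    /** Returns the configured length, or 0 if none was set. */
    static int getRecordLength(Map<String, String> conf) {
        String raw = conf.get(KEY);
        return raw == null ? 0 : Integer.parseInt(raw);
    }

    /** Returns the configured length, failing fast if it is missing or invalid. */
    static int requireRecordLength(Map<String, String> conf) {
        int length = getRecordLength(conf);
        if (length <= 0) {
            throw new IllegalStateException(
                    "record length must be greater than zero, got " + length);
        }
        return length;
    }
}
```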

configure

public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.

Specified by:
configure in interface JobConfigurable
Parameters:
conf - the configuration

getRecordReader

public RecordReader<LongWritable,BytesWritable> getRecordReader(InputSplit genericSplit,
                                                                JobConf job,
                                                                Reporter reporter)
                                                         throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.

It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.

Specified by:
getRecordReader in interface InputFormat<LongWritable,BytesWritable>
Specified by:
getRecordReader in class FileInputFormat<LongWritable,BytesWritable>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
reporter - facility to report progress
Returns:
a RecordReader
Throws:
IOException
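
With fixed-length records, respecting record boundaries reduces to arithmetic on byte offsets, because a split may begin in the middle of a record. The sketch below shows one way to do this, under the assumption (not confirmed by this page) that each split handles exactly the records that begin inside it. The class and method names are hypothetical, and the code does not use any Hadoop types.

```java
// Hypothetical illustration only; not part of the Hadoop API.
final class SplitAlignmentSketch {

    /** Bytes to skip from the split start to reach the next record boundary. */
    static long bytesToSkip(long splitStart, int recordLength) {
        long partial = splitStart % recordLength;
        return partial == 0 ? 0 : recordLength - partial;
    }

    /**
     * Number of records that begin inside
     * [splitStart, splitStart + splitLength). A record that begins in the
     * split but ends beyond it is still counted for this split.
     */
    static long recordsStartingInSplit(long splitStart, long splitLength,
                                       int recordLength) {
        long remaining = splitLength - bytesToSkip(splitStart, recordLength);
        if (remaining <= 0) {
            return 0; // no record boundary falls inside this split
        }
        // Ceiling division: a final, partially covered record still counts.
        return (remaining + recordLength - 1) / recordLength;
    }
}
```

As a check, take a 100-byte file with 10-byte records and splits of 35, 35, and 30 bytes: the sketch assigns 4, 3, and 3 records to the three splits, so all 10 records are covered exactly once.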

isSplitable

protected boolean isSplitable(FileSystem fs,
                              Path file)
Description copied from class: FileInputFormat
Is the given file splitable? Usually true, but not if the file is stream compressed. FileInputFormat implementations can override this and return false to ensure that individual input files are never split, so that each Mapper processes an entire file.

Overrides:
isSplitable in class FileInputFormat<LongWritable,BytesWritable>
Parameters:
fs - the file system that the file is on
file - the file name to check
Returns:
true if the file is splitable, false otherwise


Copyright © 2014 Apache Software Foundation. All Rights Reserved.