org.apache.hadoop.mapred
Class MultiFileInputFormat<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<K,V>
      extended by org.apache.hadoop.mapred.MultiFileInputFormat<K,V>
All Implemented Interfaces:
InputFormat<K,V>

@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class MultiFileInputFormat<K,V>
extends FileInputFormat<K,V>

An abstract InputFormat whose getSplits(JobConf, int) method returns MultiFileSplits. Splits are constructed from the files under the input paths. Each returned split covers a nearly equal total content length.
Subclasses implement getRecordReader(InputSplit, JobConf, Reporter) to construct RecordReaders for MultiFileSplits.

See Also:
MultiFileSplit

Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
 
Constructor Summary
MultiFileInputFormat()
           
 
Method Summary
abstract  RecordReader<K,V> getRecordReader(InputSplit split, JobConf job, Reporter reporter)
          Get the RecordReader for the given InputSplit.
 InputSplit[] getSplits(JobConf job, int numSplits)
          Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MultiFileInputFormat

public MultiFileInputFormat()

Method Detail

getSplits

public InputSplit[] getSplits(JobConf job,
                              int numSplits)
                       throws IOException
Description copied from class: FileInputFormat
Splits files returned by FileInputFormat.listStatus(JobConf) when they're too big.

Specified by:
getSplits in interface InputFormat<K,V>
Overrides:
getSplits in class FileInputFormat<K,V>
Parameters:
job - job configuration.
numSplits - the desired number of splits, a hint.
Returns:
an array of InputSplits for the job.
Throws:
IOException
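
The class description says each split holds a nearly equal content length, and numSplits is only a hint. The sketch below is one way to picture that grouping. It is not Hadoop's implementation: SplitGroupingSketch and its group method are hypothetical names, it uses plain Java types instead of JobConf and MultiFileSplit, and the greedy "close a group once the running total reaches the next average boundary" rule is an assumption chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not a Hadoop class: models how whole files could be
// grouped into splits of nearly equal total length.
class SplitGroupingSketch {

    /**
     * Groups file indices into at most numSplits groups. Files are taken in
     * order and never cut; a group is closed once the bytes assigned so far
     * reach the next multiple of the average length per group.
     */
    static List<List<Integer>> group(long[] lengths, int numSplits) {
        List<List<Integer>> groups = new ArrayList<>();
        if (lengths.length == 0) {
            return groups;
        }
        // The requested count is only a hint: never more groups than files.
        int target = Math.max(1, Math.min(numSplits, lengths.length));
        long total = 0;
        for (long len : lengths) {
            total += len;
        }
        double avg = (double) total / target;

        List<Integer> current = new ArrayList<>();
        long consumed = 0; // bytes assigned so far, across all groups
        for (int i = 0; i < lengths.length; i++) {
            current.add(i);
            consumed += lengths[i];
            // The last allowed group stays open and absorbs whatever remains.
            boolean lastGroup = groups.size() == target - 1;
            if (!lastGroup && consumed >= avg * (groups.size() + 1)) {
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long[] lengths = {40, 10, 30, 20, 50, 50};
        for (List<Integer> g : group(lengths, 3)) {
            long sum = 0;
            for (int i : g) {
                sum += lengths[i];
            }
            System.out.println("files " + g + " total " + sum + " bytes");
        }
    }
}
```

Because files are kept whole in this sketch, the groups can only be as balanced as the file sizes allow, and the result may contain fewer groups than requested.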

getRecordReader

public abstract RecordReader<K,V> getRecordReader(InputSplit split,
                                                  JobConf job,
                                                  Reporter reporter)
                                           throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.

It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.

Specified by:
getRecordReader in interface InputFormat<K,V>
Specified by:
getRecordReader in class FileInputFormat<K,V>
Parameters:
split - the InputSplit
job - the job that this split belongs to
reporter - facility to report progress
Returns:
a RecordReader
Throws:
IOException
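
A reader built for a multi-file split has to move from one file to the next without merging the last record of one file into the first record of the following one. The sketch below models that pattern with plain Java I/O. It is an illustration under stated assumptions: MultiFileLineReaderSketch is a hypothetical class, it does not implement Hadoop's RecordReader interface, it treats one text line as one record, and it reads from local paths rather than from a MultiFileSplit.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not a Hadoop class: reads line records from several
// whole files in order, as a reader over a multi-file split might.
class MultiFileLineReaderSketch implements Closeable {
    private final List<Path> paths;
    private int nextFile = 0;       // index of the next file to open
    private BufferedReader current; // reader for the file in progress, if any

    MultiFileLineReaderSketch(List<Path> paths) {
        this.paths = paths;
    }

    /** Returns the next record, or null once every file is exhausted. */
    String next() throws IOException {
        while (true) {
            if (current == null) {
                if (nextFile == paths.size()) {
                    return null;
                }
                current = Files.newBufferedReader(paths.get(nextFile++), StandardCharsets.UTF_8);
            }
            String line = current.readLine();
            if (line != null) {
                return line;
            }
            // End of this file: close it so the next record starts cleanly
            // at the beginning of the following file.
            current.close();
            current = null;
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("part-a", ".txt");
        Path b = Files.createTempFile("part-b", ".txt");
        // The first file deliberately lacks a trailing newline.
        Files.write(a, "a1\na2".getBytes(StandardCharsets.UTF_8));
        Files.write(b, "b1\n".getBytes(StandardCharsets.UTF_8));
        try (MultiFileLineReaderSketch reader = new MultiFileLineReaderSketch(Arrays.asList(a, b))) {
            for (String record = reader.next(); record != null; record = reader.next()) {
                System.out.println(record);
            }
        } finally {
            Files.delete(a);
            Files.delete(b);
        }
    }
}
```

A real subclass would instead return an object implementing RecordReader<K,V> from getRecordReader and obtain the file list from the split it is given.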


Copyright © 2014 Apache Software Foundation. All Rights Reserved.