org.apache.hadoop.mapreduce
Class InputSplit

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputSplit
Direct Known Subclasses:
CombineFileSplit, CompositeInputSplit, FileSplit (org.apache.hadoop.mapred), FileSplit (org.apache.hadoop.mapreduce.lib.input)

@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class InputSplit
extends Object

InputSplit represents the data to be processed by an individual Mapper.

Typically, it presents a byte-oriented view of the input; the job's RecordReader is responsible for processing that view and presenting a record-oriented one.
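
The division of labour described above can be illustrated with a minimal sketch that uses no Hadoop types. ByteRangeSplit and LineRecords are hypothetical names introduced only for this example, and the rule used here (a line belongs to the split that contains its first byte) is a simplifying assumption, not necessarily the convention used by Hadoop's own readers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a split: a plain byte range over some input.
final class ByteRangeSplit {
    final int start;
    final int length;

    ByteRangeSplit(int start, int length) {
        this.start = start;
        this.length = length;
    }
}

// Hypothetical stand-in for a record reader: turns a byte range into
// newline-delimited records.
final class LineRecords {
    static List<String> read(byte[] data, ByteRangeSplit split) {
        int pos = split.start;
        int end = split.start + split.length;

        // If the range begins in the middle of a line, skip forward to the next
        // line start; the reader of the preceding range emits that line.
        if (pos != 0) {
            while (pos < data.length && data[pos - 1] != '\n') {
                pos++;
            }
        }

        List<String> records = new ArrayList<>();
        // Emit every line whose first byte lies inside the range, reading past
        // the end of the range if needed to finish the last line.
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }
}
```

Under this assumed rule, two adjacent byte ranges that cut a line in half still produce each line exactly once, which is the kind of record-boundary handling the byte-oriented split itself does not perform.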

See Also:
InputFormat, RecordReader

Constructor Summary
InputSplit()
           
 
Method Summary
abstract  long getLength()
          Get the size of the split, so that the input splits can be sorted by size.
 SplitLocationInfo[] getLocationInfo()
          Gets info about which nodes the input split is stored on and how it is stored at each location.
abstract  String[] getLocations()
          Get the list of nodes by name where the data for the split would be local.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputSplit

public InputSplit()

Method Detail

getLength

public abstract long getLength()
                        throws IOException,
                               InterruptedException
Get the size of the split, so that the input splits can be sorted by size.

Returns:
the number of bytes in the split
Throws:
IOException
InterruptedException
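
A minimal sketch of sorting splits by size follows. To stay compilable without Hadoop on the classpath, it declares a stand-in abstract class that mirrors only the two abstract signatures documented on this page; real code would import org.apache.hadoop.mapreduce.InputSplit instead. FixedSplit and SplitSorter are hypothetical names, and the largest-first order is an assumption chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Stand-in mirroring the abstract signatures documented on this page.
abstract class InputSplit {
    public abstract long getLength() throws IOException, InterruptedException;
    public abstract String[] getLocations() throws IOException, InterruptedException;
}

// Hypothetical split with a fixed size and host list.
final class FixedSplit extends InputSplit {
    private final long length;
    private final String[] hosts;

    FixedSplit(long length, String... hosts) {
        this.length = length;
        this.hosts = hosts.clone();
    }

    @Override public long getLength() { return length; }
    @Override public String[] getLocations() { return hosts.clone(); }
}

final class SplitSorter {
    // getLength() declares checked exceptions, so wrap them for use in a comparator.
    static long lengthOf(InputSplit split) {
        try {
            return split.getLength();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading split length", e);
        }
    }

    // Returns a new list ordered by size, largest split first.
    static List<InputSplit> largestFirst(List<? extends InputSplit> splits) {
        List<InputSplit> sorted = new ArrayList<>(splits);
        sorted.sort((a, b) -> Long.compare(lengthOf(b), lengthOf(a)));
        return sorted;
    }
}
```

Scheduling the largest splits first is one plausible use of the size ordering, since it tends to keep the longest-running tasks from starting last.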

getLocations

public abstract String[] getLocations()
                               throws IOException,
                                      InterruptedException
Get the list of nodes by name where the data for the split would be local. The locations do not need to be serialized.

Returns:
a new array of the node names.
Throws:
IOException
InterruptedException
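
The two points above (a new array on every call, and locations that need not be serialized) can be sketched as follows. RangeSplit is a hypothetical class that does not extend the Hadoop type; its write and readFields methods are modelled on the usual Writable pattern, and returning an empty array after deserialization is an assumption made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical split over a byte range of a named file; not a Hadoop class.
final class RangeSplit {
    private String path;
    private long start;
    private long length;
    private String[] hosts; // placement hint only; deliberately not serialized

    RangeSplit() { }

    RangeSplit(String path, long start, long length, String[] hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts.clone();
    }

    public long getLength() { return length; }

    // Returns a fresh array so callers cannot mutate the split's own state.
    public String[] getLocations() {
        return hosts == null ? new String[0] : hosts.clone();
    }

    // Only the fields a task needs to read its data are written out.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(start);
        out.writeLong(length);
    }

    public void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        start = in.readLong();
        length = in.readLong();
        hosts = null; // locations are not part of the serialized form
    }

    // Serializes and deserializes a split in memory, for demonstration only.
    static RangeSplit roundTrip(RangeSplit split) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            split.write(new DataOutputStream(buffer));
            RangeSplit copy = new RangeSplit();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the host names matter only while work is being assigned to nodes; once a task has received the split it needs the path and byte range, not the placement hint.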

getLocationInfo

@InterfaceStability.Evolving
public SplitLocationInfo[] getLocationInfo()
                                    throws IOException
Gets info about which nodes the input split is stored on and how it is stored at each location.

Returns:
an array of SplitLocationInfo objects describing how the split data is stored at each location. A null value indicates that the data is stored on disk at all locations.
Throws:
IOException
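
Callers need to handle the null return value explicitly. The sketch below shows one way to do so. LocationInfo is a hypothetical stand-in with assumed fields (host, inMemory); it is not the real SplitLocationInfo class, whose accessors should be checked against its own documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-location storage description.
final class LocationInfo {
    final String host;
    final boolean inMemory;

    LocationInfo(String host, boolean inMemory) {
        this.host = host;
        this.inMemory = inMemory;
    }
}

final class LocationInfoHelper {
    // Returns the hosts that hold the split in memory.
    // A null array is read as "every location stores the data on disk",
    // so the result is empty in that case.
    static List<String> inMemoryHosts(LocationInfo[] info) {
        List<String> hosts = new ArrayList<>();
        if (info == null) {
            return hosts;
        }
        for (LocationInfo entry : info) {
            if (entry.inMemory) {
                hosts.add(entry.host);
            }
        }
        return hosts;
    }
}
```

Treating null as "all on disk" rather than as an error matches the contract stated in the Returns section above.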


Copyright © 2014 Apache Software Foundation. All Rights Reserved.