Class InputSplit

java.lang.Object
org.apache.hadoop.mapreduce.InputSplit
Direct Known Subclasses:
CombineFileSplit, CompositeInputSplit, FileSplit, FileSplit

@Public @Stable public abstract class InputSplit extends Object
InputSplit represents the data to be processed by an individual Mapper.

Typically, it presents a byte-oriented view of the input, and it is the responsibility of the job's RecordReader to process this and present a record-oriented view.
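
As an illustration of the contract described above, here is a minimal sketch of a split that covers a byte range of one file. The names SplitSketch and ByteRangeSplit and their fields are hypothetical and not part of Hadoop. SplitSketch is a stand-in that mirrors only the two abstract methods documented on this page, so the example compiles without Hadoop on the classpath. In a real job the class would extend org.apache.hadoop.mapreduce.InputSplit instead, and would typically also need to be serializable by the framework (for example by implementing Writable).

```java
import java.io.IOException;

// Stand-in mirroring the two abstract methods of InputSplit (assumption:
// used here only so the sketch is self-contained). In a real job, extend
// org.apache.hadoop.mapreduce.InputSplit instead.
abstract class SplitSketch {
    public abstract long getLength() throws IOException, InterruptedException;
    public abstract String[] getLocations() throws IOException, InterruptedException;
}

// Hypothetical split describing a contiguous byte range of a single file.
final class ByteRangeSplit extends SplitSketch {
    private final String path;    // file the bytes come from
    private final long start;     // offset of the first byte in the split
    private final long length;    // number of bytes in the split
    private final String[] hosts; // node names where the data is local

    ByteRangeSplit(String path, long start, long length, String... hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts.clone();
    }

    String getPath() { return path; }

    long getStart() { return start; }

    // Size of the split in bytes.
    @Override
    public long getLength() { return length; }

    // Returns a fresh copy on every call so callers cannot mutate our state.
    @Override
    public String[] getLocations() { return hosts.clone(); }
}
```

The split only describes which bytes to read and where they live. Turning those bytes into records is left to the RecordReader.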

  • Constructor Details

    • InputSplit

      public InputSplit()
  • Method Details

    • getLength

      public abstract long getLength() throws IOException, InterruptedException
      Get the size of the split, so that the input splits can be sorted by size.
      Returns:
      the number of bytes in the split
      Throws:
      IOException
      InterruptedException
    • getLocations

      public abstract String[] getLocations() throws IOException, InterruptedException
      Get the list of nodes by name where the data for the split would be local. The locations do not need to be serialized.
      Returns:
      a new array of the node names.
      Throws:
      IOException
      InterruptedException
    • getLocationInfo

      @Evolving public SplitLocationInfo[] getLocationInfo() throws IOException
      Gets info about which nodes the input split is stored on and how it is stored at each location.
      Returns:
      list of SplitLocationInfos describing how the split data is stored at each location. A null value indicates that all the locations have the data stored on disk.
      Throws:
      IOException
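
The getLength description above says the size exists so that splits can be sorted. The following sketch shows one way a caller might order splits from largest to smallest. SizedSplit and SplitOrdering are hypothetical names introduced for this example. SizedSplit is a stand-in exposing only getLength(), without the checked exceptions that the real method declares, so the code runs without Hadoop.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in exposing only getLength(). The real
// org.apache.hadoop.mapreduce.InputSplit#getLength() also declares
// IOException and InterruptedException, which a caller would have to handle.
interface SizedSplit {
    long getLength();
}

final class SplitOrdering {
    private SplitOrdering() {}

    // Returns a new list ordered from the largest split to the smallest.
    // The input list is left unmodified.
    static <T extends SizedSplit> List<T> largestFirst(List<T> splits) {
        List<T> sorted = new ArrayList<>(splits);
        Comparator<SizedSplit> bySize = Comparator.comparingLong(SizedSplit::getLength);
        sorted.sort(bySize.reversed());
        return sorted;
    }
}
```

Scheduling the largest splits first is one plausible use of such an ordering, since the longest-running tasks then start earliest. Whether a given framework version does this is not stated on this page.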