org.apache.hadoop.mapred
Class FileSplit

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputSplit
      extended by org.apache.hadoop.mapred.FileSplit
All Implemented Interfaces:
Writable, InputSplit, InputSplitWithLocationInfo

@InterfaceAudience.Public
@InterfaceStability.Stable
public class FileSplit
extends InputSplit
implements InputSplitWithLocationInfo

A section of an input file. Returned by InputFormat.getSplits(JobConf, int) and passed to InputFormat.getRecordReader(InputSplit, JobConf, Reporter).
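
As an illustration of what "a section of an input file" means, the sketch below cuts a file into consecutive (start, length) byte ranges. It is not Hadoop code: ByteRange and SplitPlanner are hypothetical stand-ins, and the fixed maximum split size is an assumption. A real InputFormat may choose boundaries differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a split: a byte range within one file. Not Hadoop's FileSplit. */
final class ByteRange {
    final String path;  // file the range belongs to
    final long start;   // position of the first byte to process
    final long length;  // number of bytes to process

    ByteRange(String path, long start, long length) {
        this.path = path;
        this.start = start;
        this.length = length;
    }

    @Override
    public String toString() {
        return path + ":" + start + "+" + length;
    }
}

/** Hypothetical helper that plans byte ranges for a single file. */
final class SplitPlanner {
    /** Cuts a file of fileLength bytes into consecutive ranges of at most maxSplitSize bytes. */
    static List<ByteRange> plan(String path, long fileLength, long maxSplitSize) {
        if (maxSplitSize <= 0) {
            throw new IllegalArgumentException("maxSplitSize must be positive");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += maxSplitSize) {
            // The last range may be shorter than maxSplitSize.
            long length = Math.min(maxSplitSize, fileLength - start);
            ranges.add(new ByteRange(path, start, length));
        }
        return ranges;
    }
}
```

Under these assumptions, planning a 10-byte file with a maximum split size of 4 yields three ranges starting at offsets 0, 4, and 8, the last one covering only the remaining 2 bytes.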


Constructor Summary
protected FileSplit()
           
  FileSplit(FileSplit fs)
           
  FileSplit(Path file, long start, long length, JobConf conf)
          Deprecated.  
  FileSplit(Path file, long start, long length, String[] hosts)
          Constructs a split with host information.
  FileSplit(Path file, long start, long length, String[] hosts, String[] inMemoryHosts)
          Constructs a split with host and in-memory host information.
 
Method Summary
 long getLength()
          The number of bytes in the file to process.
 SplitLocationInfo[] getLocationInfo()
          Gets info about which nodes the input split is stored on and how it is stored at each location.
 String[] getLocations()
          Get the list of nodes by name where the data for the split would be local.
 Path getPath()
          The file containing this split's data.
 long getStart()
          The position of the first byte in the file to process.
 void readFields(DataInput in)
          Deserialize the fields of this object from in.
 String toString()
           
 void write(DataOutput out)
          Serialize the fields of this object to out.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

FileSplit

protected FileSplit()

FileSplit

@Deprecated
public FileSplit(Path file,
                            long start,
                            long length,
                            JobConf conf)
Deprecated. 

Constructs a split.

Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process

FileSplit

public FileSplit(Path file,
                 long start,
                 long length,
                 String[] hosts)
Constructs a split with host information.

Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
hosts - the list of hosts containing the block, possibly null

FileSplit

public FileSplit(Path file,
                 long start,
                 long length,
                 String[] hosts,
                 String[] inMemoryHosts)
Constructs a split with host and in-memory host information.

Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
hosts - the list of hosts containing the block, possibly null
inMemoryHosts - the list of hosts containing the block in memory

FileSplit

public FileSplit(FileSplit fs)

Method Detail

getPath

public Path getPath()
The file containing this split's data.


getStart

public long getStart()
The position of the first byte in the file to process.


getLength

public long getLength()
The number of bytes in the file to process.

Specified by:
getLength in interface InputSplit
Specified by:
getLength in class InputSplit
Returns:
the number of bytes in the split

toString

public String toString()
Overrides:
toString in class Object

write

public void write(DataOutput out)
           throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.

Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.

For efficiency, implementations should attempt to re-use storage in the existing object where possible.

Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException
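
The write and readFields contract can be illustrated with a round trip through java.io streams. The sketch below is not Hadoop's implementation: SketchSplit is a hypothetical stand-in, and its wire layout (path as UTF, then start, then length) is an assumption chosen for the example. The host list is left out of the serialized form here, which is also an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a file split; the wire layout is illustrative only. */
final class SketchSplit {
    String path;
    long start;
    long length;
    String[] hosts; // locality hint, deliberately not serialized in this sketch

    /** No-arg constructor so a receiver can create an empty instance and call readFields. */
    SketchSplit() {}

    SketchSplit(String path, long start, long length, String[] hosts) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.hosts = hosts;
    }

    /** Serializes the fields of this object to out. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(start);
        out.writeLong(length);
    }

    /** Overwrites the fields of this object with values read from in, in the order write used. */
    void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        start = in.readLong();
        length = in.readLong();
        hosts = null; // not part of the serialized form
    }

    /** Serializes to an in-memory buffer and reads the bytes back into a fresh instance. */
    static SketchSplit roundTrip(SketchSplit original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SketchSplit copy = new SketchSplit();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch readFields fills an existing instance rather than returning a new one, which is how an object's storage can be re-used between records.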

getLocations

public String[] getLocations()
                      throws IOException
Description copied from class: InputSplit
Get the list of nodes by name where the data for the split would be local. The locations do not need to be serialized.

Specified by:
getLocations in interface InputSplit
Specified by:
getLocations in class InputSplit
Returns:
a new array of the node names.
Throws:
IOException

getLocationInfo

@InterfaceStability.Evolving
public SplitLocationInfo[] getLocationInfo()
                                    throws IOException
Description copied from class: InputSplit
Gets info about which nodes the input split is stored on and how it is stored at each location.

Specified by:
getLocationInfo in interface InputSplitWithLocationInfo
Overrides:
getLocationInfo in class InputSplit
Returns:
list of SplitLocationInfos describing how the split data is stored at each location. A null value indicates that all the locations have the data stored on disk.
Throws:
IOException
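
The null convention for the returned array can be sketched as follows. This is not Hadoop code: LocationSketch and LocationInfoSketch are hypothetical stand-ins, and pairing each host with an in-memory flag derived from a second host list is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for one location entry; not Hadoop's SplitLocationInfo. */
final class LocationSketch {
    final String host;
    final boolean inMemory;

    LocationSketch(String host, boolean inMemory) {
        this.host = host;
        this.inMemory = inMemory;
    }
}

/** Hypothetical helpers for building and reading location entries. */
final class LocationInfoSketch {
    /** Pairs each host with a flag saying whether it also appears in inMemoryHosts. */
    static LocationSketch[] build(String[] hosts, String[] inMemoryHosts) {
        // Assumption: hosts is non-null; a null inMemoryHosts means no host holds the data in memory.
        Set<String> cached = new HashSet<>();
        if (inMemoryHosts != null) {
            cached.addAll(Arrays.asList(inMemoryHosts));
        }
        LocationSketch[] infos = new LocationSketch[hosts.length];
        for (int i = 0; i < hosts.length; i++) {
            infos[i] = new LocationSketch(hosts[i], cached.contains(hosts[i]));
        }
        return infos;
    }

    /** Treats a null array as "all locations store the data on disk". */
    static boolean isInMemory(LocationSketch[] infos, String host) {
        if (infos == null) {
            return false;
        }
        for (LocationSketch info : infos) {
            if (info.host.equals(host)) {
                return info.inMemory;
            }
        }
        return false; // unknown host: no in-memory copy known
    }
}
```

A caller written this way needs no separate null check before asking whether a given host holds the split in memory.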


Copyright © 2014 Apache Software Foundation. All Rights Reserved.