Package org.apache.hadoop.mapred
Class FileSplit
java.lang.Object
org.apache.hadoop.mapreduce.InputSplit
org.apache.hadoop.mapred.FileSplit
All Implemented Interfaces:
Writable, InputSplit, InputSplitWithLocationInfo
A section of an input file. Returned by InputFormat.getSplits(JobConf, int) and passed to InputFormat.getRecordReader(InputSplit, JobConf, Reporter).
-
Constructor Summary
Constructors
Modifier, Constructor, Description:
protected FileSplit()
FileSplit: Constructs a split with host information
FileSplit: Constructs a split with host information
FileSplit: Deprecated.
-
Method Summary
Modifier and Type, Method, Description:
long getLength(): The number of bytes in the file to process.
getLocationInfo(): Gets info about which nodes the input split is stored on and how it is stored at each location.
String[] getLocations(): Get the list of nodes by name where the data for the split would be local.
getPath(): The file containing this split's data.
long getStart(): The position of the first byte in the file to process.
void readFields(DataInput in): Deserialize the fields of this object from in.
toString()
void write(DataOutput out): Serialize the fields of this object to out.
-
Constructor Details
-
FileSplit
protected FileSplit()
-
FileSplit
Deprecated.
Constructs a split.
Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
-
FileSplit
Constructs a split with host information.
Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
hosts - the list of hosts containing the block, possibly null
-
FileSplit
Constructs a split with host information.
Parameters:
file - the file name
start - the position of the first byte in the file to process
length - the number of bytes in the file to process
hosts - the list of hosts containing the block, possibly null
inMemoryHosts - the list of hosts containing the block in memory
-
FileSplit
-
-
Method Details
-
getPath
The file containing this split's data.
-
getStart
public long getStart()
The position of the first byte in the file to process.
-
getLength
public long getLength()
The number of bytes in the file to process.
Specified by:
getLength in interface InputSplit
Specified by:
getLength in class InputSplit
Returns:
the number of bytes in the split
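Taken together, getStart() and getLength() describe a half-open byte range [start, start + length) within the file. The following is a minimal sketch of that arithmetic using only the JDK. ByteRange and carve are hypothetical names introduced for illustration and are not part of Hadoop, and the fixed splitSize policy is an assumption rather than how any particular InputFormat computes its splits.

```java
import java.util.ArrayList;
import java.util.List;

public class ByteRangeSketch {

    /** Hypothetical stand-in for a split: a start offset and a byte count only. */
    public static final class ByteRange {
        public final long start;   // position of the first byte to process
        public final long length;  // number of bytes to process

        public ByteRange(long start, long length) {
            this.start = start;
            this.length = length;
        }

        /** Exclusive end offset: the first byte that is NOT part of this range. */
        public long end() {
            return start + length;
        }
    }

    /** Carves a file of fileLength bytes into consecutive ranges of at most splitSize bytes. */
    public static List<ByteRange> carve(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            // The last range may be shorter than splitSize.
            long length = Math.min(splitSize, fileLength - start);
            ranges.add(new ByteRange(start, length));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 250-byte file with 100-byte ranges yields lengths 100, 100 and 50.
        for (ByteRange r : carve(250L, 100L)) {
            System.out.println("start=" + r.start + " length=" + r.length + " end=" + r.end());
        }
    }
}
```

In this sketch the ranges are contiguous and non-overlapping, so each range's end() equals the next range's start.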
-
toString
-
write
Description copied from interface: Writable
Serialize the fields of this object to out.
Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException - if any problem occurs while writing.
-
readFields
Description copied from interface: Writable
Deserialize the fields of this object from in.
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException - if any problem occurs while reading.
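The write/readFields pair forms a round trip: whatever write emits to a DataOutput, readFields must consume from a DataInput in the same order. The sketch below illustrates that contract with JDK classes only. SimpleSplit is a hypothetical stand-in, not Hadoop's FileSplit, and its field order and encoding (writeUTF followed by two writeLong calls) are assumptions made for the example, not Hadoop's actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableRoundTripSketch {

    /** Hypothetical stand-in with the same write/readFields shape as a Writable. */
    public static final class SimpleSplit {
        public String path = "";
        public long start;
        public long length;

        /** No-argument constructor so an empty instance can be filled by readFields. */
        public SimpleSplit() {
        }

        public SimpleSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        /** Serializes the fields in a fixed order (assumed format, for illustration). */
        public void write(DataOutput out) throws IOException {
            out.writeUTF(path);
            out.writeLong(start);
            out.writeLong(length);
        }

        /** Reads the fields back in the same order, overwriting this object's state. */
        public void readFields(DataInput in) throws IOException {
            path = in.readUTF();
            start = in.readLong();
            length = in.readLong();
        }
    }

    /** Serializes the given split to bytes and deserializes it into a fresh instance. */
    public static SimpleSplit roundTrip(SimpleSplit original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            SimpleSplit copy = new SimpleSplit();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleSplit copy = roundTrip(new SimpleSplit("/data/part-00000", 128L, 64L));
        System.out.println(copy.path + ":" + copy.start + "+" + copy.length);
    }
}
```

Because readFields overwrites the fields of an existing object, one instance can be reused across many reads, which is the storage re-use the description above recommends.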
-
getLocations
Description copied from class: InputSplit
Get the list of nodes by name where the data for the split would be local. The locations do not need to be serialized.
Specified by:
getLocations in interface InputSplit
Specified by:
getLocations in class InputSplit
Returns:
a new array of the node names.
Throws:
IOException
-
getLocationInfo
Description copied from class: InputSplit
Gets info about which nodes the input split is stored on and how it is stored at each location.
Specified by:
getLocationInfo in interface InputSplitWithLocationInfo
Overrides:
getLocationInfo in class InputSplit
Returns:
list of SplitLocationInfos describing how the split data is stored at each location. A null value indicates that all the locations have the data stored on disk.
Throws:
IOException
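Because a null return value carries meaning here (every location holds the data on disk), callers need to handle it explicitly rather than treat it as an error. The sketch below shows one way a caller might normalize the two cases. LocationInfo and inMemoryByHost are hypothetical names used only for illustration; LocationInfo is a simplified stand-in and does not reproduce Hadoop's SplitLocationInfo API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LocationInfoSketch {

    /** Hypothetical stand-in: a host name plus whether the data is held in memory there. */
    public static final class LocationInfo {
        public final String host;
        public final boolean inMemory;

        public LocationInfo(String host, boolean inMemory) {
            this.host = host;
            this.inMemory = inMemory;
        }
    }

    /**
     * Maps each host to true if the split data is in memory there, false if on disk.
     * A null info array is interpreted as "all locations store the data on disk".
     */
    public static Map<String, Boolean> inMemoryByHost(String[] locations, LocationInfo[] info) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        if (info == null) {
            // No per-location detail: fall back to the plain host list, all on disk.
            if (locations != null) {
                for (String host : locations) {
                    result.put(host, false);
                }
            }
            return result;
        }
        for (LocationInfo entry : info) {
            result.put(entry.host, entry.inMemory);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] hosts = {"node1", "node2"};
        System.out.println(inMemoryByHost(hosts, null));
        LocationInfo[] info = {
            new LocationInfo("node1", true),
            new LocationInfo("node2", false)
        };
        System.out.println(inMemoryByHost(hosts, info));
    }
}
```

A scheduler-like caller could then prefer hosts mapped to true, on the assumption that reading from memory is cheaper than reading from disk.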
-