Class CombineFileSplit
java.lang.Object
org.apache.hadoop.mapreduce.InputSplit
org.apache.hadoop.mapreduce.lib.input.CombineFileSplit
- All Implemented Interfaces:
Writable
- Direct Known Subclasses:
CombineFileSplit
A sub-collection of input files.
Unlike FileSplit, the CombineFileSplit class does not represent a split of a single file, but a split of the input files into smaller sets. A split may contain blocks from different files, but all the blocks in the same split are probably local to some rack.
CombineFileSplit can be used to implement RecordReaders that read one record per file.
- See Also:
-
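To make the "set of files" model concrete, the following is a minimal sketch of how a reader might walk the parallel path, offset, and length arrays that the accessors on this page expose. `FileChunkSketch`, its fields, and `describeChunks` are hypothetical stand-ins written with plain JDK types so that the example runs without Hadoop on the classpath; with the real class one would presumably call `getNumPaths()`, `getPath(i)`, `getOffset(i)`, and `getLength(i)` instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a combined split: three parallel arrays,
// one entry per file chunk. This is NOT the Hadoop class itself.
class FileChunkSketch {
    final String[] paths;   // stands in for Path[]
    final long[] offsets;   // start offset of each chunk
    final long[] lengths;   // length of each chunk in bytes

    FileChunkSketch(String[] paths, long[] offsets, long[] lengths) {
        if (paths.length != offsets.length || paths.length != lengths.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        this.paths = paths;
        this.offsets = offsets;
        this.lengths = lengths;
    }

    int getNumPaths() {
        return paths.length;
    }

    // Visits the chunks the way a one-record-per-file reader might:
    // index i selects the ith path, offset, and length together.
    List<String> describeChunks() {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < getNumPaths(); i++) {
            long end = offsets[i] + lengths[i];
            out.add(paths[i] + " [" + offsets[i] + ", " + end + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        FileChunkSketch split = new FileChunkSketch(
                new String[] {"/data/a.log", "/data/b.log"},
                new long[] {0L, 128L},
                new long[] {64L, 32L});
        for (String line : split.describeChunks()) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is only that the three arrays are indexed together; the paths and byte ranges shown are made-up sample values.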
Constructor Summary
Constructors
Constructor | Description
CombineFileSplit() | default constructor
CombineFileSplit(Path[] files, long[] lengths) |
CombineFileSplit(Path[] files, long[] start, long[] lengths, String[] locations) |
CombineFileSplit(CombineFileSplit old) | Copy constructor
-
Method Summary
Modifier and Type | Method | Description
long | getLength() | Get the size of the split, so that the input splits can be sorted by size.
long | getLength(int i) | Returns the length of the ith Path
long[] | getLengths() | Returns an array containing the lengths of the files in the split
String[] | getLocations() | Returns all the Paths where this input-split resides
int | getNumPaths() | Returns the number of Paths in the split
long | getOffset(int i) | Returns the start offset of the ith Path
Path | getPath(int i) | Returns the ith Path
Path[] | getPaths() | Returns all the Paths in the split
long[] | getStartOffsets() | Returns an array containing the start offsets of the files in the split
void | readFields(DataInput in) | Deserialize the fields of this object from in.
String | toString() |
void | write(DataOutput out) | Serialize the fields of this object to out.
Methods inherited from class org.apache.hadoop.mapreduce.InputSplit
getLocationInfo
-
Constructor Details
-
CombineFileSplit
public CombineFileSplit()
default constructor
-
CombineFileSplit
-
CombineFileSplit
-
CombineFileSplit
Copy constructor
- Throws:
IOException
-
-
Method Details
-
getLength
public long getLength()
Description copied from class: InputSplit
Get the size of the split, so that the input splits can be sorted by size.
- Specified by:
getLength in class InputSplit
- Returns:
the number of bytes in the split
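As an illustration of why a total size is useful for ordering, the sketch below sums per-file lengths into one split size and sorts splits largest first. It assumes the split size equals the sum of the per-file lengths; `SplitSizeSketch`, `totalLength`, and `sortLargestFirst` are hypothetical helper names, and each split is represented only by a plain `long[]` of lengths rather than by a Hadoop object.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Hadoop: models each split as the
// array of its per-file lengths.
class SplitSizeSketch {

    // Assumption: the size of a split is the sum of its per-file lengths.
    static long totalLength(long[] lengths) {
        long total = 0L;
        for (long len : lengths) {
            total += len;
        }
        return total;
    }

    // Returns a copy of the input ordered by total size, largest first.
    // The input array itself is left untouched.
    static long[][] sortLargestFirst(long[][] splits) {
        long[][] copy = Arrays.copyOf(splits, splits.length);
        Arrays.sort(copy, (a, b) -> Long.compare(totalLength(b), totalLength(a)));
        return copy;
    }

    public static void main(String[] args) {
        long[][] splits = {{10L}, {40L, 5L}, {20L}};
        for (long[] split : sortLargestFirst(splits)) {
            System.out.println(totalLength(split) + " bytes in " + split.length + " file(s)");
        }
    }
}
```

Scheduling the largest splits first is one common reason to sort by size; whether a given framework version does so is outside what this page states.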
-
getStartOffsets
public long[] getStartOffsets()
Returns an array containing the start offsets of the files in the split
-
getLengths
public long[] getLengths()
Returns an array containing the lengths of the files in the split
-
getOffset
public long getOffset(int i)
Returns the start offset of the ith Path
-
getLength
public long getLength(int i)
Returns the length of the ith Path
-
getNumPaths
public int getNumPaths()
Returns the number of Paths in the split
-
getPath
public Path getPath(int i)
Returns the ith Path
-
getPaths
public Path[] getPaths()
Returns all the Paths in the split
-
getLocations
public String[] getLocations() throws IOException
Returns all the Paths where this input-split resides
- Specified by:
getLocations in class InputSplit
- Returns:
a new array of the node names.
- Throws:
IOException
-
readFields
public void readFields(DataInput in) throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
- Specified by:
readFields in interface Writable
- Parameters:
in - DataInput to deserialize this object from.
- Throws:
IOException - any other problem for readFields.
-
write
public void write(DataOutput out) throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.
- Specified by:
write in interface Writable
- Parameters:
out - DataOutput to serialize this object into.
- Throws:
IOException - any other problem for write.
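The write/readFields pair follows the usual serialize-then-repopulate pattern, which the sketch below illustrates with the JDK's own `DataOutput` and `DataInput`. `WritableRoundTripSketch` is a hypothetical stand-in, and its byte layout (a count followed by path, offset, and length triples, with paths written via `writeUTF`) is an assumption chosen for clarity; it is not Hadoop's actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in, not the Hadoop class: same write/readFields shape,
// illustrative byte layout.
class WritableRoundTripSketch {
    String[] paths = new String[0];
    long[] offsets = new long[0];
    long[] lengths = new long[0];

    // Writes a count, then one (path, offset, length) triple per file.
    void write(DataOutput out) throws IOException {
        out.writeInt(paths.length);
        for (int i = 0; i < paths.length; i++) {
            out.writeUTF(paths[i]);
            out.writeLong(offsets[i]);
            out.writeLong(lengths[i]);
        }
    }

    // Overwrites this object's fields with what write() produced.
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        paths = new String[n];
        offsets = new long[n];
        lengths = new long[n];
        for (int i = 0; i < n; i++) {
            paths[i] = in.readUTF();
            offsets[i] = in.readLong();
            lengths[i] = in.readLong();
        }
    }

    // Serializes to an in-memory buffer and reads back into a fresh,
    // empty instance created with the no-argument constructor.
    static WritableRoundTripSketch roundTrip(WritableRoundTripSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            WritableRoundTripSketch copy = new WritableRoundTripSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableRoundTripSketch split = new WritableRoundTripSketch();
        split.paths = new String[] {"/data/a.log", "/data/b.log"};
        split.offsets = new long[] {0L, 128L};
        split.lengths = new long[] {64L, 32L};
        WritableRoundTripSketch copy = roundTrip(split);
        System.out.println(copy.paths.length + " paths restored, first = " + copy.paths[0]);
    }
}
```

Reading into an empty instance is the reason this pattern needs a no-argument constructor: the receiving side creates the object first and fills it from the stream afterwards.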
-
toString
-