public static class SequenceFile.Sorter extends Object
For best performance, applications should make sure that the Writable.readFields(DataInput) implementation of their keys is very efficient. In particular, it should avoid allocating memory.
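The advice above about readFields can be illustrated with a minimal sketch. The class below is hypothetical: it mirrors the write/readFields method shape of Writable but deliberately does not depend on Hadoop, so it runs with the JDK alone. It reuses one backing buffer across calls and allocates only when a record is larger than anything seen so far.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

/**
 * Hypothetical key type (not part of Hadoop). It mirrors the
 * write/readFields shape of Writable without implementing the interface.
 */
final class ReusableBytesKey {
    private byte[] buffer = new byte[16]; // reused across readFields calls
    private int length;                   // number of valid bytes in buffer

    /** Copies src into the reusable buffer. */
    void set(byte[] src) {
        ensureCapacity(src.length);
        System.arraycopy(src, 0, buffer, 0, src.length);
        length = src.length;
    }

    /** Serializes the key as a 4-byte length followed by the bytes. */
    void write(DataOutput out) throws IOException {
        out.writeInt(length);
        out.write(buffer, 0, length);
    }

    /** Overwrites this key in place; allocates only if the buffer must grow. */
    void readFields(DataInput in) throws IOException {
        int newLength = in.readInt();
        ensureCapacity(newLength);
        in.readFully(buffer, 0, newLength);
        length = newLength;
    }

    int getLength() {
        return length;
    }

    /** Returns the backing array; only the first getLength() bytes are valid. */
    byte[] getBuffer() {
        return buffer;
    }

    private void ensureCapacity(int needed) {
        if (needed > buffer.length) {
            buffer = new byte[Math.max(needed, buffer.length * 2)];
        }
    }
}
```

The design choice is that a single key instance is filled repeatedly, so a long run of reads produces no garbage as long as records fit in the existing buffer.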
Modifier and Type | Class and Description |
---|---|
static interface | SequenceFile.Sorter.RawKeyValueIterator: The interface to iterate over raw keys/values of SequenceFiles. |
class | SequenceFile.Sorter.SegmentDescriptor: This class defines a merge segment. |
Constructor and Description |
---|
SequenceFile.Sorter(FileSystem fs, Class<? extends WritableComparable> keyClass, Class valClass, Configuration conf): Sort and merge files containing the named classes. |
SequenceFile.Sorter(FileSystem fs, RawComparator comparator, Class keyClass, Class valClass, Configuration conf): Sort and merge using an arbitrary RawComparator. |
SequenceFile.Sorter(FileSystem fs, RawComparator comparator, Class keyClass, Class valClass, Configuration conf, SequenceFile.Metadata metadata): Sort and merge using an arbitrary RawComparator. |
Modifier and Type | Method and Description |
---|---|
SequenceFile.Writer | cloneFileAttributes(Path inputFile, Path outputFile, Progressable prog): Clones the attributes (like compression) of the input file and creates a corresponding Writer. |
int | getFactor(): Get the number of streams to merge at once. |
int | getMemory(): Get the total amount of buffer memory, in bytes. |
SequenceFile.Sorter.RawKeyValueIterator | merge(List<SequenceFile.Sorter.SegmentDescriptor> segments, Path tmpDir): Merges the list of segments of type SegmentDescriptor. |
SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, boolean deleteInputs, int factor, Path tmpDir): Merges the contents of files passed in Path[]. |
SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, boolean deleteInputs, Path tmpDir): Merges the contents of files passed in Path[] using a max factor value that is already set. |
void | merge(Path[] inFiles, Path outFile): Merge the provided files. |
SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, Path tempDir, boolean deleteInputs): Merges the contents of files passed in Path[]. |
void | setFactor(int factor): Set the number of streams to merge at once. |
void | setMemory(int memory): Set the total amount of buffer memory, in bytes. |
void | setProgressable(Progressable progressable): Set the progressable object in order to report progress. |
void | sort(Path[] inFiles, Path outFile, boolean deleteInput): Perform a file sort from a set of input files into an output file. |
void | sort(Path inFile, Path outFile): The backwards compatible interface to sort. |
SequenceFile.Sorter.RawKeyValueIterator | sortAndIterate(Path[] inFiles, Path tempDir, boolean deleteInput): Perform a file sort from a set of input files and return an iterator. |
void | writeFile(SequenceFile.Sorter.RawKeyValueIterator records, SequenceFile.Writer writer): Writes records from RawKeyValueIterator into a file represented by the passed writer. |
public SequenceFile.Sorter(FileSystem fs, Class<? extends WritableComparable> keyClass, Class valClass, Configuration conf)
public SequenceFile.Sorter(FileSystem fs, RawComparator comparator, Class keyClass, Class valClass, Configuration conf)
Sort and merge using an arbitrary RawComparator.
public SequenceFile.Sorter(FileSystem fs, RawComparator comparator, Class keyClass, Class valClass, Configuration conf, SequenceFile.Metadata metadata)
Sort and merge using an arbitrary RawComparator.
public void setFactor(int factor)
public int getFactor()
public void setMemory(int memory)
public int getMemory()
public void setProgressable(Progressable progressable)
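Two of the constructors above accept a RawComparator, which orders keys by looking at their serialized bytes rather than at deserialized objects. The following is a minimal standalone sketch of that idea, assuming keys should sort as unsigned bytes in lexicographic order. The class name RawBytesOrder is hypothetical, and it does not implement the Hadoop interface so that it compiles with the JDK alone; only the six-argument method shape is borrowed from RawComparator.compare.

```java
/** Hypothetical standalone comparator (not part of Hadoop). */
final class RawBytesOrder {
    private RawBytesOrder() {}

    /**
     * Compares two serialized keys as unsigned bytes, lexicographically,
     * without deserializing either one. Each key is described by its
     * backing array, start offset, and length.
     */
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff; // treat bytes as unsigned
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2; // a strict prefix sorts first
    }
}
```

Whether a byte-wise order is correct depends on the key's serialized form; it suits keys whose encoding preserves their natural order.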
public void sort(Path[] inFiles, Path outFile, boolean deleteInput) throws IOException
Parameters:
inFiles - the files to be sorted
outFile - the sorted output file
deleteInput - should the input files be deleted as they are read?
Throws:
IOException
public SequenceFile.Sorter.RawKeyValueIterator sortAndIterate(Path[] inFiles, Path tempDir, boolean deleteInput) throws IOException
Parameters:
inFiles - the files to be sorted
tempDir - the directory where temp files are created during sort
deleteInput - should the input files be deleted as they are read?
Throws:
IOException
public void sort(Path inFile, Path outFile) throws IOException
Parameters:
inFile - the input file to sort
outFile - the sorted output file
Throws:
IOException
public SequenceFile.Sorter.RawKeyValueIterator merge(List<SequenceFile.Sorter.SegmentDescriptor> segments, Path tmpDir) throws IOException
Merges the list of segments of type SegmentDescriptor.
Parameters:
segments - the list of SegmentDescriptors
tmpDir - the directory to write temporary files into
Throws:
IOException
public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, boolean deleteInputs, Path tmpDir) throws IOException
Parameters:
inNames - the array of path names
deleteInputs - true if the input files should be deleted when unnecessary
tmpDir - the directory to write temporary files into
Throws:
IOException
public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, boolean deleteInputs, int factor, Path tmpDir) throws IOException
Parameters:
inNames - the array of path names
deleteInputs - true if the input files should be deleted when unnecessary
factor - the factor that will be used as the maximum merge fan-in
tmpDir - the directory to write temporary files into
Throws:
IOException
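The factor parameter above caps how many sorted inputs are combined in one merge step. The sketch below shows one common way such a bounded fan-in merge can work; it is an illustration under stated assumptions, not a description of how Sorter is implemented. In-memory lists of integers stand in for sorted segments, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical in-memory stand-in for a multi-pass merge with bounded fan-in. */
final class BoundedFanInMerge {
    private BoundedFanInMerge() {}

    /** Merges sorted runs, combining at most {@code factor} runs per step. */
    static List<Integer> merge(List<List<Integer>> runs, int factor) {
        if (factor < 2) {
            throw new IllegalArgumentException("factor must be at least 2");
        }
        List<List<Integer>> current = new ArrayList<>(runs);
        while (current.size() > 1) {
            List<List<Integer>> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += factor) {
                int end = Math.min(i + factor, current.size());
                next.add(mergeOnce(current.subList(i, end)));
            }
            current = next; // each pass leaves roughly 1/factor as many runs
        }
        return current.isEmpty() ? new ArrayList<>() : current.get(0);
    }

    /** Single k-way merge: a heap holds the current head of each open run. */
    private static List<Integer> mergeOnce(List<List<Integer>> group) {
        PriorityQueue<Head> heap =
                new PriorityQueue<>(Comparator.comparingInt((Head h) -> h.value));
        for (List<Integer> run : group) {
            Iterator<Integer> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head h = heap.poll();
            out.add(h.value);
            if (h.rest.hasNext()) {
                heap.add(new Head(h.rest.next(), h.rest));
            }
        }
        return out;
    }

    /** Smallest unread value of a run, plus the iterator over the remainder. */
    private static final class Head {
        final int value;
        final Iterator<Integer> rest;

        Head(int value, Iterator<Integer> rest) {
            this.value = value;
            this.rest = rest;
        }
    }
}
```

A larger factor means fewer passes over the data but more inputs open at once; the merged result is the same either way.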
public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, Path tempDir, boolean deleteInputs) throws IOException
Parameters:
inNames - the array of path names
tempDir - the directory for creating temp files during merge
deleteInputs - true if the input files should be deleted when unnecessary
Throws:
IOException
public SequenceFile.Writer cloneFileAttributes(Path inputFile, Path outputFile, Progressable prog) throws IOException
Parameters:
inputFile - the path of the input file whose attributes should be cloned
outputFile - the path of the output file
prog - the Progressable to report status during the file write
Throws:
IOException
public void writeFile(SequenceFile.Sorter.RawKeyValueIterator records, SequenceFile.Writer writer) throws IOException
Parameters:
records - the RawKeyValueIterator
writer - the Writer created earlier
Throws:
IOException
public void merge(Path[] inFiles, Path outFile) throws IOException
Parameters:
inFiles - the array of input path names
outFile - the final output file
Throws:
IOException
Copyright © 2017 Apache Software Foundation. All Rights Reserved.