Package org.apache.hadoop.mapreduce.lib.input


package org.apache.hadoop.mapreduce.lib.input
  • Class
    Description
    An abstract InputFormat that returns CombineFileSplits from the InputFormat.getSplits(JobContext) method.
    A generic RecordReader that can hand out different recordReaders for each chunk in a CombineFileSplit.
    A wrapper class for a record reader that handles a single file split.
    A sub-collection of input files.
    Input format that is a CombineFileInputFormat-equivalent for SequenceFileInputFormat.
    Input format that is a CombineFileInputFormat-equivalent for TextInputFormat.
    org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader
    Line reader for compressed splits. Reading records from a compressed split is tricky because LineRecordReader uses the reported compressed input stream position directly to determine when a split has ended.
    org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat<K,V>
    An InputFormat that delegates behavior of paths to multiple other InputFormats.
    org.apache.hadoop.mapreduce.lib.input.DelegatingMapper<K1,V1,K2,V2>
    A Mapper that delegates behavior of paths to multiple other mappers.
    org.apache.hadoop.mapreduce.lib.input.DelegatingRecordReader<K,V>
    A delegating RecordReader that passes all functionality to the underlying record reader in TaggedInputSplit.
    A base class for file-based InputFormats.
    Deprecated.
     
    A section of an input file.
    FixedLengthInputFormat is an input format used to read input files that contain fixed-length records.
    org.apache.hadoop.mapreduce.lib.input.FixedLengthRecordReader
    A reader for fixed-length records in a split.
    This class wraps a list of problems with the input, so that the user can get a list of problems together instead of finding and fixing them one by one.
    This class treats a line in the input as a key/value pair separated by a separator character.
    An InputFormat for plain text files.
    org.apache.hadoop.mapreduce.lib.input.LineRecordReader
    Treats each key as the offset in the file and each value as the line.
    This class supports MapReduce jobs that have multiple input paths with a different InputFormat and Mapper for each path.
    NLineInputFormat splits the input so that every N lines form one split.
    An InputFormat that reads keys and values from SequenceFiles in binary (raw) format.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileAsBinaryInputFormat.SequenceFileAsBinaryRecordReader
    Read records from a SequenceFile as binary (raw) bytes.
    This class is similar to SequenceFileInputFormat, except that it generates a SequenceFileAsTextRecordReader, which converts the input keys and values to their String forms by calling their toString() method.
    This class converts the input keys and values to their String forms by calling their toString() method.
    A class that allows a map/reduce job to work on a sample of sequence files.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.Filter
    Filter interface.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.FilterBase
    Base class for Filters.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.MD5Filter
    This class returns a set of records by examining the MD5 digest of each key against a filtering frequency f.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.PercentFilter
    This class returns a percentage of records. The percentage is determined by a filtering frequency f, using the criterion record# % f == 0.
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.RegexFilter
    Filters records by matching the key against a regex.
    org.apache.hadoop.mapreduce.lib.input.SplitLineReader
     
    An InputFormat for plain text files.
    org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader
    SplitLineReader for uncompressed files.
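
Several of the descriptions above state a record-level rule in a single sentence: a line treated as a key/value pair split at a separator character, a filter that keeps records satisfying record# % f == 0, and an input format that groups every N lines into one split. The following is a minimal plain-Java sketch of those three rules. It is not Hadoop code and uses no Hadoop classes. The class name InputSketches and its method names are hypothetical, and the sketch assumes a tab separator, zero-based record numbers, and that a line without a separator becomes the key with an empty value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketches only; hypothetical names, not part of Hadoop. */
final class InputSketches {

    /**
     * Splits a line into {key, value} at the first separator character.
     * Assumption: with no separator, the whole line is the key and the value is empty.
     */
    static String[] splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] {line, ""};
        }
        return new String[] {line.substring(0, pos), line.substring(pos + 1)};
    }

    /**
     * Keeps the records whose record number n satisfies n % f == 0.
     * Assumption: record numbers start at 0.
     */
    static <T> List<T> percentFilter(List<T> records, int f) {
        if (f <= 0) {
            throw new IllegalArgumentException("f must be positive");
        }
        List<T> kept = new ArrayList<>();
        for (int n = 0; n < records.size(); n++) {
            if (n % f == 0) {
                kept.add(records.get(n));
            }
        }
        return kept;
    }

    /** Groups lines into consecutive chunks of at most n lines, one chunk per split. */
    static List<List<String>> nLineSplits(List<String> lines, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += n) {
            int end = Math.min(start + n, lines.size());
            splits.add(new ArrayList<>(lines.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        String[] kv = splitKeyValue("user42\tlogged in", '\t');
        System.out.println(kv[0] + " -> " + kv[1]);
        System.out.println(percentFilter(Arrays.asList("r0", "r1", "r2", "r3", "r4"), 2));
        System.out.println(nLineSplits(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```

With f = 2 the filter keeps roughly half of the records, and with f = 10 roughly a tenth, so a larger frequency means a smaller sample. In the chunking sketch the last split may hold fewer than n lines when the line count is not a multiple of n.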