org.apache.hadoop.mapreduce.lib.input (Apache Hadoop Main 3.5.0 API)

package org.apache.hadoop.mapreduce.lib.input

Class

Description

CombineFileInputFormat<K,V>

An abstract InputFormat that returns CombineFileSplits from the InputFormat.getSplits(JobContext) method.

CombineFileRecordReader<K,V>

A generic RecordReader that can hand out a different RecordReader for each chunk in a CombineFileSplit.

CombineFileRecordReaderWrapper<K,V>

A wrapper class for a record reader that handles a single file split.

CombineFileSplit

A sub-collection of input files.

CombineSequenceFileInputFormat<K,V>

Input format that is a CombineFileInputFormat-equivalent for SequenceFileInputFormat.

CombineTextInputFormat

Input format that is a CombineFileInputFormat-equivalent for TextInputFormat.

org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader

Line reader for compressed splits. Reading records from a compressed split is tricky, because the LineRecordReader uses the reported compressed input stream position directly to determine when a split has ended.

org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat<K,V>

An InputFormat that delegates behavior of paths to multiple other InputFormats.

org.apache.hadoop.mapreduce.lib.input.DelegatingMapper<K1,V1,K2,V2>

A Mapper that delegates behavior of paths to multiple other mappers.

org.apache.hadoop.mapreduce.lib.input.DelegatingRecordReader<K,V>

This is a delegating RecordReader, which delegates the functionality to the underlying record reader in TaggedInputSplit.

FileInputFormat<K,V>

A base class for file-based InputFormats.

FileInputFormat.Counter

Deprecated.

FileInputFormatCounter

FileSplit

A section of an input file.

FixedLengthInputFormat

FixedLengthInputFormat is an input format used to read input files that contain fixed-length records.

org.apache.hadoop.mapreduce.lib.input.FixedLengthRecordReader

A reader for fixed-length records in a split.

InvalidInputException

This class wraps a list of problems with the input, so that the user can get a list of problems together instead of finding and fixing them one by one.

KeyValueLineRecordReader

This class treats a line in the input as a key/value pair separated by a separator character.

KeyValueTextInputFormat

An InputFormat for plain text files.

org.apache.hadoop.mapreduce.lib.input.LineRecordReader

Treats the key as the offset in the file and the value as the line.

MultipleInputs

This class supports MapReduce jobs that have multiple input paths with a different InputFormat and Mapper for each path.

NLineInputFormat

An input format that splits the input so that each split contains N lines.

SequenceFileAsBinaryInputFormat

InputFormat reading keys, values from SequenceFiles in binary (raw) format.

org.apache.hadoop.mapreduce.lib.input.SequenceFileAsBinaryInputFormat.SequenceFileAsBinaryRecordReader

Read records from a SequenceFile as binary (raw) bytes.

SequenceFileAsTextInputFormat

This class is similar to SequenceFileInputFormat, except that it generates a SequenceFileAsTextRecordReader, which converts the input keys and values to their String forms by calling their toString() method.

SequenceFileAsTextRecordReader

This class converts the input keys and values to their String forms by calling their toString() method.

SequenceFileInputFilter<K,V>

A class that allows a MapReduce job to work on a sample of sequence files.

org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.Filter

Filter interface.

org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.FilterBase

Base class for Filters.

org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.MD5Filter

This class returns a set of records by examining the MD5 digest of each record's key against a filtering frequency f.

org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.PercentFilter

This class returns a percentage of records. The percentage is determined by a filtering frequency f using the criterion record# % f == 0.

org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.RegexFilter

Filters records by matching the key against a regular expression.

SequenceFileInputFormat<K,V>

An InputFormat for SequenceFiles.

SequenceFileRecordReader<K,V>

A RecordReader for SequenceFiles.

org.apache.hadoop.mapreduce.lib.input.SplitLineReader

TextInputFormat

An InputFormat for plain text files.

org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader

SplitLineReader for uncompressed files.
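
The KeyValueLineRecordReader entry above describes a line being treated as a key/value pair separated by a separator character. The following is a minimal plain-Java sketch of that idea, not Hadoop's implementation. The class name KeyValueLineSketch and the method split are hypothetical, and two behaviors are assumptions made for illustration: the line is divided at the first occurrence of the separator, and a line with no separator becomes the key with an empty value.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

/**
 * Illustrative sketch only: divides one line of text into a key and a value
 * at a separator character. This is a hypothetical helper, not a Hadoop class.
 */
final class KeyValueLineSketch {

    /**
     * Splits the line at the first occurrence of the separator (assumption).
     * If the separator is absent, the whole line is the key and the value
     * is the empty string (assumption).
     */
    static Map.Entry<String, String> split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleImmutableEntry<>(line, "");
        }
        String key = line.substring(0, pos);
        String value = line.substring(pos + 1);
        return new SimpleImmutableEntry<>(key, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> pair = split("user42\tclicked\tbutton", '\t');
        System.out.println("key   = " + pair.getKey());
        System.out.println("value = " + pair.getValue());
    }
}
```

Splitting at the first separator only means later separator characters stay inside the value, so a value may itself contain delimited fields.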

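
The SequenceFileInputFilter.PercentFilter entry above gives its selection criterion as record# % f == 0. As a rough illustration of what that criterion selects, here is a minimal plain-Java sketch that does not use Hadoop at all. The class name PercentFilterSketch and the method acceptedRecordNumbers are hypothetical, and zero-based record numbering is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch only: applies the "record# % f == 0" criterion to a
 * run of record numbers. This is a hypothetical helper, not a Hadoop class.
 */
final class PercentFilterSketch {

    /**
     * Returns the record numbers in [0, recordCount) that satisfy
     * recordNumber % frequency == 0. Numbering from zero is an assumption.
     */
    static List<Long> acceptedRecordNumbers(long recordCount, int frequency) {
        if (frequency <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        List<Long> accepted = new ArrayList<>();
        for (long recordNumber = 0; recordNumber < recordCount; recordNumber++) {
            if (recordNumber % frequency == 0) {
                accepted.add(recordNumber);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // With f = 3, every third record is kept.
        System.out.println(acceptedRecordNumbers(10, 3)); // prints [0, 3, 6, 9]
    }
}
```

Under this criterion a larger f keeps fewer records: roughly one record in every f passes the filter.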