Package org.apache.hadoop.mapreduce.lib.input
Class
    Description

CombineFileInputFormat<K,V>
    An abstract InputFormat that returns CombineFileSplits in the InputFormat.getSplits(JobContext) method.
CombineFileRecordReader<K,V>
    A generic RecordReader that can hand out different RecordReaders for each chunk in a CombineFileSplit.
CombineFileRecordReaderWrapper<K,V>
    A wrapper class for a record reader that handles a single file split.
CombineFileSplit
    A sub-collection of input files.
CombineSequenceFileInputFormat<K,V>
    Input format that is a CombineFileInputFormat-equivalent for SequenceFileInputFormat.
CombineTextInputFormat
    Input format that is a CombineFileInputFormat-equivalent for TextInputFormat.
org.apache.hadoop.mapreduce.lib.input.CompressedSplitLineReader
    Line reader for compressed splits. Reading records from a compressed split is tricky, as the LineRecordReader uses the reported compressed input stream position directly to determine when a split has ended.
org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat<K,V>
    An InputFormat that delegates behavior of paths to multiple other InputFormats.
org.apache.hadoop.mapreduce.lib.input.DelegatingMapper<K1,V1,K2,V2>
    A Mapper that delegates behavior of paths to multiple other mappers.
org.apache.hadoop.mapreduce.lib.input.DelegatingRecordReader<K,V>
    A delegating RecordReader, which delegates the functionality to the underlying record reader in TaggedInputSplit.
FileInputFormat<K,V>
    A base class for file-based InputFormats.
FileInputFormatCounter
    Deprecated.
FileSplit
    A section of an input file.
FixedLengthInputFormat
    FixedLengthInputFormat is an input format used to read input files which contain fixed length records.
org.apache.hadoop.mapreduce.lib.input.FixedLengthRecordReader
    A reader to read fixed length records from a split.
InvalidInputException
    This class wraps a list of problems with the input, so that the user can get a list of problems together instead of finding and fixing them one by one.
KeyValueLineRecordReader
    This class treats a line in the input as a key/value pair separated by a separator character.
KeyValueTextInputFormat
    An InputFormat for plain text files.
org.apache.hadoop.mapreduce.lib.input.LineRecordReader
    Treats keys as offset in file and value as line.
MultipleInputs
    This class supports MapReduce jobs that have multiple input paths with a different InputFormat and Mapper for each path.
NLineInputFormat
    NLineInputFormat which splits N lines of input as one split.
SequenceFileAsBinaryInputFormat
    InputFormat reading keys and values from SequenceFiles in binary (raw) format.
org.apache.hadoop.mapreduce.lib.input.SequenceFileAsBinaryInputFormat.SequenceFileAsBinaryRecordReader
    Reads records from a SequenceFile as binary (raw) bytes.
SequenceFileAsTextInputFormat
    This class is similar to SequenceFileInputFormat, except it generates SequenceFileAsTextRecordReader, which converts the input keys and values to their String forms by calling the toString() method.
SequenceFileAsTextRecordReader
    This class converts the input keys and values to their String forms by calling the toString() method.
SequenceFileInputFilter<K,V>
    A class that allows a map/reduce job to work on a sample of sequence files.
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.Filter
    Filter interface.
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.FilterBase
    Base class for Filters.
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.MD5Filter
    This class returns a set of records by examining the MD5 digest of its key against a filtering frequency f.
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.PercentFilter
    This class returns a percentage of records. The percentage is determined by a filtering frequency f using the criterion record# % f == 0.
org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFilter.RegexFilter
    Filters records by matching the key against a regex.
SequenceFileInputFormat<K,V>
    An InputFormat for SequenceFiles.
SequenceFileRecordReader<K,V>
    A RecordReader for SequenceFiles.
org.apache.hadoop.mapreduce.lib.input.SplitLineReader
TextInputFormat
    An InputFormat for plain text files.
org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader
    SplitLineReader for uncompressed files.
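
Two of the descriptions above state a concrete rule: the PercentFilter criterion `record# % f == 0`, and the treatment of a line as a key/value pair separated by a separator character. The following is a minimal plain-Java sketch of those two rules, not the Hadoop implementation. The class name `InputRuleSketch` and both method names are hypothetical, and it assumes zero-based record numbering, a split at the first occurrence of the separator, and an empty value when the separator is absent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; it does not depend on or reproduce Hadoop code.
public class InputRuleSketch {

    // Applies the criterion "record# % f == 0".
    // Assumption: record numbers start at 0, so record 0 is always kept.
    public static boolean percentAccept(long recordNumber, int frequency) {
        if (frequency <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        return recordNumber % frequency == 0;
    }

    // Splits a line into {key, value} at the first separator character.
    // Assumption: with no separator, the whole line is the key and the value is empty.
    public static String[] splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        // Keep every 4th record out of records 0..9.
        List<Long> kept = new ArrayList<>();
        for (long i = 0; i < 10; i++) {
            if (percentAccept(i, 4)) {
                kept.add(i);
            }
        }
        System.out.println(kept); // prints [0, 4, 8]

        // Tab-separated line: key before the first tab, value after it.
        String[] kv = splitKeyValue("user42\tclicked", '\t');
        System.out.println(kv[0] + " -> " + kv[1]); // prints user42 -> clicked
    }
}
```

With a filtering frequency of 4, roughly one record in four passes the criterion, which is why the filter is described as returning a percentage of records.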