KeyValueTextInputFormat (Hadoop 1.2.1 API)

org.apache.hadoop.mapreduce.lib.input
Class KeyValueTextInputFormat

java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<K,V>
      org.apache.hadoop.mapreduce.lib.input.FileInputFormat<Text,Text>
          org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat

@InterfaceAudience.Public @InterfaceStability.Stable public class KeyValueTextInputFormat
extends FileInputFormat<Text,Text>

An InputFormat for plain text files. Files are broken into lines, with either a line feed or a carriage return marking the end of a line. Each line is divided into a key part and a value part by a separator byte. If the line contains no separator byte, the key is the entire line and the value is empty.
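The key/value division described above can be illustrated with a small standalone sketch. This is not Hadoop's code: the class `KeyValueSplitSketch` and the method `splitLine` are hypothetical names, the tab separator is an assumption made for the example, and splitting at the first occurrence of the separator is likewise an assumption about how a line with several separators is handled.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; it does not use or reproduce Hadoop classes.
public class KeyValueSplitSketch {

    /**
     * Splits one line into {key, value} at the first occurrence of the
     * separator byte. If the separator is absent, the key is the whole
     * line and the value is the empty string.
     */
    public static String[] splitLine(byte[] line, byte separator) {
        int pos = -1;
        for (int i = 0; i < line.length; i++) {
            if (line[i] == separator) {
                pos = i;
                break;
            }
        }
        if (pos < 0) {
            // No separator: entire line becomes the key, value stays empty.
            return new String[] { new String(line, StandardCharsets.UTF_8), "" };
        }
        String key = new String(line, 0, pos, StandardCharsets.UTF_8);
        String value = new String(line, pos + 1, line.length - pos - 1, StandardCharsets.UTF_8);
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        byte tab = (byte) '\t'; // assumed separator for this example
        String[] samples = { "user42\tlogin\tok", "no-separator-here" };
        for (String s : samples) {
            String[] kv = splitLine(s.getBytes(StandardCharsets.UTF_8), tab);
            System.out.println("key=[" + kv[0] + "] value=[" + kv[1] + "]");
        }
    }
}
```

In this sketch, any separator bytes after the first one stay inside the value, so the key is always the shortest prefix that precedes a separator.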

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
`FileInputFormat.Counter`

Constructor Summary
`KeyValueTextInputFormat()`

Method Summary
`RecordReader<Text,Text>`	`createRecordReader(InputSplit genericSplit, TaskAttemptContext context)` Create a record reader for a given split.
`protected boolean`	`isSplitable(JobContext context, Path file)` Is the given file splitable? Usually true, but not if the file is stream compressed.

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
`addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

KeyValueTextInputFormat

public KeyValueTextInputFormat()

Method Detail

isSplitable

protected boolean isSplitable(JobContext context,
                              Path file)

Description copied from class: FileInputFormat

Is the given file splitable? Usually true, but not if the file is stream compressed. FileInputFormat implementations can override this method and return false to ensure that individual input files are never split up, so that each Mapper processes an entire file.

Overrides: isSplitable in class FileInputFormat<Text,Text>

Parameters: context - the job context; file - the file name to check
Returns: true if this file is splitable, false otherwise
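To show what a false return value means in practice, here is a deliberately simplified model of split planning. It is a sketch under stated assumptions, not the logic of FileInputFormat.getSplits: the class `SplitPlanSketch`, the method `planSplits`, and the fixed `splitSize` parameter are all hypothetical, and block locations and other details of the real split computation are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; not the FileInputFormat.getSplits algorithm.
public class SplitPlanSketch {

    /**
     * Returns a list of {offset, length} pairs covering a file.
     * A non-splitable file always yields exactly one pair spanning the
     * whole file, so a single mapper would read it from start to end.
     */
    public static List<long[]> planSplits(long fileLength, long splitSize, boolean splitable) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<long[]> splits = new ArrayList<>();
        if (!splitable || fileLength == 0) {
            splits.add(new long[] { 0L, fileLength });
            return splits;
        }
        // Splitable: cut the file into consecutive chunks of at most splitSize bytes.
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            long length = Math.min(splitSize, fileLength - offset);
            splits.add(new long[] { offset, length });
        }
        return splits;
    }

    public static void main(String[] args) {
        for (boolean splitable : new boolean[] { true, false }) {
            List<long[]> plan = planSplits(250L, 100L, splitable);
            System.out.println("splitable=" + splitable + " -> " + plan.size() + " split(s)");
            for (long[] s : plan) {
                System.out.println("  offset=" + s[0] + " length=" + s[1]);
            }
        }
    }
}
```

Under this model, a stream-compressed file is treated like the non-splitable case: it is read as one unit regardless of its size.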

createRecordReader

public RecordReader<Text,Text> createRecordReader(InputSplit genericSplit,
                                                  TaskAttemptContext context)
                                           throws IOException

Description copied from class: InputFormat

Create a record reader for a given split. The framework will call RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.

Specified by: createRecordReader in class InputFormat<Text,Text>

Parameters: genericSplit - the split to be read; context - the information about the task
Returns: a new record reader
Throws: IOException