NLineInputFormat (Apache Hadoop Main 2.5.2 API)

org.apache.hadoop.mapred.lib
Class NLineInputFormat

java.lang.Object
  org.apache.hadoop.mapred.FileInputFormat<LongWritable,Text>
      org.apache.hadoop.mapred.lib.NLineInputFormat

All Implemented Interfaces: InputFormat<LongWritable,Text>, JobConfigurable

@InterfaceAudience.Public
@InterfaceStability.Stable
public class NLineInputFormat
extends FileInputFormat<LongWritable,Text>
implements JobConfigurable

NLineInputFormat splits the input so that every N lines form one split. In many "pleasantly" parallel applications, each process/mapper processes the same input file(s), but the computations are controlled by different parameters (referred to as "parameter sweeps"). One way to achieve this is to specify a set of parameters (one set per line) as input in a control file. The control file is the input path to the map-reduce application, whereas the input dataset is specified via a config variable in JobConf. NLineInputFormat can be used in such applications: it splits the input file so that, by default, one line is fed as a value to one map task, and the key is the offset of that line, i.e. (k,v) is (LongWritable, Text). The location hints will span the whole mapred cluster.
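To make the (k,v) shape concrete, the following is a minimal plain-Java sketch of the records a parameter-sweep control file would yield with the default of one line per map task. It does not use Hadoop: the class `ControlFileRecords`, the `Record` type, and the `alpha=...` parameter lines are hypothetical names chosen for illustration, and the sketch assumes `\n`-terminated UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Hadoop API.
public class ControlFileRecords {

    /** One (key, value) record: the key is the byte offset of the line, the value is the line text. */
    public static final class Record {
        public final long offset;
        public final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Lists the (offset, line) records for '\n'-terminated control-file text. */
    public static List<Record> records(String controlFile) {
        List<Record> out = new ArrayList<>();
        if (controlFile.isEmpty()) {
            return out;
        }
        long offset = 0;
        for (String line : controlFile.split("\n")) {
            out.add(new Record(offset, line));
            // Advance past the line's bytes plus the '\n' terminator.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed control file: one parameter set per line.
        String controlFile = "alpha=0.1\nalpha=0.5\nalpha=0.9\n";
        for (Record r : records(controlFile)) {
            System.out.println("map task input: (" + r.offset + ", \"" + r.line + "\")");
        }
    }
}
```

In a real job, each such record would reach a separate map task, which would parse its parameter line and then read the shared input dataset named in the job configuration.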

Field Summary

Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
`INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES`

Constructor Summary
`NLineInputFormat()`

Method Summary
`void`	`configure(JobConf conf)` Initializes a new instance from a `JobConf`.
`protected static FileSplit`	`createFileSplit(Path fileName, long begin, long length)` NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character out of its upper split boundary.
`RecordReader<LongWritable,Text>`	`getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter)` Get the `RecordReader` for the given `InputSplit`.
`InputSplit[]`	`getSplits(JobConf job, int numSplits)` Logically splits the set of input files for the job, placing N lines of the input in each split.
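The N-lines-per-split grouping described for `getSplits` can be sketched in plain Java as follows. This is an illustration of the idea under stated assumptions, not Hadoop's implementation: `NLineSplitSketch` and `splitByLines` are hypothetical names, the input is assumed to be `\n`-terminated text held in memory, and the one-character boundary adjustment noted for `createFileSplit` is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Hadoop API.
public class NLineSplitSketch {

    /** Returns {start, length} byte ranges, each covering up to n lines of the text. */
    public static List<long[]> splitByLines(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        List<long[]> splits = new ArrayList<>();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        long begin = 0;   // byte offset where the current split starts
        long length = 0;  // bytes accumulated in the current split
        int lines = 0;    // complete lines accumulated in the current split
        for (byte b : bytes) {
            length++;
            if (b == '\n') {
                lines++;
                if (lines == n) {
                    // n lines collected: close this split and start the next one.
                    splits.add(new long[] {begin, length});
                    begin += length;
                    length = 0;
                    lines = 0;
                }
            }
        }
        if (length > 0) {
            // Remaining lines (fewer than n) form the last split.
            splits.add(new long[] {begin, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        for (long[] s : splitByLines("l1\nl2\nl3\nl4\nl5\n", 2)) {
            System.out.println("split start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With five lines and N = 2, the sketch produces three ranges, the last holding the single leftover line.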

Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
`addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail