NLineInputFormat (Hadoop 1.2.1 API)

org.apache.hadoop.mapred.lib
Class NLineInputFormat

java.lang.Object
  org.apache.hadoop.mapred.FileInputFormat<LongWritable,Text>
      org.apache.hadoop.mapred.lib.NLineInputFormat

All Implemented Interfaces: InputFormat<LongWritable,Text>, JobConfigurable

public class NLineInputFormat
extends FileInputFormat<LongWritable,Text>
implements JobConfigurable

NLineInputFormat splits the input so that every N lines form one split. In many "pleasantly" parallel applications, each process/mapper processes the same input file(s), but with computations controlled by different parameters (referred to as "parameter sweeps"). One way to achieve this is to specify a set of parameters (one set per line) as input in a control file. The control file is the input path to the map-reduce application, whereas the input dataset is specified via a config variable in JobConf. NLineInputFormat can be used in such applications: it splits the input file such that, by default, one line is fed as a value to one map task, and the key is the offset, i.e. (k,v) is (LongWritable, Text). The location hints will span the whole mapred cluster.
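The grouping of lines into splits described above can be illustrated with a small stand-alone sketch. It is not Hadoop code: the class `NLineSplitSketch` and the method `splitByLines` are hypothetical names introduced for illustration. The sketch assumes UTF-8 text with `\n` line endings, and it omits the boundary adjustment that `createFileSplit` applies for `LineRecordReader`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Hadoop API. */
class NLineSplitSketch {

    /**
     * Groups the lines of a control file into splits of at most
     * linesPerSplit lines. Each entry is {byteOffset, byteLength}.
     */
    static List<long[]> splitByLines(String content, int linesPerSplit) {
        if (linesPerSplit < 1) {
            throw new IllegalArgumentException("linesPerSplit must be >= 1");
        }
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        List<long[]> splits = new ArrayList<>();
        long begin = 0;        // byte offset where the current split starts
        int linesInSplit = 0;  // complete lines seen in the current split

        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                linesInSplit++;
                if (linesInSplit == linesPerSplit) {
                    // Close the split just after the newline.
                    splits.add(new long[] {begin, i + 1 - begin});
                    begin = i + 1;
                    linesInSplit = 0;
                }
            }
        }
        // Remaining lines (fewer than N, or a final line without a newline).
        if (begin < bytes.length) {
            splits.add(new long[] {begin, bytes.length - begin});
        }
        return splits;
    }

    public static void main(String[] args) {
        // One parameter set per line, as in a parameter-sweep control file.
        String controlFile = "alpha=0.1\nalpha=0.2\nalpha=0.3\n";
        for (long[] s : splitByLines(controlFile, 2)) {
            System.out.println("offset=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With N = 1, the default described above, every line becomes its own split and therefore its own map task. In the Hadoop 1.x line this class is believed to read N from the `mapred.line.input.format.linespermap` property in `configure(JobConf)`; treat that property name as an assumption and confirm it against the source of the release in use.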

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
`FileInputFormat.Counter`

Field Summary

Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
`LOG`

Constructor Summary
`NLineInputFormat()`

Method Summary
`void`	`configure(JobConf conf)` Initializes a new instance from a `JobConf`.
`protected static FileSplit`	`createFileSplit(Path fileName, long begin, long length)` NLineInputFormat uses LineRecordReader, which always reads (and consumes) at least one character out of its upper split boundary.
`RecordReader<LongWritable,Text>`	`getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter)` Get the `RecordReader` for the given `InputSplit`.
`InputSplit[]`	`getSplits(JobConf job, int numSplits)` Logically splits the set of input files for the job, with every N lines of input forming one split.

Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
`addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail