org.apache.hadoop.streaming
Class StreamBaseRecordReader

java.lang.Object
  extended by org.apache.hadoop.streaming.StreamBaseRecordReader
All Implemented Interfaces:
RecordReader<Text,Text>
Direct Known Subclasses:
StreamXmlRecordReader

public abstract class StreamBaseRecordReader
extends Object
implements RecordReader<Text,Text>

Shared functionality for hadoopStreaming formats. A custom reader can be defined to be a RecordReader with the constructor below and is selected with the option bin/hadoopStreaming -inputreader ...

See Also:
StreamXmlRecordReader

Field Summary
protected static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
StreamBaseRecordReader(FSDataInputStream in, FileSplit split, Reporter reporter, JobConf job, FileSystem fs)
           
 
Method Summary
 void close()
          Close this to future operations.
 Text createKey()
          Create an object of the appropriate type to be used as a key.
 Text createValue()
          Create an object of the appropriate type to be used as a value.
 long getPos()
          Returns the current position in the input.
 float getProgress()
          How much of the input has the RecordReader consumed i.e.
abstract  boolean next(Text key, Text value)
          Read a record.
abstract  void seekNextRecordBoundary()
          Implementation should seek forward in_ to the first byte of the next record.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

protected static final org.apache.commons.logging.Log LOG
Constructor Detail

StreamBaseRecordReader

public StreamBaseRecordReader(FSDataInputStream in,
                              FileSplit split,
                              Reporter reporter,
                              JobConf job,
                              FileSystem fs)
                       throws IOException
Throws:
IOException
Method Detail

next

public abstract boolean next(Text key,
                             Text value)
                      throws IOException
Read a record. Implementation should call numRecStats at the end

Specified by:
next in interface RecordReader<Text,Text>
Parameters:
key - the key to read data into
value - the value to read data into
Returns:
true iff a key/value was read, false if at EOF
Throws:
IOException

getPos

public long getPos()
            throws IOException
Returns the current position in the input.

Specified by:
getPos in interface RecordReader<Text,Text>
Returns:
the current position in the input.
Throws:
IOException

close

public void close()
           throws IOException
Close this to future operations.

Specified by:
close in interface RecordReader<Text,Text>
Throws:
IOException

getProgress

public float getProgress()
                  throws IOException
Description copied from interface: RecordReader
How much of the input has the RecordReader consumed i.e. has been processed by?

Specified by:
getProgress in interface RecordReader<Text,Text>
Returns:
progress from 0.0 to 1.0.
Throws:
IOException

createKey

public Text createKey()
Description copied from interface: RecordReader
Create an object of the appropriate type to be used as a key.

Specified by:
createKey in interface RecordReader<Text,Text>
Returns:
a new key object.

createValue

public Text createValue()
Description copied from interface: RecordReader
Create an object of the appropriate type to be used as a value.

Specified by:
createValue in interface RecordReader<Text,Text>
Returns:
a new value object.

seekNextRecordBoundary

public abstract void seekNextRecordBoundary()
                                     throws IOException
Implementation should seek forward in_ to the first byte of the next record. The initial byte offset in the stream is arbitrary.

Throws:
IOException


Copyright © 2009 The Apache Software Foundation