org.apache.hadoop.io.file.tfile
Class TFile.Reader

java.lang.Object
  extended by org.apache.hadoop.io.file.tfile.TFile.Reader
All Implemented Interfaces:
Closeable
Enclosing class:
TFile

public static class TFile.Reader
extends Object
implements Closeable

TFile Reader. Users may only read TFiles by creating TFile.Reader.Scanner objects. A scanner may scan the whole TFile (createScanner()), a portion of the TFile based on byte offsets (createScannerByByteRange(long, long)), or a portion of the TFile whose keys fall in a certain key range (for sorted TFiles only; createScannerByKey(byte[], byte[]) or createScannerByKey(RawComparable, RawComparable)).


Nested Class Summary
static class TFile.Reader.Scanner
          The TFile Scanner.
 
Constructor Summary
TFile.Reader(FSDataInputStream fsdis, long fileLength, Configuration conf)
          Constructor
 
Method Summary
 void close()
          Close the reader.
 TFile.Reader.Scanner createScanner()
          Get a scanner that can scan the whole TFile.
 TFile.Reader.Scanner createScanner(byte[] beginKey, byte[] endKey)
          Deprecated. Use createScannerByKey(byte[], byte[]) instead.
 TFile.Reader.Scanner createScanner(RawComparable beginKey, RawComparable endKey)
          Deprecated. Use createScannerByKey(RawComparable, RawComparable) instead.
 TFile.Reader.Scanner createScannerByByteRange(long offset, long length)
          Get a scanner that covers a portion of TFile based on byte offsets.
 TFile.Reader.Scanner createScannerByKey(byte[] beginKey, byte[] endKey)
          Get a scanner that covers a portion of TFile based on keys.
 TFile.Reader.Scanner createScannerByKey(RawComparable beginKey, RawComparable endKey)
          Get a scanner that covers a specific key range.
 TFile.Reader.Scanner createScannerByRecordNum(long beginRecNum, long endRecNum)
          Create a scanner that covers a range of records.
 Comparator<RawComparable> getComparator()
          Get an instance of the RawComparator that is constructed based on the string comparator representation.
 String getComparatorName()
          Get the string representation of the comparator.
 Comparator<TFile.Reader.Scanner.Entry> getEntryComparator()
          Get a Comparator object to compare Entries.
 long getEntryCount()
          Get the number of key-value pair entries in TFile.
 RawComparable getFirstKey()
          Get the first key in the TFile.
 RawComparable getKeyNear(long offset)
          Get a sample key that is within a block whose starting offset is greater than or equal to the specified offset.
 RawComparable getLastKey()
          Get the last key in the TFile.
 DataInputStream getMetaBlock(String name)
          Stream access to a meta block.
 long getRecordNumNear(long offset)
          Get the RecordNum for the first key-value pair in a compressed block whose byte offset in the TFile is greater than or equal to the specified offset.
 boolean isSorted()
          Is the TFile sorted?
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TFile.Reader

public TFile.Reader(FSDataInputStream fsdis,
                    long fileLength,
                    Configuration conf)
             throws IOException
Constructor

Parameters:
fsdis - FS input stream of the TFile.
fileLength - The length of TFile. This is required because we have no easy way of knowing the actual size of the input file through the File input stream.
conf - The configuration object.
Throws:
IOException
Method Detail

close

public void close()
           throws IOException
Close the reader. The state of the Reader object is undefined after close. Calling close() multiple times has no effect.

Specified by:
close in interface Closeable
Throws:
IOException
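
The contract above (repeated close() calls have no effect) is what makes a Closeable safe to use in a try-with-resources statement and still close again defensively. The following is a minimal stdlib-only sketch of that contract, not the Reader's implementation; CloseOnceSketch and its releases counter are hypothetical names introduced for illustration.

```java
import java.io.Closeable;

/** Hypothetical model of a resource whose close() is idempotent. */
final class CloseOnceSketch implements Closeable {
    private boolean closed;
    private int releases;   // how many times the underlying resource was released

    @Override
    public void close() {
        if (closed) {
            return;         // repeated calls are no-ops
        }
        closed = true;
        releases++;         // stands in for releasing the real resource
    }

    /** Number of times the resource was actually released (0 or 1). */
    int releases() {
        return releases;
    }
}
```

With this model, closing through try-with-resources and then calling close() again leaves releases() at 1.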

getComparatorName

public String getComparatorName()
Get the string representation of the comparator.

Returns:
If the TFile is not sorted by keys, an empty string will be returned. Otherwise, the actual comparator string that is provided during the TFile creation time will be returned.

isSorted

public boolean isSorted()
Is the TFile sorted?

Returns:
true if TFile is sorted.

getEntryCount

public long getEntryCount()
Get the number of key-value pair entries in TFile.

Returns:
the number of key-value pairs in TFile

getFirstKey

public RawComparable getFirstKey()
                          throws IOException
Get the first key in the TFile.

Returns:
The first key in the TFile.
Throws:
IOException

getLastKey

public RawComparable getLastKey()
                         throws IOException
Get the last key in the TFile.

Returns:
The last key in the TFile.
Throws:
IOException

getEntryComparator

public Comparator<TFile.Reader.Scanner.Entry> getEntryComparator()
Get a Comparator object to compare Entries. It is useful when you want to store the entries in a collection (such as PriorityQueue) and perform sorting or comparison among entries based on the keys without copying out the key.

Returns:
An Entry Comparator.
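
As a sketch of the PriorityQueue use mentioned above, the following stdlib-only model merges several key-sorted sources through a heap ordered by a key comparator. Entry, Cursor, KEY_ORDER and merge are hypothetical stand-ins, not Hadoop classes, and the unsigned byte-wise key order is an assumption of the sketch. With a real TFile the comparator would be the one returned by getEntryComparator().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class EntryMergeSketch {
    /** Hypothetical stand-in for a scanner entry: a raw key plus a value. */
    static final class Entry {
        final byte[] key;
        final String value;
        Entry(byte[] key, String value) { this.key = key; this.value = value; }
    }

    /** Orders entries by key, comparing bytes as unsigned values (assumed order). */
    static final Comparator<Entry> KEY_ORDER = (a, b) -> {
        int n = Math.min(a.key.length, b.key.length);
        for (int i = 0; i < n; i++) {
            int d = (a.key[i] & 0xff) - (b.key[i] & 0xff);
            if (d != 0) return d;
        }
        return a.key.length - b.key.length;   // shorter key sorts first on a tie
    };

    /** Pairs the current entry of one source with the rest of that source. */
    static final class Cursor {
        final Entry head;
        final Iterator<Entry> rest;
        Cursor(Entry head, Iterator<Entry> rest) { this.head = head; this.rest = rest; }
    }

    /** Merges key-sorted sources into one key-sorted list using a heap of cursors. */
    static List<Entry> merge(List<List<Entry>> sources) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((x, y) -> KEY_ORDER.compare(x.head, y.head));
        for (List<Entry> source : sources) {
            Iterator<Entry> it = source.iterator();
            if (it.hasNext()) heap.add(new Cursor(it.next(), it));
        }
        List<Entry> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();            // smallest current key across sources
            out.add(c.head);
            if (c.rest.hasNext()) heap.add(new Cursor(c.rest.next(), c.rest));
        }
        return out;
    }
}
```

The heap only ever holds one cursor per source, so the comparator is applied to the entries in place rather than to copies of their keys.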

getComparator

public Comparator<RawComparable> getComparator()
Get an instance of the RawComparator that is constructed based on the string comparator representation.

Returns:
a Comparator that can compare RawComparable objects.

getMetaBlock

public DataInputStream getMetaBlock(String name)
                             throws IOException,
                                    MetaBlockDoesNotExist
Stream access to a meta block.

Parameters:
name - The name of the meta block.
Returns:
The input stream.
Throws:
IOException - on I/O error.
MetaBlockDoesNotExist - If the meta block with the name does not exist.

getRecordNumNear

public long getRecordNumNear(long offset)
                      throws IOException
Get the RecordNum for the first key-value pair in a compressed block whose byte offset in the TFile is greater than or equal to the specified offset.

Parameters:
offset - the user supplied offset.
Returns:
the RecordNum of the corresponding entry. If no such entry exists, it returns the total entry count.
Throws:
IOException
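
The lookup rule described above can be modelled with a plain block index. This is a stdlib-only sketch, not the Reader's implementation: the arrays blockOffsets and firstRecNum and the method recordNumNear are hypothetical names standing in for whatever index the TFile keeps internally.

```java
final class RecordNumNearSketch {
    /**
     * Returns the record number of the first record in the first block whose
     * start offset is greater than or equal to offset, or totalRecords if no
     * such block exists.
     *
     * blockOffsets[i] is the start offset of block i, in ascending order;
     * firstRecNum[i] is the record number of the first record in block i.
     */
    static long recordNumNear(long[] blockOffsets, long[] firstRecNum,
                              long totalRecords, long offset) {
        int lo = 0;
        int hi = blockOffsets.length;
        while (lo < hi) {                      // binary search for the lower bound
            int mid = (lo + hi) >>> 1;
            if (blockOffsets[mid] < offset) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo == blockOffsets.length ? totalRecords : firstRecNum[lo];
    }
}
```

For blocks starting at offsets 0, 100 and 250 that begin with records 0, 40 and 90 in a file of 120 records, an offset of 101 maps to record 90 and an offset of 251 maps to 120, the total entry count.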

getKeyNear

public RawComparable getKeyNear(long offset)
                         throws IOException
Get a sample key that is within a block whose starting offset is greater than or equal to the specified offset.

Parameters:
offset - The file offset.
Returns:
the key that fits the requirement; or null if no such key exists (which could happen if the offset is close to the end of the TFile).
Throws:
IOException

createScanner

public TFile.Reader.Scanner createScanner()
                                   throws IOException
Get a scanner that can scan the whole TFile.

Returns:
The scanner object. A valid Scanner is always returned even if the TFile is empty.
Throws:
IOException

createScannerByByteRange

public TFile.Reader.Scanner createScannerByByteRange(long offset,
                                                     long length)
                                              throws IOException
Get a scanner that covers a portion of TFile based on byte offsets.

Parameters:
offset - The beginning byte offset in the TFile.
length - The length of the region.
Returns:
The actual coverage of the returned scanner tries to match the specified byte region but is always rounded up to the compression block boundaries. It is possible that the returned scanner contains zero key-value pairs even if length is positive.
Throws:
IOException
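
To see why a positive length can still yield an empty scanner, consider cutting a file into contiguous byte ranges, as a job splitter might. The sketch below is a stdlib-only model under one loudly stated assumption: a block belongs to the byte range that contains its start offset. That rule is consistent with the rounding described above but is not spelled out in this documentation; blocksCovered and split are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.List;

final class ByteRangeSplitSketch {
    /** Indices of blocks whose start offset lies in [offset, offset + length). */
    static List<Integer> blocksCovered(long[] blockOffsets, long offset, long length) {
        List<Integer> covered = new ArrayList<>();
        for (int i = 0; i < blockOffsets.length; i++) {
            // Assumed rule: a block is covered if its start offset is in range.
            if (blockOffsets[i] >= offset && blockOffsets[i] < offset + length) {
                covered.add(i);
            }
        }
        return covered;
    }

    /** Cuts [0, fileLength) into contiguous byte ranges and maps each to its blocks. */
    static List<List<Integer>> split(long[] blockOffsets, long fileLength, int parts) {
        List<List<Integer>> result = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            long begin = fileLength * p / parts;
            long end = fileLength * (p + 1) / parts;
            result.add(blocksCovered(blockOffsets, begin, end - begin));
        }
        return result;
    }
}
```

With blocks starting at offsets 0, 400 and 900 in a 1000-byte file cut into four ranges, the ranges cover blocks [0], [1], [] and [2]: every block is read exactly once, and the third range is empty although its length is 250.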

createScanner

@Deprecated
public TFile.Reader.Scanner createScanner(byte[] beginKey,
                                                     byte[] endKey)
                                   throws IOException
Deprecated. Use createScannerByKey(byte[], byte[]) instead.

Get a scanner that covers a portion of TFile based on keys.

Parameters:
beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
Returns:
The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
Throws:
IOException

createScannerByKey

public TFile.Reader.Scanner createScannerByKey(byte[] beginKey,
                                               byte[] endKey)
                                        throws IOException
Get a scanner that covers a portion of TFile based on keys.

Parameters:
beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
Returns:
The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
Throws:
IOException
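
The half-open range and the null-bound rules above can be illustrated with a sorted map from the standard library. This is a model of the semantics only, not the scanner itself: KeyRangeSketch and keysInRange are hypothetical names, String keys stand in for raw byte keys, and returning an empty result for an inverted range is a choice of the sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

final class KeyRangeSketch {
    /**
     * Returns the keys k with beginKey <= k < endKey, in sorted order.
     * A null beginKey starts at the first key; a null endKey runs through
     * the last key.
     */
    static List<String> keysInRange(NavigableMap<String, String> sorted,
                                    String beginKey, String endKey) {
        if (beginKey != null && endKey != null && beginKey.compareTo(endKey) >= 0) {
            return new ArrayList<>();          // sketch's choice for an inverted range
        }
        NavigableMap<String, String> view = sorted;
        if (beginKey != null) {
            view = view.tailMap(beginKey, true);    // begin key is inclusive
        }
        if (endKey != null) {
            view = view.headMap(endKey, false);     // end key is exclusive
        }
        return new ArrayList<>(view.keySet());
    }
}
```

For a map with keys a, b, c and d, the range ("b", "d") yields [b, c], (null, "c") yields [a, b], and ("c", null) yields [c, d].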

createScanner

@Deprecated
public TFile.Reader.Scanner createScanner(RawComparable beginKey,
                                                     RawComparable endKey)
                                   throws IOException
Deprecated. Use createScannerByKey(RawComparable, RawComparable) instead.

Get a scanner that covers a specific key range.

Parameters:
beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
Returns:
The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
Throws:
IOException

createScannerByKey

public TFile.Reader.Scanner createScannerByKey(RawComparable beginKey,
                                               RawComparable endKey)
                                        throws IOException
Get a scanner that covers a specific key range.

Parameters:
beginKey - Begin key of the scan (inclusive). If null, scan from the first key-value entry of the TFile.
endKey - End key of the scan (exclusive). If null, scan up to the last key-value entry of the TFile.
Returns:
The actual coverage of the returned scanner will cover all keys greater than or equal to the beginKey and less than the endKey.
Throws:
IOException

createScannerByRecordNum

public TFile.Reader.Scanner createScannerByRecordNum(long beginRecNum,
                                                     long endRecNum)
                                              throws IOException
Create a scanner that covers a range of records.

Parameters:
beginRecNum - The RecordNum for the first record (inclusive).
endRecNum - The RecordNum for the last record (exclusive). To scan the whole file, either specify endRecNum==-1 or endRecNum==getEntryCount().
Returns:
The TFile scanner that covers the specified range of records.
Throws:
IOException
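
Because the record range is half-open, consecutive ranges can share a boundary without overlapping, which makes it straightforward to divide a file among workers. The following stdlib-only sketch shows the arithmetic; resolveEnd and partition are hypothetical helpers, and entryCount stands for the value that getEntryCount() would return.

```java
final class RecordRangeSketch {
    /** Resolves endRecNum, treating -1 as "through the last record". */
    static long resolveEnd(long endRecNum, long entryCount) {
        return endRecNum == -1 ? entryCount : endRecNum;
    }

    /**
     * Cuts entryCount records into parts half-open ranges.
     * ranges[p][0] is the inclusive begin and ranges[p][1] the exclusive end,
     * matching the beginRecNum and endRecNum parameters.
     */
    static long[][] partition(long entryCount, int parts) {
        long[][] ranges = new long[parts][2];
        for (int p = 0; p < parts; p++) {
            ranges[p][0] = entryCount * p / parts;
            ranges[p][1] = entryCount * (p + 1) / parts;
        }
        return ranges;
    }
}
```

For 10 records and 3 parts this gives the ranges [0, 3), [3, 6) and [6, 10); the last end equals the entry count, so it covers the tail of the file just as endRecNum==-1 would.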


Copyright © 2009 The Apache Software Foundation