Interface PositionedReadable

All Known Implementing Classes:
FSDataInputStream, FSInputStream, HdfsDataInputStream

@Public @Evolving public interface PositionedReadable
A stream that permits positional reading. Implementations are required to be thread-safe, either by supporting concurrent access to the data or by using a synchronization mechanism to serialize access. Not all implementations meet this requirement; those that do not cannot be used as a backing store for some applications, such as Apache HBase. Whether or not they are thread-safe, some implementations may expose intermediate state of the system, specifically the position returned by Seekable.getPos().
  • Method Summary

    Modifier and Type
    Method
    Description
    default int
    maxReadSizeForVectorReads()
    What is the largest size that we should group ranges together as?
    default int
    minSeekForVectorReads()
    What is the smallest reasonable seek?
    int
    read(long position, byte[] buffer, int offset, int length)
    Read up to the specified number of bytes, from a given position within a file, and return the number of bytes read.
    void
    readFully(long position, byte[] buffer)
    Read number of bytes equal to the length of the buffer, from a given position within a file.
    void
    readFully(long position, byte[] buffer, int offset, int length)
    Read the specified number of bytes, from a given position within a file.
    default void
    readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate)
    Read fully a list of file ranges asynchronously from this file.
    default void
    readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release)
    Extension of readVectored(List, IntFunction) where a release(buffer) operation may be invoked if problems surface during reads.
  • Method Details

    • read

      int read(long position, byte[] buffer, int offset, int length) throws IOException
      Read up to the specified number of bytes, from a given position within a file, and return the number of bytes read. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
      Parameters:
      position - position within file
      buffer - destination buffer
      offset - offset in the buffer
      length - number of bytes to read
      Returns:
      actual number of bytes read; -1 means "none"
      Throws:
      IOException - IO problems.
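The contract of read() can be illustrated with a small in-memory stand-in. This is a sketch only: `PositionalReadSketch` is a hypothetical class, not part of Hadoop, and a byte array plays the role of the file. It shows the three behaviours described above: a short read near the end of the data, -1 when no bytes are available at the position, and a stream offset that positional reads never modify.

```java
import java.io.IOException;

/** Hypothetical in-memory stand-in for a file; not a Hadoop class. */
final class PositionalReadSketch {
    private final byte[] data;   // immutable "file" contents
    private final long pos = 0;  // stream offset; positional reads never touch it

    PositionalReadSketch(byte[] data) {
        this.data = data.clone();
    }

    /** Current stream offset, analogous to Seekable.getPos(). */
    long getPos() {
        return pos;
    }

    /** Reads up to length bytes starting at position; returns -1 at or past end of data. */
    int read(long position, byte[] buffer, int offset, int length) throws IOException {
        if (position < 0) {
            // Rejecting negative positions this way is a choice made by this sketch.
            throw new IllegalArgumentException("negative position: " + position);
        }
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new IndexOutOfBoundsException("invalid offset/length for buffer");
        }
        if (length == 0) {
            return 0;
        }
        if (position >= data.length) {
            return -1; // nothing to read at this position
        }
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n; // may be fewer bytes than requested
    }
}
```

Because the sketch keeps no mutable state, concurrent calls to read() need no locking. A real implementation backed by a shared connection or cursor would have to provide that guarantee by other means.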
    • readFully

      void readFully(long position, byte[] buffer, int offset, int length) throws IOException
      Read the specified number of bytes, from a given position within a file. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
      Parameters:
      position - position within file
      buffer - destination buffer
      offset - offset in the buffer
      length - number of bytes to read
      Throws:
      IOException - IO problems.
      EOFException - the end of the data was reached before the read operation completed
    • readFully

      void readFully(long position, byte[] buffer) throws IOException
      Read number of bytes equal to the length of the buffer, from a given position within a file. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
      Parameters:
      position - position within file
      buffer - destination buffer
      Throws:
      IOException - IO problems.
      EOFException - the end of the data was reached before the read operation completed
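The relationship between read() and the two readFully() overloads can be sketched as a loop over short reads. The code below is an illustration under assumptions, not the Hadoop implementation: `ReadFullySketch` and its nested `PositionalReader` interface are hypothetical names, and the loop assumes the reader never returns 0 for a non-zero length request.

```java
import java.io.EOFException;
import java.io.IOException;

final class ReadFullySketch {
    /** Hypothetical single-method view of a positional read; mirrors the signature of read(). */
    @FunctionalInterface
    interface PositionalReader {
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /** Keeps reading until exactly length bytes have arrived, or fails with EOFException. */
    static void readFully(PositionalReader in, long position,
                          byte[] buffer, int offset, int length) throws IOException {
        int done = 0;
        while (done < length) {
            // Each call may return fewer bytes than requested, so advance by what arrived.
            int n = in.read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException(
                    "End of data after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }

    /** Convenience overload: fill the whole buffer. */
    static void readFully(PositionalReader in, long position, byte[] buffer) throws IOException {
        readFully(in, position, buffer, 0, buffer.length);
    }
}
```

The two-argument overload simply delegates with offset 0 and the buffer's full length, which matches the description "number of bytes equal to the length of the buffer".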
    • minSeekForVectorReads

      default int minSeekForVectorReads()
      What is the smallest reasonable seek?
      Returns:
      the minimum number of bytes
    • maxReadSizeForVectorReads

      default int maxReadSizeForVectorReads()
      What is the largest size that we should group ranges together as?
      Returns:
      the number of bytes to read at once
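One way these two hints could be used is to coalesce nearby ranges before issuing reads. The following is a hedged sketch, not Hadoop's actual merging logic: `RangeMergeSketch`, `Range`, and `merge` are hypothetical names, and the rule applied (merge when the gap is smaller than the minimum seek and the combined size stays within the maximum read size) is an assumption about how the values are intended to interact. Input ranges are assumed not to overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RangeMergeSketch {
    /** Hypothetical value type: the byte range [offset, offset + length). */
    static final class Range {
        public final long offset;
        public final int length;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        long end() {
            return offset + length;
        }
    }

    /**
     * Merges neighbouring ranges when skipping the gap between them would be a
     * seek shorter than minSeek and the merged read would not exceed maxSize.
     */
    static List<Range> merge(List<Range> ranges, int minSeek, int maxSize) {
        List<Range> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong((Range r) -> r.offset));

        List<Range> merged = new ArrayList<>();
        Range current = null;
        for (Range next : sorted) {
            if (current != null) {
                long gap = next.offset - current.end();
                long combined = next.end() - current.offset;
                if (gap < minSeek && combined <= maxSize) {
                    // Cheaper to read through the gap than to seek over it.
                    current = new Range(current.offset, (int) combined);
                    continue;
                }
                merged.add(current);
            }
            current = next;
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }
}
```

With a minimum seek of 100 bytes and a maximum read size of 1000 bytes, ranges at offsets 0 (100 bytes) and 150 (50 bytes) collapse into one 200-byte read, while a range at offset 5000 stays separate.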
    • readVectored

      default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate) throws IOException
      Read fully a list of file ranges asynchronously from this file. The default iterates through the ranges to read each synchronously, but the intent is that FSDataInputStream subclasses can make more efficient readers. As a result of the call, each range will have FileRange.setData(CompletableFuture) called with a future that when complete will have a ByteBuffer with the data from the file's range.

      The position returned by getPos() after readVectored() is undefined.

      If a file is changed while the readVectored() operation is in progress, the output is undefined. Some ranges may have old data, some may have new and some may have both.

      While a readVectored() operation is in progress, normal read api calls may block.

      Parameters:
      ranges - the byte ranges to read
      allocate - the function to allocate ByteBuffer
      Throws:
      IOException - any IOE.
      IllegalArgumentException - if any of the ranges are invalid, or they overlap.
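The default behaviour described above (iterate through the ranges, read each one synchronously, and attach a future to each range) can be sketched without any Hadoop dependency. Everything named here is hypothetical: `VectoredReadSketch`, its `Range` class standing in for FileRange, and the `PositionalReader` interface standing in for the stream. Range validation and overlap checks are omitted for brevity.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

final class VectoredReadSketch {
    /** Hypothetical single-method view of a positional read. */
    @FunctionalInterface
    interface PositionalReader {
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /** Hypothetical stand-in for FileRange: an offset, a length, and a result future. */
    static final class Range {
        public final long offset;
        public final int length;
        private CompletableFuture<ByteBuffer> data;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        void setData(CompletableFuture<ByteBuffer> data) {
            this.data = data;
        }

        CompletableFuture<ByteBuffer> getData() {
            return data;
        }
    }

    /** Reads each range in turn; every range ends up with a completed future. */
    static void readVectored(PositionalReader in, List<Range> ranges,
                             IntFunction<ByteBuffer> allocate) {
        for (Range range : ranges) {
            CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
            range.setData(result);
            try {
                ByteBuffer buffer = allocate.apply(range.length);
                byte[] tmp = new byte[range.length];
                int done = 0;
                while (done < tmp.length) {
                    int n = in.read(range.offset + done, tmp, done, tmp.length - done);
                    if (n < 0) {
                        throw new EOFException("End of data in range at " + range.offset);
                    }
                    done += n;
                }
                buffer.put(tmp);
                buffer.flip(); // make the bytes readable by the caller
                result.complete(buffer);
            } catch (IOException | RuntimeException e) {
                // A failure in one range is reported through that range's future.
                result.completeExceptionally(e);
            }
        }
    }
}
```

A caller would typically pass `ByteBuffer::allocate` as the allocator and then call `join()` or `get()` on each range's future. An implementation that reads asynchronously would return before the futures complete, which is why callers should not assume the data is available when the method returns.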
    • readVectored

      default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws IOException
      Extension of readVectored(List, IntFunction) where a release(buffer) operation may be invoked if problems surface during reads.

      The release operation is invoked after an IOException to return the active buffer to a pool before the failure is reported in the future.

      The default implementation calls readVectored(List, IntFunction).

      Implementations SHOULD override this method if they can release buffers as part of their error handling.

      Parameters:
      ranges - the byte ranges to read
      allocate - function to allocate ByteBuffer
      release - callable to release a ByteBuffer.
      Throws:
      IOException - any IOE.
      IllegalArgumentException - if any of the ranges are invalid, or they overlap.
      NullPointerException - null arguments.
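The release-on-failure behaviour can be sketched for a single range as follows. This is an assumption-laden illustration rather than Hadoop code: `ReleaseOnFailureSketch`, `PositionalReader`, and `readRange` are hypothetical names. The point shown is the ordering: when the read fails with an IOException, the buffer is handed back through `release` first, and only then is the failure reported in the future.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.IntFunction;

final class ReleaseOnFailureSketch {
    /** Hypothetical single-method view of a positional read. */
    @FunctionalInterface
    interface PositionalReader {
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /** Reads one range; on IOException the buffer is released before the future fails. */
    static CompletableFuture<ByteBuffer> readRange(PositionalReader in, long offset, int length,
                                                   IntFunction<ByteBuffer> allocate,
                                                   Consumer<ByteBuffer> release) {
        Objects.requireNonNull(in, "in");
        Objects.requireNonNull(allocate, "allocate");
        Objects.requireNonNull(release, "release");

        CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
        ByteBuffer buffer = allocate.apply(length);
        try {
            byte[] tmp = new byte[length];
            int done = 0;
            while (done < length) {
                int n = in.read(offset + done, tmp, done, length - done);
                if (n < 0) {
                    throw new EOFException("End of data in range at " + offset);
                }
                done += n;
            }
            buffer.put(tmp);
            buffer.flip();
            result.complete(buffer); // success: the caller now owns the buffer
        } catch (IOException e) {
            release.accept(buffer);  // return the buffer to its pool first
            result.completeExceptionally(e);
        }
        return result;
    }
}
```

On success the buffer is not released, since ownership passes to whoever consumes the future. Pairing `allocate` and `release` with the same pool keeps a failed read from leaking pooled buffers.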