Package org.apache.hadoop.fs
Interface PositionedReadable
All Known Implementing Classes:
FSDataInputStream, FSInputStream, HdfsDataInputStream
@Public
@Evolving
public interface PositionedReadable
Stream that permits positional reading.
Implementations are required to implement thread-safe operations; this may
be supported by concurrent access to the data, or by using a synchronization
mechanism to serialize access.
Not all implementations meet this requirement. Those that do not cannot
be used as a backing store for some applications, such as Apache HBase.
Whether or not they are thread safe, some implementations may make
intermediate state of the system visible, specifically the position
returned by Seekable.getPos().
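The "synchronization mechanism to serialize access" option can be pictured with a small sketch. CursorStream below is a hypothetical in-memory stand-in, not a Hadoop class. Its positional read takes the stream's lock, moves the cursor, reads, and then restores the cursor. Because getPos() is deliberately left unsynchronized here, another thread could observe the temporarily moved offset, which is the kind of intermediate state the note above refers to.

```java
import java.nio.charset.StandardCharsets;

public class SerializedPositionalRead {
    /** Hypothetical in-memory stream with a cursor; a stand-in for a real file stream. */
    static class CursorStream {
        private final byte[] data;
        // Current offset. Volatile so the unsynchronized getPos() below sees updates.
        private volatile int pos;

        CursorStream(byte[] data) {
            this.data = data;
        }

        /** Not synchronized: a concurrent caller may observe a temporarily moved offset. */
        long getPos() {
            return pos;
        }

        synchronized void seek(long newPos) {
            pos = (int) newPos;
        }

        /** Sequential read from the current offset; advances the offset. */
        synchronized int read(byte[] buffer, int offset, int length) {
            int start = pos;
            if (start >= data.length) {
                return -1;
            }
            int n = Math.min(length, data.length - start);
            System.arraycopy(data, start, buffer, offset, n);
            pos = start + n;
            return n;
        }

        /** Positional read: holds the lock, moves the cursor, reads, then restores it. */
        synchronized int read(long position, byte[] buffer, int offset, int length) {
            long oldPos = getPos();
            try {
                seek(position);
                return read(buffer, offset, length);
            } finally {
                seek(oldPos);
            }
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "hello, positional world".getBytes(StandardCharsets.US_ASCII);
        CursorStream in = new CursorStream(bytes);
        in.seek(3);
        byte[] buf = new byte[10];
        int n = in.read(7, buf, 0, 10);
        System.out.println(new String(buf, 0, n, StandardCharsets.US_ASCII)); // positional
        System.out.println(in.getPos()); // 3
    }
}
```

An implementation that supports truly concurrent access to the data would avoid the shared cursor entirely, so it would not need the lock.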
Method Summary
Modifier and Type, Method, Description
default int maxReadSizeForVectorReads()
    What is the largest size that we should group ranges together as?
default int minSeekForVectorReads()
    What is the smallest reasonable seek?
int read(long position, byte[] buffer, int offset, int length)
    Read up to the specified number of bytes, from a given position within a file, and return the number of bytes read.
void readFully(long position, byte[] buffer)
    Read number of bytes equal to the length of the buffer, from a given position within a file.
void readFully(long position, byte[] buffer, int offset, int length)
    Read the specified number of bytes, from a given position within a file.
default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate)
    Read fully a list of file ranges asynchronously from this file.
default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release)
    Extension of readVectored(List, IntFunction) where a release(buffer) operation may be invoked if problems surface during reads.
Method Details
read
int read(long position, byte[] buffer, int offset, int length) throws IOException
Read up to the specified number of bytes, from a given position within a file, and return the number of bytes read. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
Parameters:
position - position within file
buffer - destination buffer
offset - offset in the buffer
length - number of bytes to read
Returns:
actual number of bytes read; -1 means "none"
Throws:
IOException - IO problems.
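A caller has to handle both a short read and the -1 result. The sketch below shows one way to do that. PositionalSource is a local stand-in that mirrors the read signature documented here, and inMemory and readText are hypothetical helpers written only so the example runs on its own; none of them is part of Hadoop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class PositionalReadDemo {
    /** Local stand-in that mirrors the documented read signature; not the Hadoop interface. */
    interface PositionalSource {
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /** Hypothetical in-memory source, used only so the sketch is self-contained. */
    static PositionalSource inMemory(byte[] data) {
        return (position, buffer, offset, length) -> {
            if (position >= data.length) {
                return -1; // nothing available at or beyond this position
            }
            int n = (int) Math.min(length, data.length - position);
            System.arraycopy(data, (int) position, buffer, offset, n);
            return n;
        };
    }

    /** Reads up to length bytes at position and decodes them; returns null when read gives -1. */
    static String readText(PositionalSource in, long position, int length) throws IOException {
        byte[] buffer = new byte[length];
        int n = in.read(position, buffer, 0, length);
        return n < 0 ? null : new String(buffer, 0, n, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        PositionalSource in = inMemory("0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readText(in, 2, 4));  // 2345 (all four bytes were available)
        System.out.println(readText(in, 8, 4));  // 89   (short read near the end)
        System.out.println(readText(in, 10, 4)); // null (read returned -1)
    }
}
```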
readFully
void readFully(long position, byte[] buffer, int offset, int length) throws IOException
Read the specified number of bytes, from a given position within a file. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
Parameters:
position - position within file
buffer - destination buffer
offset - offset in the buffer
length - number of bytes to read
Throws:
IOException - IO problems.
EOFException - the end of the data was reached before the read operation completed
readFully
void readFully(long position, byte[] buffer) throws IOException
Read number of bytes equal to the length of the buffer, from a given position within a file. This does not change the current offset of a file, and is thread-safe. Warning: Not all filesystems satisfy the thread-safety requirement.
Parameters:
position - position within file
buffer - destination buffer
Throws:
IOException - IO problems.
EOFException - the end of the data was reached before the read operation completed
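The relationship between read and the two readFully variants can be sketched as a loop that keeps issuing positional reads until the requested byte count is reached, and raises EOFException if the data runs out first. This is a minimal illustration, not the code of any particular Hadoop stream. PositionalSource is a local stand-in for the read signature, and trickle is a hypothetical source that returns at most three bytes per call so that the loop is exercised.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ReadFullySketch {
    /** Local stand-in that mirrors the documented read signature; not the Hadoop interface. */
    interface PositionalSource {
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /** Loops over positional reads until exactly length bytes have been copied. */
    static void readFully(PositionalSource in, long position, byte[] buffer, int offset, int length)
            throws IOException {
        int done = 0;
        while (done < length) {
            int n = in.read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("End of data after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }

    /** Fills the whole buffer, matching the two-argument variant. */
    static void readFully(PositionalSource in, long position, byte[] buffer) throws IOException {
        readFully(in, position, buffer, 0, buffer.length);
    }

    /** Hypothetical source that returns at most 3 bytes per call, to force short reads. */
    static PositionalSource trickle(byte[] data) {
        return (position, buffer, offset, length) -> {
            if (position >= data.length) {
                return -1;
            }
            int n = (int) Math.min(Math.min(3, length), data.length - position);
            System.arraycopy(data, (int) position, buffer, offset, n);
            return n;
        };
    }

    public static void main(String[] args) throws IOException {
        PositionalSource in = trickle("0123456789".getBytes(StandardCharsets.US_ASCII));
        byte[] buf = new byte[8];
        readFully(in, 1, buf);
        System.out.println(new String(buf, StandardCharsets.US_ASCII)); // 12345678
        try {
            readFully(in, 8, new byte[5]); // only 2 bytes remain from position 8
        } catch (EOFException e) {
            System.out.println("EOF: " + e.getMessage());
        }
    }
}
```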
minSeekForVectorReads
default int minSeekForVectorReads()
What is the smallest reasonable seek?
Returns:
the minimum number of bytes
maxReadSizeForVectorReads
default int maxReadSizeForVectorReads()
What is the largest size that we should group ranges together as?
Returns:
the number of bytes to read at once
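One plausible use of these two hints is range coalescing: neighbouring ranges are merged into a single read when the gap between them is smaller than the minimum seek, provided the merged read stays within the maximum read size. The sketch below illustrates that idea only. The Range class, the merge method, and the threshold values are assumptions made for this example and are not taken from the Hadoop implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RangeMergeSketch {
    /** Hypothetical half-open byte range [offset, offset + length). */
    static final class Range {
        final long offset;
        final int length;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        long end() {
            return offset + length;
        }

        @Override
        public String toString() {
            return offset + "+" + length;
        }
    }

    /**
     * Sorts ranges by offset and merges neighbours whose gap is below minSeek,
     * as long as the merged span does not exceed maxSize bytes.
     */
    static List<Range> merge(List<Range> ranges, int minSeek, int maxSize) {
        List<Range> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
        List<Range> merged = new ArrayList<>();
        Range current = null;
        for (Range next : sorted) {
            if (current != null
                    && next.offset - current.end() < minSeek
                    && next.end() - current.offset <= maxSize) {
                long end = Math.max(current.end(), next.end());
                current = new Range(current.offset, (int) (end - current.offset));
            } else {
                if (current != null) {
                    merged.add(current);
                }
                current = next;
            }
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(
                new Range(10_000, 100), new Range(0, 100), new Range(150, 100));
        // Illustrative thresholds: a 4 KiB minimum seek and a 1 MiB maximum read.
        System.out.println(merge(ranges, 4096, 1024 * 1024)); // [0+250, 10000+100]
        // A tiny maximum read size prevents the first two ranges from merging.
        System.out.println(merge(ranges, 4096, 200)); // [0+100, 150+100, 10000+100]
    }
}
```

A merged read also fetches the bytes in the gap, so a reader built this way would need to slice the combined buffer back into the originally requested ranges.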
readVectored
default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate) throws IOException
Read fully a list of file ranges asynchronously from this file. The default implementation iterates through the ranges and reads each one synchronously, but the intent is that FSDataInputStream subclasses can provide more efficient readers. As a result of the call, each range will have FileRange.setData(CompletableFuture) called with a future that, when complete, will hold a ByteBuffer with the data from the file's range.
The position returned by getPos() after readVectored() is undefined.
If a file is changed while the readVectored() operation is in progress, the output is undefined. Some ranges may have old data, some may have new data, and some may have both.
While a readVectored() operation is in progress, normal read API calls may block.
Parameters:
ranges - the byte ranges to read
allocate - the function to allocate ByteBuffer
Throws:
IOException - any IOE.
IllegalArgumentException - if any of the ranges are invalid, or they overlap.
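The contract described above (one future per range, each completed with a buffer holding that range's bytes) can be sketched with a synchronous loop, similar in spirit to the default behaviour. Everything here is a simplified stand-in: Range is a hypothetical substitute for FileRange, a byte array plays the role of the file, and range validation such as the overlap check is omitted.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

public class VectoredReadSketch {
    /** Hypothetical stand-in for a file range: offset, length, and a slot for the result. */
    static final class Range {
        final long offset;
        final int length;
        private CompletableFuture<ByteBuffer> data;

        Range(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }

        void setData(CompletableFuture<ByteBuffer> data) {
            this.data = data;
        }

        CompletableFuture<ByteBuffer> getData() {
            return data;
        }
    }

    /** Reads each range in turn and completes its future; failures go into the future. */
    static void readVectored(byte[] file, List<Range> ranges, IntFunction<ByteBuffer> allocate) {
        for (Range range : ranges) {
            CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
            range.setData(result);
            try {
                if (range.offset < 0 || range.offset + range.length > file.length) {
                    throw new EOFException("Range extends past end of data");
                }
                ByteBuffer buffer = allocate.apply(range.length);
                buffer.put(file, (int) range.offset, range.length);
                buffer.flip(); // make the bytes just written readable
                result.complete(buffer);
            } catch (IOException e) {
                result.completeExceptionally(e);
            }
        }
    }

    /** Decodes a buffer without disturbing its position. */
    static String text(ByteBuffer buffer) {
        return StandardCharsets.US_ASCII.decode(buffer.duplicate()).toString();
    }

    public static void main(String[] args) {
        byte[] file = "hello, world".getBytes(StandardCharsets.US_ASCII);
        List<Range> ranges = Arrays.asList(new Range(0, 5), new Range(7, 5), new Range(10, 5));
        readVectored(file, ranges, ByteBuffer::allocate);
        for (Range r : ranges) {
            CompletableFuture<ByteBuffer> f = r.getData();
            // Prints hello, then world, then failed (the last range runs past the end).
            System.out.println(f.isCompletedExceptionally() ? "failed" : text(f.join()));
        }
    }
}
```

The allocate function lets the caller choose heap or direct buffers, for example ByteBuffer::allocateDirect instead of ByteBuffer::allocate.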
readVectored
default void readVectored(List<? extends org.apache.hadoop.fs.FileRange> ranges, IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) throws IOException
Extension of readVectored(List, IntFunction) where a release(buffer) operation may be invoked if problems surface during reads.
The release operation is invoked after an IOException to return the active buffer to a pool before reporting a failure in the future.
The default implementation calls readVectored(List, IntFunction).
Implementations SHOULD override this method if they can release buffers as part of their error handling.
Parameters:
ranges - the byte ranges to read
allocate - function to allocate ByteBuffer
release - callable to release a ByteBuffer.
Throws:
IOException - any IOE.
IllegalArgumentException - if any of ranges are invalid, or they overlap.
NullPointerException - null arguments.
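The ordering described above (release the buffer first, then report the failure through the future) can be sketched for a single range. readRange is a hypothetical helper over an in-memory byte array, not a Hadoop method, and the list named returned stands in for a real buffer pool.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.IntFunction;

public class ReleaseOnFailureSketch {
    /**
     * Reads one range into an allocated buffer. On an IOException the buffer is
     * handed to release before the future is completed exceptionally.
     */
    static CompletableFuture<ByteBuffer> readRange(byte[] file, long offset, int length,
            IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) {
        CompletableFuture<ByteBuffer> result = new CompletableFuture<>();
        ByteBuffer buffer = allocate.apply(length);
        try {
            if (offset < 0 || offset + length > file.length) {
                throw new EOFException("Range extends past end of data");
            }
            buffer.put(file, (int) offset, length);
            buffer.flip();
            result.complete(buffer);
        } catch (IOException e) {
            release.accept(buffer);          // return the buffer to its owner first
            result.completeExceptionally(e); // then surface the failure to the caller
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] file = new byte[16];
        List<ByteBuffer> returned = new ArrayList<>(); // stand-in for a buffer pool
        CompletableFuture<ByteBuffer> ok =
                readRange(file, 0, 8, ByteBuffer::allocate, returned::add);
        CompletableFuture<ByteBuffer> bad =
                readRange(file, 12, 8, ByteBuffer::allocate, returned::add);
        System.out.println(ok.isCompletedExceptionally());  // false
        System.out.println(bad.isCompletedExceptionally()); // true
        System.out.println(returned.size());                // 1
    }
}
```

On success the buffer belongs to the caller, so release is not invoked for it in this sketch.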