org.apache.hadoop.io.compress
Interface Compressor

All Known Implementing Classes:
BuiltInZlibDeflater, BZip2DummyCompressor, SnappyCompressor, ZlibCompressor

public interface Compressor

Specification of a stream-based 'compressor' which can be plugged into a CompressionOutputStream to compress data. This is modelled after java.util.zip.Deflater.
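
The call sequence this interface implies can be sketched with java.util.zip.Deflater, the class it is modelled after. The sketch below does not use Hadoop: the class name DeflaterLoopSketch and its helper methods are hypothetical, and Deflater.deflate() stands in for compress(). A real Compressor implementation may differ in detail.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: java.util.zip.Deflater stands in for a Compressor implementation.
class DeflaterLoopSketch {

    // Compresses the whole input, draining the output through a small buffer.
    static byte[] compressAll(byte[] input) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            deflater.setInput(input, 0, input.length);        // cf. setInput()
            deflater.finish();                                // cf. finish()
            while (!deflater.finished()) {                    // cf. finished()
                int n = deflater.deflate(buf, 0, buf.length); // cf. compress()
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();                                   // cf. end()
        }
        return out.toByteArray();
    }

    // Inflates the result again; present only to check the round trip.
    static byte[] decompressAll(byte[] compressed) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            inflater.setInput(compressed, 0, compressed.length);
            while (!inflater.finished()) {
                int n = inflater.inflate(buf, 0, buf.length);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-based stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compressAll(input);
        System.out.println("round trip ok: " + Arrays.equals(input, decompressAll(packed)));
    }
}
```

In Hadoop the same loop is normally driven by a CompressionOutputStream rather than by application code.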


Method Summary
 int compress(byte[] b, int off, int len)
          Fills specified buffer with compressed data.
 void end()
          Closes the compressor and discards any unprocessed input.
 void finish()
          When called, indicates that compression should end with the current contents of the input buffer.
 boolean finished()
          Returns true if the end of the compressed data output stream has been reached.
 long getBytesRead()
          Returns the number of uncompressed bytes input so far.
 long getBytesWritten()
          Returns the number of compressed bytes output so far.
 boolean needsInput()
          Returns true if the input data buffer is empty and setInput() should be called to provide more input.
 void reinit(Configuration conf)
          Prepares the compressor to be used in a new stream with settings defined in the given Configuration.
 void reset()
          Resets compressor so that a new set of input data can be processed.
 void setDictionary(byte[] b, int off, int len)
          Sets preset dictionary for compression.
 void setInput(byte[] b, int off, int len)
          Sets input data for compression.
 

Method Detail

setInput

void setInput(byte[] b,
              int off,
              int len)
Sets input data for compression. This should be called whenever needsInput() returns true, indicating that more input data is required.

Parameters:
b - Input data
off - Start offset
len - Length

needsInput

boolean needsInput()
Returns true if the input data buffer is empty and setInput() should be called to provide more input.

Returns:
true if the input data buffer is empty and setInput() should be called in order to provide more input.

setDictionary

void setDictionary(byte[] b,
                   int off,
                   int len)
Sets preset dictionary for compression. A preset dictionary is used when the history buffer can be predetermined.

Parameters:
b - Dictionary data bytes
off - Start offset
len - Length
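
As an illustration of what a preset dictionary does, the sketch below uses java.util.zip.Deflater and Inflater rather than a Hadoop Compressor. The class PresetDictionarySketch and its methods are hypothetical names. With the zlib format, the decompressing side reports needsDictionary() and must be handed the same dictionary bytes; whether a given Hadoop codec supports dictionaries at all is implementation-specific.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch of preset-dictionary use with java.util.zip.
class PresetDictionarySketch {

    static byte[] deflateWithDictionary(byte[] input, byte[] dict) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            // The dictionary is set before any data is compressed.
            deflater.setDictionary(dict, 0, dict.length);
            deflater.setInput(input, 0, input.length);
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(buf, 0, buf.length);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    static byte[] inflateWithDictionary(byte[] compressed, byte[] dict) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            inflater.setInput(compressed, 0, compressed.length);
            while (!inflater.finished()) {
                int n = inflater.inflate(buf, 0, buf.length);
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (inflater.needsDictionary()) {
                    // The stream header asks for the dictionary used at compression time.
                    inflater.setDictionary(dict, 0, dict.length);
                } else if (inflater.needsInput()) {
                    throw new IllegalStateException("truncated stream");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] dict = "status=ok;user=;role=admin".getBytes(StandardCharsets.UTF_8);
        byte[] input = "status=ok;user=alice;role=admin".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflateWithDictionary(input, dict);
        byte[] back = inflateWithDictionary(packed, dict);
        System.out.println("round trip ok: " + Arrays.equals(input, back));
    }
}
```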

getBytesRead

long getBytesRead()
Returns the number of uncompressed bytes input so far.


getBytesWritten

long getBytesWritten()
Returns the number of compressed bytes output so far.
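
The two counters can be illustrated with java.util.zip.Deflater, which exposes methods of the same names. This is a sketch under that assumption, not Hadoop code; ByteCountersSketch and compressAndCount are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical sketch: reading the input and output byte counters of a Deflater.
class ByteCountersSketch {

    // Returns { uncompressed bytes consumed, compressed bytes produced, bytes copied out by the loop }.
    static long[] compressAndCount(byte[] input) {
        Deflater deflater = new Deflater();
        byte[] buf = new byte[32];
        long copied = 0;
        try {
            deflater.setInput(input, 0, input.length);
            deflater.finish();
            while (!deflater.finished()) {
                copied += deflater.deflate(buf, 0, buf.length);
            }
            // Read the counters before end(); Deflater's behavior after end() is undefined.
            return new long[] { deflater.getBytesRead(), deflater.getBytesWritten(), copied };
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] input = "aaaaaaaaaabbbbbbbbbbaaaaaaaaaa".getBytes(StandardCharsets.UTF_8);
        long[] counts = compressAndCount(input);
        System.out.println("read=" + counts[0] + " written=" + counts[1]);
    }
}
```

Once all input has been consumed, the read counter equals the input length and the written counter equals the number of bytes the caller has drained.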


finish

void finish()
When called, indicates that compression should end with the current contents of the input buffer.


finished

boolean finished()
Returns true if the end of the compressed data output stream has been reached.

Returns:
true if the end of the compressed data output stream has been reached.

compress

int compress(byte[] b,
             int off,
             int len)
             throws IOException
Fills the specified buffer with compressed data and returns the actual number of bytes of compressed data written. A return value of 0 indicates that needsInput() should be called in order to determine if more input data is required.

Parameters:
b - Buffer for the compressed data
off - Start offset of the data
len - Size of the buffer
Returns:
The actual number of bytes of compressed data.
Throws:
IOException
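
The interplay between a return value of 0 and needsInput() can be sketched as follows. java.util.zip.Deflater is used as a stand-in, with deflate() playing the role of compress(); ChunkedFeedSketch, compressInChunks, and the chunk size are hypothetical. Input is supplied a chunk at a time, and a new chunk is handed over only when the call produced nothing and needsInput() is true.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: feeding input in chunks, driven by the 0-return / needsInput() protocol.
class ChunkedFeedSketch {

    static byte[] compressInChunks(byte[] input, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[32];
        int pos = 0;
        try {
            while (!deflater.finished()) {
                int n = deflater.deflate(buf, 0, buf.length); // cf. compress()
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (deflater.needsInput()) {
                    if (pos < input.length) {
                        // Nothing produced and the input buffer is empty: supply the next chunk.
                        int len = Math.min(chunkSize, input.length - pos);
                        deflater.setInput(input, pos, len);
                        pos += len;
                    } else {
                        // No more data: ask the compressor to flush and terminate the stream.
                        deflater.finish();
                    }
                }
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    // Inflates the result again; present only to check the round trip.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[32];
        try {
            inflater.setInput(compressed, 0, compressed.length);
            while (!inflater.finished()) {
                int n = inflater.inflate(buf, 0, buf.length);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-based stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] input = "the quick brown fox jumps over the lazy dog".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compressInChunks(input, 7);
        System.out.println("round trip ok: " + Arrays.equals(input, inflate(packed)));
    }
}
```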

reset

void reset()
Resets compressor so that a new set of input data can be processed.
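
Reusing one instance for several independent inputs might look like the sketch below. It again relies on java.util.zip.Deflater, whose reset() has the same intent; ResetReuseSketch and its methods are hypothetical names, and a Hadoop Compressor may carry additional state that reset() also clears.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: one Deflater instance reused for two independent streams via reset().
class ResetReuseSketch {

    // Finishes the current stream and collects all of its output.
    private static byte[] finishAndDrain(Deflater deflater) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        deflater.finish();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf, 0, buf.length);
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    static byte[][] compressTwo(byte[] first, byte[] second) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(first, 0, first.length);
            byte[] a = finishAndDrain(deflater);
            deflater.reset(); // clears the finished state so new input can be processed
            deflater.setInput(second, 0, second.length);
            byte[] b = finishAndDrain(deflater);
            return new byte[][] { a, b };
        } finally {
            deflater.end(); // release resources only once the instance is no longer needed
        }
    }

    // Inflates one stream; present only to check the round trip.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            inflater.setInput(compressed, 0, compressed.length);
            while (!inflater.finished()) {
                int n = inflater.inflate(buf, 0, buf.length);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or dictionary-based stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] first = "first record, first record".getBytes(StandardCharsets.UTF_8);
        byte[] second = "second record, second record".getBytes(StandardCharsets.UTF_8);
        byte[][] packed = compressTwo(first, second);
        System.out.println("first ok: " + Arrays.equals(first, inflate(packed[0])));
        System.out.println("second ok: " + Arrays.equals(second, inflate(packed[1])));
    }
}
```

Each compressed result is a complete stream on its own, so the two can be decompressed independently.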


end

void end()
Closes the compressor and discards any unprocessed input.


reinit

void reinit(Configuration conf)
Prepares the compressor to be used in a new stream with settings defined in the given Configuration.

Parameters:
conf - Configuration from which new settings are fetched


Copyright © 2009 The Apache Software Foundation