java.lang.Object
  java.io.OutputStream
    org.apache.hadoop.io.compress.bzip2.CBZip2OutputStream
public class CBZip2OutputStream
An output stream that compresses data into the BZip2 format (without the file header characters) and writes it to another stream.
The compression requires large amounts of memory, so you should call the close() method as soon as possible to force CBZip2OutputStream to release the allocated memory.
You can reduce the amount of allocated memory, and possibly raise the compression speed, by choosing a lower blocksize, which in turn may lower the compression ratio. You can avoid unnecessary memory allocation by not using a blocksize that is bigger than the size of the input.
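The advice about not picking a blocksize larger than the input can be sketched as a small helper. This is a minimal illustration, not the library's own logic: the class `BlockSizeSketch` and the method `smallestCoveringBlockSize` are hypothetical names, and the sketch assumes that one blocksize unit (100k) corresponds to 100,000 bytes of input. Treating a negative length as "unknown" and falling back to the maximum follows the behavior documented for chooseBlockSize(long).

```java
/** Hypothetical helper for illustration; not part of CBZip2OutputStream. */
final class BlockSizeSketch {
    static final int MIN_BLOCKSIZE = 1; // documented minimum, in 100k units
    static final int MAX_BLOCKSIZE = 9; // documented maximum, in 100k units

    // Assumption: one blocksize unit covers 100,000 bytes of input.
    static final long BYTES_PER_UNIT = 100_000L;

    /** Returns the smallest blocksize (1..9) whose block is at least as large as the input. */
    static int smallestCoveringBlockSize(long inputLength) {
        if (inputLength < 0) {
            return MAX_BLOCKSIZE; // unknown length: fall back to the maximum
        }
        if (inputLength >= MAX_BLOCKSIZE * BYTES_PER_UNIT) {
            return MAX_BLOCKSIZE; // also avoids overflow in the rounding below
        }
        long units = (inputLength + BYTES_PER_UNIT - 1) / BYTES_PER_UNIT; // round up
        return (int) Math.max(MIN_BLOCKSIZE, units);
    }

    public static void main(String[] args) {
        long[] samples = {0L, 50_000L, 100_001L, 5_000_000L, -1L};
        for (long length : samples) {
            System.out.println(length + " bytes -> blocksize " + smallestCoveringBlockSize(length));
        }
    }
}
```

The thresholds used by the real chooseBlockSize(long) are not stated on this page and may differ from this rounding rule.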
You can compute the memory usage for compression with the following formula: 400k + (9 * blocksize).
To get the memory required for decompression by CBZip2InputStream, use 65k + (5 * blocksize).
Memory usage by blocksize

Blocksize | Compression memory usage | Decompression memory usage |
---|---|---|
100k | 1300k | 565k |
200k | 2200k | 1065k |
300k | 3100k | 1565k |
400k | 4000k | 2065k |
500k | 4900k | 2565k |
600k | 5800k | 3065k |
700k | 6700k | 3565k |
800k | 7600k | 4065k |
900k | 8500k | 4565k |
For decompression, CBZip2InputStream allocates less memory if the bzipped input is smaller than one block.
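The two memory formulas can be checked against the table with a few lines of code. The sketch below is illustrative only: `BZip2MemoryEstimate` and its method names are hypothetical, the blocksize argument is taken in 100k units (1 to 9), and results are expressed in the same "k" unit the table uses.

```java
/** Hypothetical helper that evaluates the documented memory formulas. */
final class BZip2MemoryEstimate {

    /** Compression estimate in k: 400k + (9 * blocksize). */
    static int compressionMemoryK(int blockSize100k) {
        checkRange(blockSize100k);
        return 400 + 9 * (blockSize100k * 100);
    }

    /** Decompression estimate in k: 65k + (5 * blocksize). */
    static int decompressionMemoryK(int blockSize100k) {
        checkRange(blockSize100k);
        return 65 + 5 * (blockSize100k * 100);
    }

    private static void checkRange(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blocksize must be between 1 and 9, got " + blockSize100k);
        }
    }

    public static void main(String[] args) {
        // Prints one line per supported blocksize, in the same order as the table.
        for (int b = 1; b <= 9; b++) {
            System.out.println((b * 100) + "k | " + compressionMemoryK(b) + "k | " + decompressionMemoryK(b) + "k");
        }
    }
}
```

As the preceding sentence notes, the decompression figure is an upper estimate: less memory is allocated when the input is smaller than one block.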
Instances of this class are not threadsafe.
TODO: Update to BZip2 1.0.1
Field Summary | |
---|---|
protected static int | CLEARMASK - This constant is accessible by subclasses for historical purposes. |
protected static int | DEPTH_THRESH - This constant is accessible by subclasses for historical purposes. |
protected static int | GREATER_ICOST - This constant is accessible by subclasses for historical purposes. |
protected static int | LESSER_ICOST - This constant is accessible by subclasses for historical purposes. |
static int | MAX_BLOCKSIZE - The maximum supported blocksize == 9. |
static int | MIN_BLOCKSIZE - The minimum supported blocksize == 1. |
protected static int | QSORT_STACK_SIZE - This constant is accessible by subclasses for historical purposes. |
protected static int | SETMASK - This constant is accessible by subclasses for historical purposes. |
protected static int | SMALL_THRESH - This constant is accessible by subclasses for historical purposes. |
protected static int | WORK_FACTOR - This constant is accessible by subclasses for historical purposes. |
Fields inherited from interface org.apache.hadoop.io.compress.bzip2.BZip2Constants |
---|
baseBlockSize, END_OF_BLOCK, END_OF_STREAM, G_SIZE, MAX_ALPHA_SIZE, MAX_CODE_LEN, MAX_SELECTORS, N_GROUPS, N_ITERS, NUM_OVERSHOOT_BYTES, rNums, RUNA, RUNB |
Constructor Summary | |
---|---|
CBZip2OutputStream(OutputStream out) | Constructs a new CBZip2OutputStream with a blocksize of 900k. |
CBZip2OutputStream(OutputStream out, int blockSize) | Constructs a new CBZip2OutputStream with the specified blocksize. |
Method Summary | |
---|---|
static int | chooseBlockSize(long inputLength) - Chooses a blocksize based on the given length of the data to compress. |
void | close() |
protected void | finalize() - Overridden to close the stream. |
void | finish() |
void | flush() |
int | getBlockSize() - Returns the blocksize parameter specified at construction time. |
protected static void | hbMakeCodeLengths(char[] len, int[] freq, int alphaSize, int maxLen) - This method is accessible by subclasses for historical purposes. |
void | write(byte[] buf, int offs, int len) |
void | write(int b) |
Methods inherited from class java.io.OutputStream |
---|
write |
Methods inherited from class java.lang.Object |
---|
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int MIN_BLOCKSIZE
public static final int MAX_BLOCKSIZE
protected static final int SETMASK
protected static final int CLEARMASK
protected static final int GREATER_ICOST
protected static final int LESSER_ICOST
protected static final int SMALL_THRESH
protected static final int DEPTH_THRESH
protected static final int WORK_FACTOR
protected static final int QSORT_STACK_SIZE
If you are ever unlucky enough to get a stack overflow whilst sorting, increase this constant and try again. In practice I have never seen the stack go above 27 elements, so this limit seems very generous.
Constructor Detail |
---|
public CBZip2OutputStream(OutputStream out) throws IOException
Attention: The caller is responsible for writing the two BZip2 magic bytes "BZ" to the specified stream prior to calling this constructor.
Parameters:
out - the destination stream.
Throws:
IOException - if an I/O error occurs in the specified stream.
NullPointerException - if out == null.

public CBZip2OutputStream(OutputStream out, int blockSize) throws IOException
Attention: The caller is responsible for writing the two BZip2 magic bytes "BZ" to the specified stream prior to calling this constructor.
Parameters:
out - the destination stream.
blockSize - the blockSize in 100k units.
Throws:
IOException - if an I/O error occurs in the specified stream.
IllegalArgumentException - if (blockSize < 1) || (blockSize > 9).
NullPointerException - if out == null.
See Also:
MIN_BLOCKSIZE, MAX_BLOCKSIZE
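The calling sequence the constructors require (write "BZ" first, then wrap the destination, then close promptly) might look like the sketch below. To keep it compilable with the JDK alone, the compressor is supplied through a hypothetical `CompressorFactory` interface, and the demo in `main` substitutes a plain `BufferedOutputStream` as a stand-in that passes bytes through unchanged. `BZip2HeaderSketch` and `writeWithMagic` are illustrative names, not part of this API.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the calling sequence; not part of the Hadoop API. */
final class BZip2HeaderSketch {

    /** Hypothetical factory so the sketch runs without Hadoop on the classpath. */
    interface CompressorFactory {
        OutputStream wrap(OutputStream dest) throws IOException;
    }

    /** Writes the "BZ" magic bytes, then the data through the stream the factory creates. */
    static void writeWithMagic(OutputStream dest, byte[] data, CompressorFactory factory)
            throws IOException {
        // The caller, not the compressor, writes the two magic bytes first.
        dest.write('B');
        dest.write('Z');
        // try-with-resources closes the wrapping stream as soon as the data is
        // written, which is when the compressor is documented to release its memory.
        try (OutputStream compressed = factory.wrap(dest)) {
            compressed.write(data, 0, data.length);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream dest = new ByteArrayOutputStream();
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        // Stand-in factory: BufferedOutputStream does not compress anything.
        writeWithMagic(dest, data, BufferedOutputStream::new);
        byte[] result = dest.toByteArray();
        System.out.println("header: " + (char) result[0] + (char) result[1]
                + ", payload bytes: " + (result.length - 2));
    }
}
```

With Hadoop on the classpath, the factory argument would presumably be `out -> new CBZip2OutputStream(out, 9)`, using the two-argument constructor documented above. Note that the stand-in closes the destination stream when it is closed; this page does not say whether CBZip2OutputStream does the same.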
Method Detail |
---|
protected static void hbMakeCodeLengths(char[] len, int[] freq, int alphaSize, int maxLen)

public static int chooseBlockSize(long inputLength)
Parameters:
inputLength - The length of the data which will be compressed by CBZip2OutputStream.
Returns:
A blocksize between MIN_BLOCKSIZE and MAX_BLOCKSIZE, both inclusive. For a negative inputLength this method always returns MAX_BLOCKSIZE.

public void write(int b) throws IOException
Specified by:
write in class OutputStream
Throws:
IOException

protected void finalize() throws Throwable
Overrides:
finalize in class Object
Throws:
Throwable

public void finish() throws IOException
Throws:
IOException

public void close() throws IOException
Specified by:
close in interface Closeable
Overrides:
close in class OutputStream
Throws:
IOException

public void flush() throws IOException
Specified by:
flush in interface Flushable
Overrides:
flush in class OutputStream
Throws:
IOException

public final int getBlockSize()

public void write(byte[] buf, int offs, int len) throws IOException
Overrides:
write in class OutputStream
Throws:
IOException