Package org.apache.hadoop.typedbytes

Typed bytes are sequences of bytes in which the first byte is a type code. They are especially useful as a simple binary format for transferring data to and from Hadoop Streaming programs.

Type Codes

Each typed bytes sequence starts with an unsigned byte that contains the type code. Possible values are:

Code  Type
0     A sequence of bytes.
1     A byte.
2     A boolean.
3     An integer.
4     A long.
5     A float.
6     A double.
7     A string.
8     A vector.
9     A list.
10    A map.

The type codes 50 to 200 are treated as aliases for 0, and can thus be used for application-specific serialization.
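
As an illustration of the alias rule, the following Python sketch maps a type-code byte to the code that determines its layout. The function name `normalize_type_code` and the `TYPE_NAMES` table are hypothetical and not part of this package, and the sketch assumes the range 50 to 200 is inclusive at both ends.

```python
# Type codes from the table above; the short names are illustrative only.
TYPE_NAMES = {
    0: "bytes", 1: "byte", 2: "boolean", 3: "integer", 4: "long",
    5: "float", 6: "double", 7: "string", 8: "vector", 9: "list", 10: "map",
}


def normalize_type_code(code: int) -> int:
    """Return the type code whose byte layout applies to `code`."""
    if not 0 <= code <= 255:
        raise ValueError(f"type code must fit in an unsigned byte: {code}")
    if 50 <= code <= 200:
        # Assumption: both bounds are inclusive. Aliases share the layout
        # of code 0 (a length-prefixed sequence of bytes).
        return 0
    if code in TYPE_NAMES:
        return code
    raise ValueError(f"unknown type code: {code}")
```

An application could, for example, reserve code 100 for its own serialized records: a reader that does not know the application would still be able to skip over them as plain byte sequences.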

Subsequent Bytes

These are the subsequent bytes for the different type codes (everything is big-endian and unpadded):

Code  Subsequent Bytes
0     <32-bit signed integer> <as many bytes as indicated by the integer>
1     <signed byte>
2     <signed byte (0 = false and 1 = true)>
3     <32-bit signed integer>
4     <64-bit signed integer>
5     <32-bit IEEE floating point number>
6     <64-bit IEEE floating point number>
7     <32-bit signed integer> <as many UTF-8 bytes as indicated by the integer>
8     <32-bit signed integer> <as many typed bytes sequences as indicated by the integer>
9     <variable number of typed bytes sequences> <255 written as an unsigned byte>
10    <32-bit signed integer> <as many (key-value) pairs of typed bytes sequences as indicated by the integer>
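
The layouts above could be read and written from a Streaming program roughly as in the following Python sketch. It is not this package's API. The names `encode`, `decode`, and `loads` are hypothetical, as is the mapping from Python types to type codes (`tuple` to vector, `list` to list, every `float` written as a double, integers outside the 32-bit range written as longs).

```python
import io
import struct

LIST_END = 255  # unsigned byte that terminates a list (code 9)


def encode(value) -> bytes:
    """Serialize a Python value as one typed bytes sequence."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return struct.pack(">Bb", 2, 1 if value else 0)
    if isinstance(value, bytes):
        return struct.pack(">Bi", 0, len(value)) + value
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return struct.pack(">Bi", 3, value)
        return struct.pack(">Bq", 4, value)
    if isinstance(value, float):
        return struct.pack(">Bd", 6, value)  # assumption: always a double
    if isinstance(value, str):
        data = value.encode("utf-8")
        return struct.pack(">Bi", 7, len(data)) + data
    if isinstance(value, tuple):
        body = b"".join(encode(v) for v in value)
        return struct.pack(">Bi", 8, len(value)) + body
    if isinstance(value, list):
        body = b"".join(encode(v) for v in value)
        return bytes([9]) + body + bytes([LIST_END])
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return struct.pack(">Bi", 10, len(value)) + body
    raise TypeError(f"no typed bytes mapping for {type(value).__name__}")


def _read(stream, n: int) -> bytes:
    """Read exactly n bytes or fail."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated typed bytes sequence")
    return data


def _unpack(stream, fmt: str):
    """Read one big-endian, unpadded scalar described by fmt."""
    return struct.unpack(fmt, _read(stream, struct.calcsize(fmt)))[0]


def _decode_body(code: int, stream):
    """Decode the bytes that follow an already-read type code."""
    if code == 0 or 50 <= code <= 200:  # 50..200 are aliases for 0
        return _read(stream, _unpack(stream, ">i"))
    if code == 1:
        return _unpack(stream, ">b")
    if code == 2:
        return _unpack(stream, ">b") != 0
    if code == 3:
        return _unpack(stream, ">i")
    if code == 4:
        return _unpack(stream, ">q")
    if code == 5:
        return _unpack(stream, ">f")
    if code == 6:
        return _unpack(stream, ">d")
    if code == 7:
        return _read(stream, _unpack(stream, ">i")).decode("utf-8")
    if code == 8:
        return tuple(decode(stream) for _ in range(_unpack(stream, ">i")))
    if code == 9:
        items = []
        while True:
            next_code = _read(stream, 1)[0]
            if next_code == LIST_END:
                return items
            items.append(_decode_body(next_code, stream))
    if code == 10:
        result = {}
        for _ in range(_unpack(stream, ">i")):
            key = decode(stream)  # keys must decode to hashable values here
            result[key] = decode(stream)
        return result
    raise ValueError(f"unknown type code: {code}")


def decode(stream):
    """Read one typed bytes sequence from a binary stream."""
    return _decode_body(_read(stream, 1)[0], stream)


def loads(data: bytes):
    """Decode a single typed bytes sequence held in memory."""
    return decode(io.BytesIO(data))
```

Under these assumptions, `encode("hi")` yields the type code 7, the four-byte big-endian length 2, and the two UTF-8 bytes, that is `b"\x07\x00\x00\x00\x02hi"`. The `>` prefix in the `struct` format strings selects big-endian byte order with no alignment padding, which matches the "big-endian and unpadded" rule.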

Copyright © 2015 Apache Software Foundation. All rights reserved.