Apache Hadoop 3.3.2 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.

Added a new BlockPlacementPolicy, “AvailableSpaceRackFaultTolerantBlockPlacementPolicy”, which uses the same optimization logic as AvailableSpaceBlockPlacementPolicy while also spreading the replicas across the maximum number of racks, similar to BlockPlacementPolicyRackFaultTolerant. The policy can be configured by setting the block placement policy class to org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceRackFaultTolerantBlockPlacementPolicy.
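
A minimal sketch of that setting, assuming the policy class is selected through the “dfs.block.replicator.classname” key. The key name is not given in the note above, so check it against hdfs-default.xml for your release. java.util.Properties stands in for Hadoop's Configuration so the snippet runs without Hadoop on the classpath, and the class name PlacementPolicyConfigSketch is hypothetical.

```java
import java.util.Properties;

// Hypothetical helper; java.util.Properties stands in for Hadoop's Configuration.
final class PlacementPolicyConfigSketch {
    // Assumed key that names the NameNode block placement policy class.
    static final String POLICY_KEY = "dfs.block.replicator.classname";

    // Fully qualified class name taken from the release note.
    static final String POLICY_CLASS =
        "org.apache.hadoop.hdfs.server.blockmanagement."
            + "AvailableSpaceRackFaultTolerantBlockPlacementPolicy";

    // Builds the key/value pair that would go into the NameNode configuration.
    static Properties namenodeSettings() {
        Properties props = new Properties();
        props.setProperty(POLICY_KEY, POLICY_CLASS);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(POLICY_KEY + "=" + namenodeSettings().getProperty(POLICY_KEY));
    }
}
```

In a real deployment the same key and value would normally be placed in hdfs-site.xml on the NameNode rather than set from code.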

The dependency on HTrace and the TraceAdmin protocol/utility were removed. Tracing functionality is a no-op until an alternative tracer implementation is added.

WARNING: No release note provided for this change.

Added synchronization so that the “yarn node list” command does not fail intermittently.

Adds support for client-side encryption in AWS S3, with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before use and consider it unstable.

S3-CSE is enabled through the existing configuration option “fs.s3a.server-side-encryption-algorithm”:

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although you can still enable a default SSE option in the S3 console.

* Not compatible with S3Guard.
* Filesystem list/get status operations subtract 16 bytes from the length of all files >= 16 bytes long to compensate for the padding which CSE adds.
* The SDK always warns about the specific algorithm chosen being deprecated. It is critical to use this algorithm for ranged GET requests to work (i.e. random IO). Ignore.
* Unencrypted files CANNOT BE READ. The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.
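
The length adjustment in the list above can be sketched as follows. This only illustrates the rule as stated (subtract 16 bytes from any file at least 16 bytes long); it is not the S3A connector's actual code, and the class and method names are hypothetical.

```java
// Hypothetical helper illustrating the length adjustment described in the note.
final class CseLengthSketch {
    // Padding that S3-CSE adds to each object, per the release note.
    static final long CSE_PADDING_BYTES = 16;

    // Returns the length a list/get-status call would report for an object
    // whose stored (padded) size is storedLength.
    static long reportedLength(long storedLength) {
        return storedLength >= CSE_PADDING_BYTES
            ? storedLength - CSE_PADDING_BYTES
            : storedLength;
    }

    public static void main(String[] args) {
        System.out.println(reportedLength(1040)); // prints 1024
        System.out.println(reportedLength(16));   // prints 0
        System.out.println(reportedLength(10));   // prints 10
    }
}
```

Objects shorter than 16 bytes are reported unchanged, which follows the ">= 16 bytes" condition in the note.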

When Timeline Service V1 or V1.5 is used, if “yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.enable-batch” is set to true, the ResourceManager sends timeline events in batches. The default value is false. If this functionality is enabled, the maximum number of events published in one batch is configured by “yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.batch-size”. The default value is 1000. The interval between publishing events can be configured by “yarn.resourcemanager.system-metrics-publisher.timeline-server-v1.interval-seconds”. By default, it is set to 60 seconds.
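
To illustrate what the batch-size limit means, here is a minimal sketch that splits a queue of pending events into batches no larger than the configured size. It is not the ResourceManager's implementation: the class and method names are hypothetical, and the note does not say whether a full batch is sent immediately or only at the next interval.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of size-limited batching; not ResourceManager code.
final class TimelineBatchSketch {
    // Defaults quoted in the release note.
    static final int DEFAULT_BATCH_SIZE = 1000;
    static final int DEFAULT_INTERVAL_SECONDS = 60;

    // Splits the pending events into consecutive batches of at most batchSize items.
    static <T> List<List<T>> toBatches(List<T> events, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < events.size(); start += batchSize) {
            int end = Math.min(start + batchSize, events.size());
            batches.add(new ArrayList<>(events.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            events.add(i);
        }
        List<List<Integer>> batches = toBatches(events, DEFAULT_BATCH_SIZE);
        System.out.println(batches.size());        // prints 3
        System.out.println(batches.get(2).size()); // prints 500
    }
}
```

With the default batch size of 1000, 2500 pending events would go out as three batches of 1000, 1000, and 500 events.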