These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.
The “hadoop classpath” command has been enhanced to support options for automatic expansion of wildcards in classpath elements and writing the classpath to a jar file manifest. These options make it easier to construct a correct classpath for libhdfs applications.
The MetricsSystem abstract class has a new abstract method, unregisterSource, for unregistering a previously registered metrics source. Custom subclasses of MetricsSystem must be updated to implement this method.
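As a rough illustration of what a custom subclass now has to supply, the sketch below uses a simplified stand-in rather than the real MetricsSystem class, which declares many more abstract methods. The class names SketchMetricsSystem and InMemoryMetricsSystem, the isRegistered helper, and the map-based bookkeeping are hypothetical; unregisterSource is assumed to take the name under which the source was registered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the real MetricsSystem abstract class.
abstract class SketchMetricsSystem {
    // Registers a metrics source under a name and returns it.
    abstract <T> T register(String name, String desc, T source);

    // The newly required method: remove a previously registered source by name.
    abstract void unregisterSource(String name);
}

// Hypothetical custom subclass that tracks its sources in a map.
class InMemoryMetricsSystem extends SketchMetricsSystem {
    private final Map<String, Object> sources = new HashMap<>();

    @Override
    <T> T register(String name, String desc, T source) {
        sources.put(name, source);
        return source;
    }

    @Override
    void unregisterSource(String name) {
        // Removing an unknown name is a no-op in this sketch.
        sources.remove(name);
    }

    // Helper for inspection only; not part of the real API.
    boolean isRegistered(String name) {
        return sources.containsKey(name);
    }
}
```

A subclass that omits unregisterSource will fail to compile against the updated abstract class, which is why this change is listed as incompatible.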
Remove unnecessary synchronized blocks from Snappy/Zlib codecs.
Fixes a NullPointerException (NPE) that was thrown when bin/hadoop key was run with no arguments.
Fixes an inappropriate test of the delete functionality.
Implements the -h option for fs -count to show file sizes in human-readable format. Additionally, ContentSummary.getHeader() now returns a different string that is incompatible with previous releases.
This change enables the TCP_NODELAY flag for all Hadoop IPC connections, thereby bypassing TCP Nagling. Nagling interacts poorly with TCP delayed ACKs, especially for request-response protocols.
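For readers unfamiliar with the flag, the following minimal sketch shows how TCP_NODELAY is set on a plain JDK socket. It is not the Hadoop IPC code; the class and method names are hypothetical, and the socket is left unconnected so the example needs no server.

```java
import java.io.IOException;
import java.net.Socket;

class TcpNoDelaySketch {
    // Creates an unconnected socket with Nagle's algorithm disabled.
    static Socket newNoDelaySocket() throws IOException {
        Socket socket = new Socket();
        // TCP_NODELAY: send small writes immediately instead of coalescing them.
        socket.setTcpNoDelay(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (Socket socket = newNoDelaySocket()) {
            System.out.println("TCP_NODELAY enabled: " + socket.getTcpNoDelay());
        }
    }
}
```

With the flag set, a small request is not held back while the sender waits for an ACK that the peer is itself delaying, which is the interaction the note refers to.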
The following configuration properties are added.
dfs.client.write.byte-array-manager.enabled: for enabling/disabling the byte array manager. Default is false.
dfs.client.write.byte-array-manager.count-threshold: The count threshold for each array length so that a manager is created only after the allocation count exceeds the threshold. In other words, the particular array length is not managed until the allocation count exceeds the threshold. Default is 128.
dfs.client.write.byte-array-manager.count-limit: The maximum number of arrays allowed for each array length. Default is 2048.
dfs.client.write.byte-array-manager.count-reset-time-period-ms: The time period in milliseconds after which the allocation count for each array length is reset to zero if there has been no increment. Default is 10,000ms, i.e. 10 seconds.
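The interplay of the count threshold and the reset period can be pictured with the sketch below. It is an illustration of the semantics described in these properties, not the HDFS implementation: the class AllocationCounterSketch and its method are hypothetical, time is passed in explicitly to keep the example deterministic, and the count limit is not modeled.

```java
// Hypothetical per-array-length counter; not the actual HDFS class.
class AllocationCounterSketch {
    private final int countThreshold;   // cf. count-threshold (default 128)
    private final long resetPeriodMs;   // cf. count-reset-time-period-ms (default 10000)
    private int count;
    private long lastIncrementMs;

    AllocationCounterSketch(int countThreshold, long resetPeriodMs) {
        this.countThreshold = countThreshold;
        this.resetPeriodMs = resetPeriodMs;
    }

    // Records one allocation at time nowMs and reports whether this
    // array length should now be managed (count exceeds the threshold).
    boolean recordAllocation(long nowMs) {
        // No increment for longer than the reset period: start counting again.
        if (count > 0 && nowMs - lastIncrementMs > resetPeriodMs) {
            count = 0;
        }
        count++;
        lastIncrementMs = nowMs;
        return count > countThreshold;
    }
}
```

Under this reading, array lengths that are allocated only occasionally never cross the threshold, so no manager is created for them.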
HDFS now supports the option to configure AES encryption for block data transfer. AES offers improved cryptographic strength and performance over the prior options of 3DES and RC4.
The directory structure for finalized replicas on DataNodes has been changed. The directory in which a finalized replica is stored is now determined uniquely by its ID. Specifically, we use a two-level directory structure, with the 24th through 17th bits identifying the correct directory at the first level and the 16th through 8th bits identifying the correct directory at the second level.
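The mapping from ID to directory can be sketched as follows. This is an illustration under stated assumptions, not the DataNode code: bits are counted from 1 at the least significant end, each level is assumed to use eight bits (so the second level is read here as bits 16 through 9), and the "subdir" prefix, the class name, and the method name are hypothetical.

```java
class ReplicaDirSketch {
    // Maps a replica ID to a two-level relative directory path.
    static String idToBlockDir(long blockId) {
        // First level: bits 24..17 (assumed 1-indexed), i.e. shift by 16 and keep 8 bits.
        int d1 = (int) ((blockId >> 16) & 0xFF);
        // Second level: assumed to be the next 8 bits down, i.e. shift by 8 and keep 8 bits.
        int d2 = (int) ((blockId >> 8) & 0xFF);
        // "subdir" is an assumed directory-name prefix.
        return "subdir" + d1 + "/" + "subdir" + d2;
    }

    public static void main(String[] args) {
        System.out.println(idToBlockDir(0x123456L));
    }
}
```

Because the path depends only on the ID, a replica can be located without scanning directories, and IDs that differ only in their lowest eight bits land in the same directory.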
Allow distcp to copy data between HA clusters. Users can use a new configuration property “dfs.internal.nameservices” to explicitly specify the name services belonging to the local cluster, while continuing to use the configuration property “dfs.nameservices” to specify all the name services in the local and remote clusters.
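The relationship between the two properties can be illustrated with plain string handling: whatever appears in “dfs.nameservices” but not in “dfs.internal.nameservices” belongs to a remote cluster. The sketch below does not use the Hadoop Configuration class; the class name, method names, and the name service IDs used in the usage note are hypothetical, and the values are assumed to be comma-separated lists.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class NameservicesSketch {
    // Splits a comma-separated property value into trimmed, non-empty IDs.
    static Set<String> parse(String csv) {
        Set<String> ids = new LinkedHashSet<>();
        for (String part : csv.split(",")) {
            String id = part.trim();
            if (!id.isEmpty()) {
                ids.add(id);
            }
        }
        return ids;
    }

    // Remote name services = all name services minus the local (internal) ones.
    static Set<String> remoteNameservices(String allValue, String internalValue) {
        Set<String> remote = parse(allValue);
        remote.removeAll(parse(internalValue));
        return remote;
    }
}
```

For example, with “dfs.nameservices” set to "ns1,ns2" and “dfs.internal.nameservices” set to "ns1", this sketch treats ns2 as the remote cluster's name service.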
SASL now can be used to secure the DataTransferProtocol, which transfers file block content between HDFS clients and DataNodes. In this configuration, it is no longer required for secured clusters to start the DataNode as root and bind to privileged ports.
The libhdfs C API is now supported on Windows.
WARNING: No release note provided for this incompatible change.