These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.
ProtocolBuffer is packaged in Ubuntu
WARNING: No release note provided for this incompatible change.
Apache Curator version change: Apache Hadoop has updated the version of Apache Curator used from 2.6.0 to 2.7.1. This change should be binary and source compatible for the majority of downstream users. Notable exceptions:
Downstream users are reminded that while the Hadoop community will attempt to avoid egregious incompatible dependency changes, there is currently no policy around when Hadoop’s exposed dependencies will change across versions (ref https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Java_Classpath ).
We have reinstated support for launching Hadoop processes on Windows by using Cygwin to run the shell scripts. All processes still must have access to the native components: hadoop.dll and winutils.exe.
The following parameters are introduced in this JIRA: fs.s3a.threads.max: the maximum number of threads to allow in the pool used by TransferManager fs.s3a.threads.core: the number of threads to keep in the pool used by TransferManager fs.s3a.threads.keepalivetime: when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating fs.s3a.max.total.tasks: the maximum number of tasks that the LinkedBlockingQueue can hold
WARNING: No release note provided for this incompatible change.
Keys with uppercase names can no longer be created when using the JavaKeyStoreProvider to resolve ambiguity about case-sensitivity in the KeyStore spec.
WARNING: No release note provided for this incompatible change.
Hadoop metrics sent to Ganglia over multicast now support optional configuration of socket TTL. The default TTL is 1, which preserves the behavior of prior Hadoop versions. Clusters that span multiple subnets/VLANs will likely want to increase this.
The Hadoop Common native components now support 32-bit build targets on Windows.
Hadoop now supports integration with Azure Storage as an alternative Hadoop Compatible File System.
Added a section to BUILDING.txt on how to install required / optional packages on a clean install of Ubuntu 14.04 LTS Desktop.
Went through the CMakeLists.txt files in the repo and added the following optional library dependencies - Snappy, Bzip2, Linux FUSE and Jansson.
Updated the required packages / version numbers from the trunk branch version of BUILDING.txt.
New fs -find command
This fix moves the public class StorageType from the package org.apache.hadoop.hdfs to org.apache.hadoop.fs.
LibHDFS now supports 32-bit build targets on Windows.
This change introduces a new configuration key used to throttle decommissioning work, “dfs.namenode.decommission.blocks.per.interval”. This new key overrides and deprecates the previous related configuration key “dfs.namenode.decommission.nodes.per.interval”. The new key is intended to result in more predictable pause times while scanning decommissioning nodes.
Introduced a new configuration dfs.pipeline.ecn. When the configuration is turned on, DataNodes will signal in the writing pipelines when they are overloaded. The client can back off based on this congestion signal to avoid overloading the system.
WARNING: No release note provided for this incompatible change.
WARNING: No release note provided for this incompatible change.
WARNING: No release note provided for this incompatible change.
Add a feature for replica pinning so that when a replica is pinned in a datanode, it will not be moved by Balancer/Mover. The replica pinning feature can be enabled/disabled by “dfs.datanode.block-pinning.enabled”, where the default is false.
This merges Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant. Hard-coded literals of “blk_” in various files are also updated to use the same constant.
Based on the reconfiguration framework provided by HADOOP-7001, enable reconfigure the dfs.datanode.data.dir and add new volumes into service.
This introduces two new MR2 job configs, mentioned below, which allow users to control the maximum simultaneously-running tasks of the submitted job, across the cluster:
This is controllable at a per-job level.
Removed commons-httpclient dependency from hadoop-yarn-server-web-proxy module.
Applications which made use of the LogAggregationContext in their application will need to revisit this code in order to make sure that their logs continue to get rolled out.