Hadoop 2.6.4 Release Notes
These release notes include new developer and user-facing incompatibilities, features, and major improvements.
Changes since Hadoop 2.6.3
- YARN-4598.
Major bug reported by tangshangwen and fixed by tangshangwen (nodemanager)
Invalid event: RESOURCE_FAILED at CONTAINER_CLEANEDUP_AFTER_KILL
- YARN-4581.
Major bug reported by sandflee and fixed by sandflee (resourcemanager)
AHS writer thread leak makes RM crash while RM is recovering
- YARN-4546.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (resourcemanager)
ResourceManager crash due to scheduling opportunity overflow
- YARN-4452.
Critical bug reported by Naganarasimha G R and fixed by Naganarasimha G R
NPE when submitting an unmanaged application
- YARN-4414.
Major bug reported by Jason Lowe and fixed by Chang Li (nodemanager)
Nodemanager connection errors are retried at multiple levels
- YARN-4380.
Major bug reported by Tsuyoshi Ozawa and fixed by Varun Saxena (test)
TestResourceLocalizationService.testDownloadingResourcesOnContainerKill fails intermittently
- YARN-4354.
Blocker bug reported by Jason Lowe and fixed by Jason Lowe (nodemanager)
Public resource localization fails with NPE
- YARN-4180.
Critical bug reported by Anubhav Dhoot and fixed by Anubhav Dhoot (resourcemanager)
AMLauncher does not retry on failures when talking to NM
- YARN-3893.
Critical sub-task reported by Bibin A Chundatt and fixed by Bibin A Chundatt (resourcemanager)
Both RMs in active state when Admin#transitionToActive fails because of a failure in refreshAll()
- YARN-3857.
Critical bug reported by mujunchao and fixed by mujunchao (resourcemanager)
Memory leak in ResourceManager with SIMPLE mode
- YARN-3849.
Critical bug reported by Sunil G and fixed by Sunil G (capacityscheduler)
Excessive preemption activity causes continuous killing of containers across queues
- YARN-3842.
Critical bug reported by Karthik Kambatla and fixed by Robert Kanter
NMProxy should retry on NMNotYetReadyException
- YARN-3697.
Critical bug reported by zhihai xu and fixed by zhihai xu (fairscheduler)
FairScheduler: ContinuousSchedulingThread can fail to shutdown
- YARN-3695.
Major bug reported by Junping Du and fixed by Raju Bairishetti
ServerProxy (NMProxy, etc.) shouldn't retry forever on non-network exceptions
- YARN-3535.
Critical bug reported by Peng Zhang and fixed by Peng Zhang (capacityscheduler, fairscheduler, resourcemanager)
Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
- YARN-3154.
Blocker sub-task reported by Xuan Gong and fixed by Xuan Gong (nodemanager, resourcemanager)
Should not upload partial logs for MR jobs or other "short-running" applications
Applications that use LogAggregationContext will need to revisit that code to make sure their logs continue to be rolled out.
- YARN-2975.
Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla
FSLeafQueue app lists are accessed without required locks
- YARN-2902.
Major sub-task reported by Jason Lowe and fixed by Varun Saxena (nodemanager)
Killing a container that is localizing can orphan resources in the DOWNLOADING state
- MAPREDUCE-6621.
Major bug reported by Xuan Gong and fixed by Xuan Gong
Memory Leak in JobClient#submitJobInternal()
- MAPREDUCE-6619.
Major bug reported by shanyu zhao and fixed by Junping Du (mrv2)
HADOOP_CLASSPATH is overwritten in MR container
- MAPREDUCE-6618.
Major bug reported by Xuan Gong and fixed by Xuan Gong
YarnClientProtocolProvider leaking the YarnClient thread.
- MAPREDUCE-6577.
Critical bug reported by Sangjin Lee and fixed by Sangjin Lee (mr-am)
MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
- MAPREDUCE-6554.
Critical bug reported by Bibin A Chundatt and fixed by Bibin A Chundatt
MRAppMaster serviceStart fails with NPE in MRAppMaster#parsePreviousJobHistory
- MAPREDUCE-6492.
Critical bug reported by Bibin A Chundatt and fixed by Bibin A Chundatt
AsyncDispatcher exits with NPE in TaskAttemptImpl#sendJHStartEventForAssignedFailTask
- MAPREDUCE-6436.
Blocker improvement reported by Ryu Kobayashi and fixed by Kai Sasaki
JobHistory cache issue
- MAPREDUCE-6363.
Critical bug reported by Brahma Reddy Battula and fixed by Bibin A Chundatt (benchmarks)
[NNBench] Lease mismatch error when running with multiple mappers
- MAPREDUCE-5982.
Major bug reported by Jason Lowe and fixed by Chang Li (mr-am)
Task attempts that fail from the ASSIGNED state can disappear
- HDFS-9600.
Critical bug reported by Phil Yang and fixed by Phil Yang
Do not check replication if the block is under construction
- HDFS-9574.
Major bug reported by Kihwal Lee and fixed by Kihwal Lee
Reduce client failures during datanode restart
- HDFS-9445.
Blocker bug reported by Kihwal Lee and fixed by Walter Su
Datanode may deadlock while handling a bad volume
- HDFS-9415.
Major improvement reported by Arpit Agarwal and fixed by Xiaobing Zhou (documentation)
Document dfs.cluster.administrators and dfs.permissions.superusergroup
- HDFS-9314.
Major improvement reported by Ming Ma and fixed by Xiao Chen
Improve BlockPlacementPolicyDefault's picking of excess replicas
- HDFS-9313.
Major bug reported by Ming Ma and fixed by Ming Ma
Possible NullPointerException in BlockManager if no excess replica can be chosen
- HDFS-9294.
Blocker bug reported by DENG FEI and fixed by Brahma Reddy Battula (hdfs-client)
DFSClient deadlocks when closing a file and failing to renew the lease
- HDFS-9220.
Blocker bug reported by Bogdan Raducanu and fixed by Jing Zhao
Reading a small file (< 512 bytes) that is open for append fails due to incorrect checksum
- HDFS-9178.
Critical bug reported by Kihwal Lee and fixed by Kihwal Lee
Slow datanode I/O can cause a wrong node to be marked bad
- HDFS-8767.
Critical bug reported by Haohui Mai and fixed by Kanaka Kumar Avvaru
RawLocalFileSystem.listStatus() returns null for a UNIX pipe file
- HDFS-8722.
Critical improvement reported by Kihwal Lee and fixed by Kihwal Lee
Optimize datanode writes for small writes and flushes
- HDFS-8647.
Major improvement reported by Ming Ma and fixed by Brahma Reddy Battula
Abstract BlockManager's rack policy into BlockPlacementPolicy
- HDFS-7694.
Major improvement reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe
FSDataInputStream should support "unbuffer"
- HDFS-6945.
Critical bug reported by Akira AJISAKA and fixed by Akira AJISAKA (namenode)
BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed
- HDFS-4660.
Blocker bug reported by Peng Zhang and fixed by Kihwal Lee (datanode)
Block corruption can happen during pipeline recovery
- HADOOP-12736.
Major test reported by Xiao Chen and fixed by Xiao Chen
TestTimedOutTestsListener#testThreadDumpAndDeadlocks sometimes times out
- HADOOP-12706.
Major bug reported by Jason Lowe and fixed by Sangjin Lee (test)
TestLocalFsFCStatistics#testStatisticsThreadLocalDataCleanUp times out occasionally
- HADOOP-12107.
Critical bug reported by Sangjin Lee and fixed by Sangjin Lee (fs)
Long-running apps may have a huge number of StatisticsData instances under FileSystem
- HADOOP-11252.
Critical bug reported by Wilfred Spiegelenburg and fixed by Masatake Iwasaki (ipc)
RPC client does not time out by default