Apache Hadoop 0.20.204.0 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.


Added RPM/DEB packages to build system.


Test case TestHdfsProxy.testHdfsProxyInterface has been temporarily disabled for this release because it fails in the Hudson automated test environment.


Hard links are now created in batches during “upgrade” snapshots, cutting the time from approximately 8 minutes per volume to approximately 8 seconds. Validated on both Linux and Windows. Depends on prior integration of the patch for HADOOP-7133.
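
The idea of linking a whole directory in one pass, rather than issuing a separate operation per file, can be sketched as below. This is only an illustration built on the standard java.nio.file API, not the code from the patch; the class and method names are hypothetical, and it assumes the source and destination are on a file system that supports hard links.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: not the Hadoop upgrade code. Names are hypothetical.
final class BatchHardLinkSketch {

    // Hard-links every regular file directly under srcDir into dstDir in a
    // single pass and returns the number of links created.
    static int linkDirectory(Path srcDir, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        int linked = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcDir)) {
            for (Path src : entries) {
                if (Files.isRegularFile(src)) {
                    // createLink(link, existing): the new link comes first.
                    Files.createLink(dstDir.resolve(src.getFileName()), src);
                    linked++;
                }
            }
        }
        return linked;
    }
}
```

A caller would invoke linkDirectory once per directory of the volume being snapshotted, so the per-call overhead is paid per directory instead of per file.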


Fixed a race condition in writing the log index file that caused tasks to ‘fail’.


Removed duplicate chmods of the job log directory that were vulnerable to race conditions between tasks. Also improved the messages reported when symlink creation fails.


Added two new configuration parameters:

mapreduce.reduce.shuffle.catch.exception.stack.regex
mapreduce.reduce.shuffle.catch.exception.message.regex
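
These notes do not say how the two values are used. Going only by the parameter names, one plausible reading is that each holds a Java regular expression, one tested against an exception's stack trace and the other against its message. The sketch below illustrates that reading; the class, the method names, and the choice to require both patterns to match are assumptions made for illustration, not Hadoop's implementation.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.regex.Pattern;

// Illustrative only: semantics are inferred from the parameter names.
final class ShuffleExceptionMatchSketch {

    // Renders the stack trace as the text printStackTrace() would emit.
    static String stackTraceOf(Throwable t) {
        StringWriter out = new StringWriter();
        t.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }

    // Returns true when the message regex is found in the exception message
    // AND the stack regex is found in the stack trace (assumed semantics).
    static boolean matches(Throwable t, String messageRegex, String stackRegex) {
        String message = t.getMessage() == null ? "" : t.getMessage();
        boolean messageHit = Pattern.compile(messageRegex).matcher(message).find();
        // DOTALL lets "." in the stack regex span the lines of the trace.
        boolean stackHit = Pattern.compile(stackRegex, Pattern.DOTALL)
                .matcher(stackTraceOf(t)).find();
        return messageHit && stackHit;
    }
}
```

For example, matches(new java.io.IOException("Broken pipe"), "Broken pipe", "java\\.io\\.IOException") evaluates both patterns against the same exception.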


Added a new configuration option, mapreduce.reduce.shuffle.maxfetchfailures, and removed an option that is no longer used, mapred.reduce.copy.backoff.


Added mapreduce.tasktracker.distributedcache.checkperiod to the task tracker, which defines the period to wait while cleaning up the distributed cache. The default is 1 minute.
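
A minimal sketch of how a period-based setting like this might be read and applied follows. It uses java.util.Properties as a stand-in for the Hadoop configuration, and it assumes the value is expressed in milliseconds and is the interval between cleanup checks; neither assumption is confirmed by these notes, and the class and method names are hypothetical.

```java
import java.util.Properties;

// Illustrative only: Properties stands in for the real configuration object.
final class CachePeriodSketch {
    static final String KEY = "mapreduce.tasktracker.distributedcache.checkperiod";
    // Default of 1 minute, per the release note; the millisecond unit is assumed.
    static final long DEFAULT_PERIOD_MS = 60_000L;

    // Returns the configured period, or the default when the key is unset.
    static long checkPeriodMs(Properties conf) {
        return Long.parseLong(conf.getProperty(KEY, Long.toString(DEFAULT_PERIOD_MS)));
    }

    // True once at least one full period has elapsed since the last cleanup.
    static boolean cleanupDue(long lastCleanupMs, long nowMs, long periodMs) {
        return nowMs - lastCleanupMs >= periodMs;
    }
}
```

A cleanup loop would call cleanupDue with the current clock reading and skip the pass when it returns false.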