Apache Hadoop 0.20.204.0 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.


Batch hardlinking during “upgrade” snapshots, cutting time from aprx 8 minutes per volume to aprx 8 seconds. Validated in both Linux and Windows. Depends on prior integration with patch for HADOOP-7133.


Added mapreduce.tasktracker.distributedcache.checkperiod to the task tracker that defined the period to wait while cleaning up the distributed cache. The default is 1 min.


Added RPM/DEB packages to build system.


Added a new configuration option: mapreduce.reduce.shuffle.maxfetchfailures, and removed a no longer used option: mapred.reduce.copy.backoff.


Added 2 new config parameters:

mapreduce.reduce.shuffle.catch.exception.stack.regex mapreduce.reduce.shuffle.catch.exception.message.regex


Test case TestHdfsProxy.testHdfsProxyInterface has been temporarily disabled for this release, due to failure in the Hudson automated test environment.


Fixed a race condition in writing the log index file that caused tasks to ‘fail’.


Removed duplicate chmods of job log dir that were vulnerable to race conditions between tasks. Also improved the messages when the symlinks failed to be created.