Apache Hadoop 0.20.205.0 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.


Fixed hadoop-setup-conf.sh to put proxy user in core-site.xml. (Arpit Gupta via Eric Yang)


Added parameter for HBase user to setup config script. (Arpit Gupta via Eric Yang)


Removed unnecessary security logger configuration. (Eric Yang)


Fixed recursive sourcing of HADOOP_OPTS environment variables. (Arpit Gupta via Eric Yang)


Fixed hadoop-setup-conf.sh to handle config file consistently. (Eric Yang)


Added toggle for dfs.support.append, webhdfs and hadoop proxy user to setup config script. (Arpit Gupta via Eric Yang)


Fixed conflict uid for install packages. (Eric Yang)


Added init.d script for jobhistory server and secondary namenode. (Eric Yang)


HADOOP-7681. Fixed security and hdfs audit log4j properties. (Arpit Gupta via Eric Yang)


Committed to trunk and v23, since code reviewed by Eric.


Set hdfs uid, mapred uid, and hadoop gid to fixed numbers (201, 202, and 123, respectively).


Added support for Kerberos HTTP SPNEGO authentication to the Hadoop web consoles.


Give a meaningful error message instead of a NullPointerException (NPE).


Added a conf property dfs.webhdfs.enabled for enabling/disabling webhdfs.


Added two new conf properties dfs.web.authentication.kerberos.principal and dfs.web.authentication.kerberos.keytab for the SPNEGO servlet filter.
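
As a rough illustration of how the WebHDFS and SPNEGO properties named above might be written out together, the following sketch generates a Hadoop-style configuration fragment. The property names come from these notes; the principal, the keytab path, and the idea that they are placed in hdfs-site.xml are assumptions made for the example, not values from the release.

```python
import xml.etree.ElementTree as ET


def build_configuration(properties):
    """Return a <configuration> element holding one <property> per dict entry."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


# Property names are taken from the notes; every value is a placeholder.
webhdfs_properties = {
    "dfs.webhdfs.enabled": "true",
    "dfs.web.authentication.kerberos.principal": "HTTP/host.example.com@EXAMPLE.COM",
    "dfs.web.authentication.kerberos.keytab": "/etc/security/keytabs/spnego.keytab",
}

xml_text = ET.tostring(build_configuration(webhdfs_properties), encoding="unicode")
print(xml_text)
```

The printed fragment can be compared against an existing site configuration file before anything is merged by hand.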


Added a new dfsadmin command, [-setBalancerBandwidth <bandwidth>], where bandwidth is the maximum network bandwidth, in bytes per second, that the balancer is allowed to use on each datanode during balancing.

This is an incompatible change in 0.23. The versions of ClientProtocol and DatanodeProtocol are changed.
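
Because the bandwidth argument is given in bytes per second, a small conversion helper can reduce mistakes when thinking in megabytes. The sketch below only builds the argument list; it assumes the command is launched as "hadoop dfsadmin" and treats one megabyte as 1024 * 1024 bytes, both of which are choices made for the example.

```python
def balancer_bandwidth_args(megabytes_per_second):
    """Build a dfsadmin argument list for a bandwidth given in MB/s."""
    if megabytes_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    # The command expects bytes per second, so convert before formatting.
    bytes_per_second = int(megabytes_per_second * 1024 * 1024)
    return ["hadoop", "dfsadmin", "-setBalancerBandwidth", str(bytes_per_second)]


args = balancer_bandwidth_args(10)
print(" ".join(args))
```

The resulting list can be handed to a process launcher such as subprocess.run on a host where the hadoop command is available.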


Changed the recoverLease API to return whether the file is closed. It also changes the semantics of recoverLease to start lease recovery immediately.
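
Since recoverLease now reports whether the file is closed, a caller can poll it until recovery finishes. The following is a minimal sketch of that pattern, not the client API itself: recover_lease is a hypothetical callable standing in for the real call, and the attempt count and delay are arbitrary.

```python
import time


def wait_until_closed(recover_lease, path, attempts=5, delay_seconds=1.0,
                      sleep=time.sleep):
    """Call recover_lease(path) until it reports that the file is closed.

    recover_lease is a hypothetical stand-in for the client call; it must
    return True once the file is closed and False otherwise.
    """
    for attempt in range(attempts):
        if recover_lease(path):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give lease recovery time to complete
    return False


# Stand-in client used only for this example: reports "closed" on the third call.
calls = []


def fake_recover_lease(path):
    calls.append(path)
    return len(calls) >= 3


closed = wait_until_closed(fake_recover_lease, "/user/example/file.log",
                           sleep=lambda seconds: None)
```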


WARNING: No release note provided for this incompatible change.


Removed inheritance of certain server environment variables (HADOOP_OPTS and HADOOP_ROOT_LOGGER) in task attempt process.
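
The effect of this change can be pictured as filtering the server environment before a task attempt process is started. The sketch below shows only that filtering step under simplified assumptions; the helper name and the sample values are hypothetical, and the real TaskTracker logic may differ.

```python
# Variables named in the note above as no longer inherited by task attempts.
NOT_INHERITED = ("HADOOP_OPTS", "HADOOP_ROOT_LOGGER")


def task_environment(server_env):
    """Return a copy of server_env without the non-inherited variables."""
    return {name: value for name, value in server_env.items()
            if name not in NOT_INHERITED}


# Sample server environment; all values are placeholders.
server_env = {
    "HADOOP_OPTS": "-Xmx2g",
    "HADOOP_ROOT_LOGGER": "INFO,DRFA",
    "JAVA_HOME": "/usr/lib/jvm/java",
}
child_env = task_environment(server_env)
```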


Fixed the contrib/vaidya/bin/vaidya.sh script to use the appropriate jars and classpath.


Adds cumulative cpu usage and total heap usage to task counters. This is a backport of MAPREDUCE-220 and MAPREDUCE-2469.


Generalizes token renewal and canceling to a common interface and provides a plugin interface for adding renewers for new kinds of tokens. Hftp now stores its tokens as HFTP and renews them over HTTP.
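
The idea of one renewer per kind of token can be sketched as a simple lookup by kind. Everything in the example below is hypothetical and illustrative: the class names, method names, and the dictionary used as a token are not the Hadoop interface, and the toy renewer does no network I/O.

```python
class TokenRenewer:
    """Hypothetical plugin interface; names here are illustrative only."""

    def handles_kind(self, kind):
        raise NotImplementedError

    def renew(self, token):
        raise NotImplementedError

    def cancel(self, token):
        raise NotImplementedError


class HftpRenewer(TokenRenewer):
    """Toy renewer for tokens of kind "HFTP"; a real one would renew over HTTP."""

    def handles_kind(self, kind):
        return kind == "HFTP"

    def renew(self, token):
        token["expires"] += 3600  # placeholder lifetime extension, in seconds
        return token["expires"]

    def cancel(self, token):
        token["cancelled"] = True


def find_renewer(renewers, kind):
    """Return the first registered renewer that handles the given token kind."""
    for renewer in renewers:
        if renewer.handles_kind(kind):
            return renewer
    raise LookupError("no renewer registered for token kind %r" % kind)


renewers = [HftpRenewer()]
token = {"kind": "HFTP", "expires": 1000, "cancelled": False}
new_expiry = find_renewer(renewers, token["kind"]).renew(token)
```

Supporting a new kind of token then means adding another renewer to the list rather than changing the caller.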


Added config option mapreduce.tasktracker.cache.local.keep.pct to the TaskTracker. It is the target percentage of the local distributed cache that should be kept between garbage collection runs. In practice, the TaskTracker deletes unused distributed cache entries in LRU order until the size of the cache is less than mapreduce.tasktracker.cache.local.keep.pct of the maximum cache size. The value is a floating-point number between 0.0 and 1.0. The default is 0.95.
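
The cleanup rule described above can be sketched as a selection over cache entries. This is a simplified model rather than the TaskTracker code: the tuple layout, the function name, and the sample sizes are assumptions made for the example.

```python
def entries_to_delete(entries, max_cache_size, keep_pct=0.95):
    """Pick unused cache entries to delete, least recently used first.

    entries: list of (name, size_bytes, last_used, in_use) tuples.
    Deletion stops once the total size is below keep_pct * max_cache_size.
    """
    if not 0.0 <= keep_pct <= 1.0:
        raise ValueError("keep_pct must be between 0.0 and 1.0")
    target = keep_pct * max_cache_size
    total = sum(size for _, size, _, _ in entries)
    # Only entries that are not in use are candidates, oldest access first.
    unused = sorted((e for e in entries if not e[3]), key=lambda e: e[2])
    deleted = []
    for name, size, _, _ in unused:
        if total < target:
            break
        deleted.append(name)
        total -= size
    return deleted


# Sample cache of 1000 bytes, completely full; entry "b" is still in use.
cache = [
    ("a", 400, 1, False),
    ("b", 300, 2, True),
    ("c", 200, 3, False),
    ("d", 100, 4, False),
]
default_run = entries_to_delete(cache, max_cache_size=1000)
aggressive_run = entries_to_delete(cache, max_cache_size=1000, keep_pct=0.5)
```

With the default of 0.95, removing the oldest unused entry is enough to get under the target in this sample; a lower value makes each cleanup run remove more.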


I just committed this. Thanks Anupam!