Apache Hadoop 3.0.0-beta1 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.

Random access and seek improvements for the wasb:// (Azure) file system.

This fixes the LevelDB state store for the NodeManager. As of this patch, the state store versions now correspond to the following table.

In Hadoop common, fatal log level is changed to error because slf4j API does not support fatal log level.

WASB now includes the current Apache Hadoop version in the User-Agent string passed to Azure Blob service. Users also may include optional additional information to identify their application. See the documentation of configuration property fs.wasb.user.agent.id for further details.

Added org.apache.hadoop.yarn.webapp.hamlet2 package without _ as a one-character identifier. Please use this package instead of org.apache.hadoop.yarn.webapp.hamlet.

The metrics and MBeans introduced in HDFS-10999 have been renamed for brevity and clarity.

This patch changes how usage output is generated to now require a sub-command type. This allows users to see who the intended audience for a command is or it is a daemon.

FileStatus and FsPermission Writable serialization is deprecated and its implementation (incompatibly) replaced with protocol buffers. The FsPermissionProto record moved from hdfs.proto to acl.proto. HdfsFileStatus is now a subtype of FileStatus. FsPermissionExtension with its associated flags for ACLs, encryption, and erasure coding has been deprecated; users should query these attributes on the FileStatus object directly. The FsPermission instance in AclStatus no longer retains or reports these extended attributes (likely unused).

Bug fix to Azure Filesystem related to HADOOP-14535.

commons-logging dependency was removed from hadoop-yarn-server-applicationhistoryservice. If you rely on the transitive commons-logging dependency, please define the dependency explicitly.

Bug fix to Azure Filesystem related to HADOOP-14535

The size of the TCP socket buffers are no longer hardcoded by default. Instead the OS now will automatically tune the size for the buffer.

HDFS-6962 introduced POSIX ACL inheritance feature but it is disable by default. Now enable the feature by default. Please be aware any code expecting the old ACL inheritance behavior will have to be updated. Please see the HDFS Permissions Guide for further details.

Enables mapreduce.job.finish-when-all-reducers-done by default. With this enabled, a MapReduce job will complete as soon as all of its reducers are complete, even if some mappers are still running. This can occur if a mapper was relaunched after node failure but the relaunched task’s output is not actually needed. Previously the job would wait for all mappers to complete.

Configuration.dumpConfiguration no longer prints out the clear text values for the sensitive keys listed in hadoop.security.sensitive-config-keys. Callers can override the default list of sensitive keys either to redact more keys or print the clear text values for a few extra keys for debugging purpose.

New patch with changes in maven dependency. (removed apache xcerces as dependency)

The deprecated FileStatus::isDir method has been marked as final. FileSystems should override FileStatus::isDirectory

Up to 34% throughput improvement for the wasb:// (Azure) file system when fs.azure.selfthrottling.enable is false fs.azure.autothrottling.enable is true.

Recursive directory delete improvement for the wasb filesystem.

ResourceManager will now record ResourceRequests from different attempts into different objects.

The cell size of the provided HDFS erasure coding policies has been changed from 64k to 1024k for better performance. The policy names have all been changed accordingly, i.e. RS-6-3.1024k.

hdfs ec -listPolicies now lists enabled, disabled, and removed policies, rather than just enabled policies.

This adds some new job counters, the number of failed MAP/REDUCE tasks and the number of killed MAP/REDUCE tasks.

We are releasing the alpha2 version of a major revision of YARN Timeline Service: v.2. YARN Timeline Service v.2 addresses two major challenges: improving scalability and reliability of Timeline Service, and enhancing usability by introducing flows and aggregation.

YARN Timeline Service v.2 alpha1 was introduced in 3.0.0-alpha1 via YARN-2928.

YARN Timeline Service v.2 alpha2 is now being provided so that users and developers can test it and provide feedback and suggestions for making it a ready replacement for Timeline Service v.1.x. Security is provided via Kerberos Authentication and delegation tokens. There is also a simple read level authorization provided via whitelists.

Some of the notable improvements since alpha-1 are: - Security via Kerberos Authentication and delegation tokens - Read side simple authorization via whitelist - Client configurable entity sort ordering - New REST APIs for apps, app attempts, containers, fetching metrics by timerange, pagination, sub-app entities - Support for storing sub-application entities (entities that exist outside the scope of an application) - Configurable TTLs (time-to-live) for tables, configurable table prefixes, configurable hbase cluster - Flow level aggregations done as dynamic (table level) coprocessors - Uses latest stable HBase release 1.2.6

More details are available in the YARN Timeline Service v.2 documentation.

Additionally, this patch updates maven-site-plugin to 3.6 and doxia-module-markdown to 1.8-SNAPSHOT to work around problems with unknown schemas in URLs in markdown formatted content.

S3Guard (pronounced see-guard) is a new feature for the S3A connector to Amazon S3, which uses DynamoDB for a high performance and consistent metadata repository. Essentially: S3Guard caches directory information, so your S3A clients get faster lookups and resilience to inconsistency between S3 list operations and the status of objects. When files are created, with S3Guard, they’ll always be found.

S3Guard does not address update consistency: if a file is updated, while the directory information will be updated, calling open() on the path may still return the old data. Similarly, deleted objects may also potentially be opened.

Please consult the S3Guard documentation in the Amazon S3 section of our documentation.

Note: part of this update includes moving to a new version of the AWS SDK 1.11, one which includes the Dynamo DB client and its a shaded version of Jackson 2. The large aws-sdk-bundle JAR is needed to use the S3A client with or without S3Guard enabled. The good news: because Jackson is shaded, there will be no conflict between any Jackson version used in your application and that which the AWS SDK needs.

maven-site-plugin is no longer called directly at package phase.

NameNode now audit-logs getDelegationToken, renewDelegationToken, cancelDelegationToken.

This renames ClientProtocol#getECBlockGroupsStats to ClientProtocol#getEcBlockGroupStats and ClientProtocol#getBlockStats to ClientProtocol#getReplicatedBlockStats. The return-type classes have also been similarly renamed. Their fields have also been renamed to drop the “stats” suffix.

Additionally, ECBlockGroupStats#pendingDeletionBlockGroups has been renamed to ECBlockGroupStats#pendingDeletionBlocks, to clarify that this is the number of blocks, not block groups, pending deletion. The corresponding NameNode metric has also been renamed to PendingDeletionECBlocks.

S3A now defaults to using the “v2” S3 list API, which speeds up large-scale path listings. Non-AWS S3 implementations may not support this API: consult the S3A documentation on how to revert to the v1 API.

AMRMClient#waitFor and AMRMClientAsync#waitFor now take java.util.function.Supplier as an argument, rather than com.google.common.base.Supplier.

Block Compaction for Azure Block Blobs. When the number of blocks in a block blob is above 32000, the process of compaction replaces a sequence of small blocks with with one big block.

Changed {{stripedReadPool}} to unbounded cachedThreadPool. User should combine {{dfs.datanode.ec.reconstruction.stripedblock.threads}} and {{dfs.namenode.replication.max-streams}} to tune recovery performance.

dfs.namenode.ec.policies.enabled was removed in order to ensure there is only one approach to enable/disable erasure coding policies to avoid sync up.

Config key dfs.datanode.ec.reconstruction.stripedblock.threads.size has been renamed to dfs.datanode.ec.reconstruction.threads.

* The s3n:// client has been removed. Please upgrade to the s3a:// client. * The s3a’s original output stream has been removed, the “fast” output stream is the sole option available. There is no need to explicitly enable this, and trying to disable it (fs.s3a.fast.upload=false) will have no effect.

Persist all built-in erasure coding policies and user defined erasure coding policies into NameNode fsImage and editlog reliably, so that all erasure coding policies remain consistent after NameNode restart.

See RN in HDFS-7859 as part of the work.

CMake v3.1.0 is now the minimum version required to build Apache Hadoop’s native components.

Added new configuration “dfs.client.block.write.replace-datanode-on-failure.min-replication”.

The minimum number of replications that are needed to not to fail
  the write pipeline if new datanodes can not be found to replace
  failed datanodes (could be due to network failure) in the write pipeline.
  If the number of the remaining datanodes in the write pipeline is greater
  than or equal to this property value, continue writing to the remaining nodes.
  Otherwise throw exception.

  If this is set to 0, an exception will be thrown, when a replacement
  can not be found.

HdfsAdmin#addErasureCodingPolicies now returns an AddErasureCodingPolicyResponse[] rather than AddECPolicyResponse[]. The corresponding RPC definition and PB message have also been renamed.

This allows users to: * develop and plugin their own erasure codec and coders. The plugin will be loaded automatically from hadoop jars, the corresponding codec and coder will be registered for runtime use. * define their own erasure coding policies thru an xml file and CLI command. The added policies will be persisted into fsimage.

A federation-based approach to transparently scale a single YARN cluster to tens of thousands of nodes, by federating multiple YARN standalone clusters (sub-clusters). The applications running in this federated environment will see a single massive YARN cluster and will be able to schedule tasks on any node of the federated cluster. Under the hood, the federation system will negotiate with sub-clusters ResourceManagers and provide resources to the application. The goal is to allow an individual job to “span” sub-clusters seamlessly.

This extends the centralized YARN RM in to enable the scheduling of OPPORTUNISTIC containers in a centralized fashion. This way, users can use OPPORTUNISTIC containers to improve the cluster’s utilization, without needing to enable distributed scheduling.