Apache Hadoop 3.0.0-alpha1 Release Notes

These release notes cover new developer and user-facing incompatibilities, important issues, features, and major improvements.

Change default namespace quota of root directory from Integer.MAX_VALUE to Long.MAX_VALUE.

Remove the deprecated FSDataOutputStream constructor, FSDataOutputStream.sync() and Syncable.sync().

Remove the deprecated DFSOutputStream.sync() method.

Documented that the “fs -getmerge” shell command may not work properly over non HDFS-filesystem implementations due to platform-varying file list ordering.

test-patch.sh adds a new option “–build-native”. When set to false native components are not built. When set to true native components are built. The default value is true.

This change affects wire-compatibility of the NameNode/DataNode heartbeat protocol. Only present in 3.0.0-alpha1. It has been reverted before 3.0.0-alpha2

Support for hftp and hsftp has been removed. They have superseded by webhdfs and swebhdfs.

Deprecated and unused classes in the org.apache.hadoop.record package have been removed from hadoop-streaming.

Appends in HDFS can no longer be disabled.

The classes in org.apache.hadoop.record are moved from hadoop-common to a new hadoop-streaming artifact within the hadoop-tools module.

The Hadoop shell scripts have been rewritten to fix many long standing bugs and include some new features. While an eye has been kept towards compatibility, some changes may break existing installations.





This changes the output of the ‘hadoop version’ command to generically say ‘Source code repository’ rather than specify which type of repo.

Fix a typo. If a configuration is set through program, the source of the configuration is set to ‘programmatically’ instead of ‘programatically’ now.

Adds a native implementation of the map output collector. The native library will build automatically with -Pnative. Users may choose the new collector on a job-by-job basis by setting mapreduce.job.map.output.collector.class=org.apache.hadoop.mapred. nativetask.NativeMapOutputCollectorDelegator in their job configuration. For shuffle-intensive jobs this may provide speed-ups of 30% or more.

org.apache.hadoop.fs.permission.AccessControlException was deprecated in the last major release, and has been removed in favor of org.apache.hadoop.security.AccessControlException

.hadooprc allows users a convenient way to set and/or override the shell level settings.

The memory values for mapreduce.map/reduce.memory.mb keys, if left to their default values of -1, will now be automatically inferred from the heap size value system property (-Xmx) specified for mapreduce.map/reduce.java.opts keys.

The converse is also done, i.e. if mapreduce.map/reduce.memory.mb values are specified, but no -Xmx is supplied for mapreduce.map/reduce.java.opts keys, then the -Xmx value will be derived from the former’s value.

If neither is specified, then a default value of 1024 MB gets used.

For both these conversions, a scaling factor specified by property mapreduce.job.heap.memory-mb.ratio is used, to account for overheads between heap usage vs. actual physical memory usage.

Existing configs or job code that already specify both the set of properties explicitly would not be affected by this inferring change.

The user ‘yarn’ is no longer allowed to run tasks for security reasons.

Old New

Added -v option to fs -count command to display a header record in the report.

Support for shell profiles has been added. They allow for easy integration of additional functionality, classpaths, and more from inside the bash scripts rather than relying upon modifying hadoop-env.sh, etc. See the Unix Shell Guide for more information.

Options to sort output of fs -ls comment: -t (mtime), -S (size), -u (atime), -r (reverse)

The hadoop kerbname subcommand has been added to ease operational pain in determining the output of auth_to_local rules.

This deprecates the following environment variables:

Old New

Prior to this change, distcp had hard-coded values for memory usage. Now distcp will honor memory settings in a way compatible with the rest of MapReduce.

The output of du has now been made more Unix-like, with aligned output.

Remove “downgrade” from “namenode -rollingUpgrade” startup option since it may incorrectly finalize an ongoing rolling upgrade.

The output format of hadoop fs -du has been changed. It shows not only the file size but also the raw disk usage including the replication factor.

Use low latency TCP connections for hadoop IPC

Jars in the various subproject lib directories are now de-duplicated against Hadoop common. Users who interact directly with those directories must be sure to pull in common’s dependencies as well.

Applications which made use of the LogAggregationContext in their application will need to revisit this code in order to make sure that their logs continue to get rolled out.

Add posixGroups support for LDAP groups mapping service. The change in LDAPGroupMapping is compatible with previous scenario. In LDAP, the group mapping between {{posixAccount}} and {{posixGroup}} is different from the general LDAPGroupMapping, one of the differences is the {{“memberUid”}} will be used to mapping {{posixAccount}} and {{posixGroup}}. The feature will handle the mapping in internal when configuration {{hadoop.security.group.mapping.ldap.search.filter.user}} is set as “posixAccount” and {{hadoop.security.group.mapping.ldap.search.filter.group}} is “posixGroup”.

Now “mapred job -list” command displays the Job Name as well.

FairScheduler does not allow queue names with leading or tailing spaces or empty sub-queue names anymore.

WebHDFS is mandatory and cannot be disabled.

Stopping the namenode on secure systems now requires the user be authenticated.

Python is now required to build the documentation.

Fixed a bug where the StandbyNameNode’s TransactionsSinceLastCheckpoint metric may slide into a negative number after every subsequent checkpoint.

Add support for aarch64 CRC instructions

Adding support for using the ‘tc’ tool in batch mode via container-executor. This is a prerequisite for traffic-shaping functionality that is necessary to support outbound bandwidth as a resource in YARN.

Now auto-downloads patch from issue-id; fixed race conditions; fixed bug affecting some patches.

The current cgroups implementation is closely tied to supporting CPU as a resource . This patch separates out CGroups implementation into a reusable class as well as provides a simple ResourceHandler subsystem that will enable us to add support for new resource types on the NM - e.g Network, Disk etc.

NameNode and DataNode now abort during startup if attempting to run in secure mode, but block access tokens are not enabled by setting configuration property dfs.block.access.token.enable to true in hdfs-site.xml. Previously, this case logged a warning, because this would be an insecure configuration.

ResourceManager renews delegation tokens for applications. This behavior has been changed to renew tokens only if the token’s renewer is a non-empty string. MapReduce jobs can instruct ResourceManager to skip renewal of tokens obtained from certain hosts by specifying the hosts with configuration mapreduce.job.hdfs-servers.token-renewal.exclude=<host1>,<host2>,..,<hostN>.

1) A TrafficController class that provides an implementation for traffic shaping using tc. 2) A ResourceHandler implementation for OutboundBandwidth as a resource - isolation/enforcement using cgroups and tc.

io.native.lib.available was removed. Always use native libraries if they exist.

Includes a docker based solution for setting up a build environment with minimal effort.

The patch improves the reporting around missing blocks and corrupted blocks.

  1. A block is missing if and only if all DNs of its expected replicas are dead.
  2. A block is corrupted if and only if all its available replicas are corrupted. So if a block has 3 replicas; one of the DN is dead, the other two replicas are corrupted; it will be marked as corrupted.
  3. A new line is added to fsck output to display the corrupt block size per file.
  4. A new line is added to fsck output to display the number of missing blocks in the summary section.

Use today instead of ‘Unreleased’ in releasedocmaker.py when –usetoday is given as an option.

Non-HA rollback steps have been changed. Run the rollback command on the namenode (`bin/hdfs namenode -rollback`) before starting cluster with ‘-rollback’ option using (sbin/start-dfs.sh -rollback).

A partitioner is now only created if there are multiple reducers.

Remove -finalize option from hdfs namenode command.

Specific HDFS ops can be selectively excluded from audit logging via ‘dfs.namenode.audit.log.debug.cmdlist’ configuration.

This change requires setting the dfs.datanode.max.locked.memory configuration key to use the HDFS Lazy Persist feature. Its value limits the combined off-heap memory for blocks in RAM via caching and lazy persist writes.

Users may need special attention for this change while upgrading to this version. Previously user could call some APIs(example: setReplication) wrongly even after closing the fs object. With this change DFS client will not allow any operations to call on closed fs objects. As calling fs operations on closed fs is not right thing to do, users need to correct the usage if any.

Removed DistCpV1 and Logalyzer.

The Client#call() methods that are deprecated since 0.23 have been removed.

Modifying key methods in ContainerExecutor to use context objects instead of an argument list. This is more extensible and less brittle.

Fix FairScheduler’s REST api returns a missing ‘[’ blacket JSON for childQueues.

The FSConstants class has been deprecated since 0.23 and it is removed in the release.

mapreduce.fileoutputcommitter.algorithm.version now defaults to 2.

In algorithm version 1:

  1. commitTask renames directory $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to $joboutput/_temporary/$appAttemptID/$taskID/

  2. recoverTask renames $joboutput/_temporary/$appAttemptID/$taskID/ to $joboutput/_temporary/($appAttemptID + 1)/$taskID/

  3. commitJob merges every task output file in $joboutput/_temporary/$appAttemptID/$taskID/ to $joboutput/, then it will delete $joboutput/_temporary/ and write $joboutput/_SUCCESS

commitJob’s run time, number of RPC, is O(n) in terms of output files, which is discussed in MAPREDUCE-4815, and can take minutes.

Algorithm version 2 changes the behavior of commitTask, recoverTask, and commitJob.

  1. commitTask renames all files in $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to $joboutput/

  2. recoverTask is a nop strictly speaking, but for upgrade from version 1 to version 2 case, it checks if there are any files in $joboutput/_temporary/($appAttemptID - 1)/$taskID/ and renames them to $joboutput/

  3. commitJob deletes $joboutput/_temporary and writes $joboutput/_SUCCESS

Algorithm 2 takes advantage of task parallelism and makes commitJob itself O(1). However, the window of vulnerability for having incomplete output in $jobOutput directory is much larger. Therefore, pipeline logic for consuming job outputs should be built on checking for existence of _SUCCESS marker.

Removed consumption of the MAX_APP_ATTEMPTS_ENV environment variable

“Permission denied” error message when unable to read local file for -put/copyFromLocal

Zookeeper jar removed from hadoop-client dependency tree.

Public service notice: * Every restart of a 2.6.x or 2.7.0 DN incurs a risk of unwanted block deletion. * Apply this patch if you are running a pre-2.7.1 release.

Proxy level retries will not be done on AlreadyBeingCreatedExeption for create() op.

The behavior of shutdown a NM could be different (if NM work preserving is not enabled): NM will unregister to RM immediately rather than waiting for timeout to be LOST. A new status of NodeStatus - SHUTDOWN is involved which could affect UI, CLI and ClusterMetrics for node’s status.

WARNING: No release note provided for this change.

Related to the decommission enhancements in HDFS-7411, this change removes the deprecated configuration key “dfs.namenode.decommission.nodes.per.interval” which has been subsumed by the configuration key “dfs.namenode.decommission.blocks.per.interval”.

Existing sequence files can be appended.

Add a new option “properties” to the “dfsadmin -reconfig” command to get a list of reconfigurable properties.

Users may need special attention for this change while upgrading to this version. Previously hdfs client was using commons-logging as the logging framework. With this change it will use slf4j framework. For more details about slf4j, please see: http://www.slf4j.org/manual.html. Also, org.apache.hadoop.hdfs.protocol.CachePoolInfo#LOG public static member variable has been removed as it is not used anywhere. Users need to correct their code if any one has a reference to this variable. One can retrieve the named logger via the logging framework of their choice directly like, org.slf4j.Logger LOG = org.slf4j.LoggerFactory.getLogger(org.apache.hadoop.hdfs.protocol.CachePoolInfo.class);

This feature adds support for running additional standby NameNodes, which provides additional fault-tolerance. It is designed for a total of 3-5 NameNodes.

Default value for ‘yarn.scheduler.maximum-allocation-vcores’ changed from 32 to 4.

Added SFTP filesystem by using the JSch library.

Documented missing properties and added the regression test to verify that there are no missing properties in yarn-default.xml.

There is a typo in the event string “WORKFLOW_ID” (as “WORKLFOW_ID”). The branch-2 change will publish both event strings for compatibility with consumers, but the misspelled metric will be removed in trunk.

WARNING: No release note provided for this change.

Limit on Maximum number of ACL entries(32) will be enforced separately on access and default ACLs. So in total, max. 64 ACL entries can be present in a ACL spec.

The Maven dependency on aws-sdk has been changed to aws-sdk-s3 and the version bumped. Applications depending on transitive dependencies pulled in by aws-sdk and not aws-sdk-s3 might not work.

This removes the deprecated DistributedFileSystem#getFileBlockStorageLocations API used for getting VolumeIds of block replicas. Applications interested in the volume of a replica can instead consult BlockLocation#getStorageIds to obtain equivalent information.

Fixes an Trash related issue wherein a delay in the periodic checkpointing of one user’s directory causes the subsequent user directory checkpoints to carry a newer timestamp, thereby delaying their eventual deletion.

The config key “dfs.namenode.fs-limits.max-xattr-size” can no longer be set to a value of 0 (previously used to indicate unlimited) or a value greater than 32KB. This is a constraint on xattr size similar to many local filesystems.

Adds a new blockpools flag to the balancer. This allows admins to specify which blockpools the balancer will run on. Usage: -blockpools <comma-separated list of blockpool ids> The balancer will only run on blockpools included in this list.

getSoftwareVersion method would replace original getVersion method, which returns the version string.

The new getVersion method would return both version string and revision string.

Set YARN_FAIL_FAST to be false by default. If HA is enabled and if there’s any state-store error, after the retry operation failed, we always transition RM to standby state.

An option ‘-d’ added for all command-line copy commands to skip intermediate ‘.COPYING’ file creation.

Exposed a metric ‘LastJournalTimestamp’ for JournalNode

Exposed command “-getBalancerBandwidth” in dfsadmin to get the bandwidth of balancer.

Yarn now only issues and allows delegation tokens in secure clusters. Clients should no longer request delegation tokens in a non-secure cluster, or they’ll receive an IOException.

HDFS-8829 introduces two new configuration settings: dfs.datanode.transfer.socket.send.buffer.size and dfs.datanode.transfer.socket.recv.buffer.size. These settings can be used to control the socket send buffer and receive buffer sizes respectively on the DataNode for client-DataNode and DataNode-DataNode connections. The default values of both settings are 128KB for backwards compatibility. For optimum performance it is recommended to set these values to zero to enable the OS networking stack to auto-tune buffer sizes.

After this patch, the feature to support NM resource dynamically configuration is completed, so that user can configure NM with new resource without bring NM down or decommissioned. Two CLIs are provided to support update resources on individual node or a batch of nodes: 1. Update resource on single node: yarn rmadmin -updateNodeResource [NodeID] [MemSize] [vCores] 2. Update resource on a batch of nodes: yarn rmadmin -refreshNodesResources, that reflect nodes’ resource configuration defined in dynamic-resources.xml which is loaded by RM dynamically (like capacity-scheduler.xml or fair-scheduler.xml). The first version of configuration format is: <configuration> <property> <name>yarn.resource.dynamic.nodes</name> <value>h1:1234</value> </property> <property> <name>yarn.resource.dynamic.h1:1234.vcores</name> <value>16</value> </property> <property> <name>yarn.resource.dynamic.h1:1234.memory</name> <value>1024</value> </property> </configuration>

Now trash message is not printed to System.out. It is handled by Logger instead.

The jira made the following changes: 1. Fix a bug to exclude newly-created files from quota usage calculation for a snapshot path. 2. Number of snapshots is no longer counted as directory number in getContentSummary result.

Added StatsD metrics2 sink

NameNodeMXBean#getNNStarted() metric is deprecated in branch-2 and removed from trunk.

HADOOP-12437 introduces two new configuration settings: hadoop.security.dns.interface and hadoop.security.dns.nameserver. These settings can be used to control how Hadoop service instances look up their own hostname and may be required in some multi-homed environments where hosts are configured with multiple hostnames in DNS or hosts files. They supersede the existing settings dfs.datanode.dns.interface and dfs.datanode.dns.nameserver.

FileSystem#createNonRecursive() is undeprecated.

Introduced two new configuration dfs.webhdfs.netty.low.watermark and dfs.webhdfs.netty.high.watermark to enable tuning the size of the buffers of the Netty server inside Datanodes.

HDFS now provides native support for erasure coding (EC) to store data more efficiently. Each individual directory can be configured with an EC policy with command hdfs erasurecode -setPolicy. When a file is created, it will inherit the EC policy from its nearest ancestor directory to determine how its blocks are stored. Compared to 3-way replication, the default EC policy saves 50% of storage space while also tolerating more storage failures.

To support small files, the currently phase of HDFS-EC stores blocks in striped layout, where a logical file block is divided into small units (64KB by default) and distributed to a set of DataNodes. This enables parallel I/O but also decreases data locality. Therefore, the cluster environment and I/O workloads should be considered before configuring EC policies.

The output of the “hdfs fetchdt –print” command now includes the token renewer appended to the end of the existing token information. This change may be incompatible with tools that parse the output of the command.

In the extremely rare event that HADOOP_USER_IDENT and USER environment variables are not defined, we now fall back to use ‘hadoop’ as the identification string.

When Hadoop JVMs create other processes on OS X, it will always use posix_spawn.

The output of fsck command for being written hdfs files had been changed. When using fsck against being written hdfs files with {{-openforwrite}} and {{-files -blocks -locations}}, the fsck output will include the being written block for replication files or being written block group for erasure code files.

The preferred block size XML element has been corrected from “\<perferredBlockSize>” to “\<preferredBlockSize>”.

GlobFilter and RegexFilter.compile() now returns com.google.re2j.pattern.Pattern instead of java.util.regex.Pattern

The feature needs to enabled by setting “hadoop.caller.context.enabled” to true. When the feature is used, additional fields are written into namenode audit log records.

Introduces a new configuration setting dfs.client.socket.send.buffer.size to control the socket send buffer size for writes. Setting it to zero enables TCP auto-tuning on systems that support it.

There is now support for offloading HA health check RPC activity to a separate RPC server endpoint running within the NameNode process. This may improve reliability of HA health checks and prevent spurious failovers in highly overloaded conditions. For more details, please refer to the hdfs-default.xml documentation for properties dfs.namenode.lifeline.rpc-address, dfs.namenode.lifeline.rpc-bind-host and dfs.namenode.lifeline.handler.count.

Projects that access HDFS can depend on the hadoop-hdfs-client module instead of the hadoop-hdfs module to avoid pulling in unnecessary dependency. Please note that hadoop-hdfs-client module could miss class like ConfiguredFailoverProxyProvider. So if a cluster is in HA deployment, we should still use hadoop-hdfs instead.

The following shell environment variables have been deprecated:

Old New

In addition:

Snapshots can be allowed/disallowed on a directory via WebHdfs from users with superuser privilege.

Previously, the MR job will get failed if AM get restarted for some reason (like node failure, etc.) during its doing commit job no matter if AM attempts reach to the maximum attempts. In this improvement, we add a new API isCommitJobRepeatable() to OutputCommitter interface which to indicate if job’s committer can do commitJob again if previous commit work is interrupted by NM/AM failures, etc. The instance of OutputCommitter, which support repeatable job commit (like FileOutputCommitter in algorithm 2), can allow AM to continue the commitJob() after AM restart as a new attempt.

The support of the deprecated dfs.umask key is removed in Hadoop 3.0.

SortedMapWritable has changed to SortedMapWritable<K extends WritableComparable<? super K>>. That way user can declare the class by such as SortedMapWritable<Text>.

Allow stop() before start() completed in JvmPauseMonitor

Unify the behavior of dfs.getEZForPath() API when getting a non-existent normal file and non-existent ezone file by throwing FileNotFoundException

Now TotalFiles metric is removed from FSNameSystem. Use FilesTotal instead.

Only check permissions when permissions enabled in FSDirStatAndListingOp.getFileInfo() and getListingInt()

Add Trash support for deleting files within encryption zones. Deleted files will remain encrypted and they will be moved to a “.Trash” subdirectory under the root of the encryption zone, prefixed by $USER/current. Checkpoint and expunge continue to work like the existing Trash.

Steps to reconfigure: 1. change value of the parameter in corresponding xml configuration file 2. to reconfigure, run hdfs dfsadmin -reconfig datanode <dn_addr>:<ipc_port> start 3. repeat step 2 until all DNs are reconfigured 4. to check status of the most recent reconfigure operation, run hdfs dfsadmin -reconfig datanode <dn_addr>:<ipc_port> status 5. to query a list reconfigurable properties on DN, run hdfs dfsadmin -reconfig datanode <dn_addr>:<ipc_port> properties

Add a new configuration “yarn.timeline-service.version” to indicate what is the current version of the running timeline service. For example, if “yarn.timeline-service.version” is 1.5, and “yarn.timeline-service.enabled” is true, it means the cluster will and should bring up the timeline service v.1.5. On the client side, if the client uses the same version of timeline service, it should succeed. If the client chooses to use a smaller version in spite of this, then depending on how robust the compatibility story is between versions, the results may vary.

Adds the ENDED attribute to o.a.h.yarn.api.records.FinalApplicationStatus

Added -skip-empty-file option to hadoop fs -getmerge command. With the option, delimiter (LF) is not printed for empty files even if -nl option is used.

This fix includes public method interface change. A follow-up JIRA issue for this incompatibility for branch-2.7 is HADOOP-13579.

libwebhdfs has been retired in 2.8.0 due to the lack of maintenance.

S3A has been made accessible through the FileContext API.

Make it configurable how long the cached du file is valid. Useful for rolling upgrade.

The Azure Blob Storage file system (WASB) now includes optional support for use of the append API by a single writer on a path. Please note that the implementation differs from the semantics of HDFS append. HDFS append internally guarantees that only a single writer may append to a path at a given time. WASB does not enforce this guarantee internally. Instead, the application must enforce access by a single writer, such as by running single-threaded or relying on some external locking mechanism to coordinate concurrent processes. Refer to the Azure Blob Storage documentation page for more details on enabling append in configuration.

If hadoop.token.files property is defined and configured to one or more comma-delimited delegation token files, Hadoop will use those token files to connect to the services as named in the token.

The patch replaces -namenode option with -fs for specifying the remote name node against which the benchmark is running. Before this patch, if ‘-namenode’ was not given, the benchmark would run in standalone mode, ignoring the ‘fs.defaultFS’ in config file even if it’s remote. With this patch, the benchmark, as other tools, will rely on the ‘fs.defaultFS’ config, which is overridable by -fs command option, to run standalone mode or remote mode.

Hadoop now includes a shell command named KDiag that helps with diagnosis of Kerberos misconfiguration problems. Please refer to the Secure Mode documentation for full details on usage of the command.

Made CanBuffer interface public for use in client applications.

The S3A Hadoop-compatible file system now support reading its S3 credentials from the Hadoop Credential Provider API in addition to XML configuration files.

WebHDFS now supports options to enforce cross-site request forgery (CSRF) prevention for HTTP requests to both the NameNode and the DataNode. Please refer to the updated WebHDFS documentation for a description of this feature and further details on how to configure it.

Added New compression levels for GzipCodec that can be set in zlib.compress.level

Default of ‘mapreduce.jobhistory.jhist.format’ property changed from ‘json’ to ‘binary’. Creates smaller, binary Avro .jhist files for faster JHS performance.

Number of blocks per volume is made available as a metric.

The Code Changes include following: - Modified DFSUtil.java in Apache HDFS project for supplying new parameter ssl.server.exclude.cipher.list - Modified HttpServer2.java in Apache Hadoop-common project to work with new parameter and exclude ciphers using jetty setExcludeCihers method. - Modfied associated test classes to owrk with existing code and also cover the newfunctionality in junit

The hadoop-azure file system now supports configuration of the Azure Storage account credentials using the standard Hadoop Credential Provider API. For details, please refer to the documentation on hadoop-azure and the Credential Provider API.

Audit logs will now only be generated in the following two cases: * When an operation results in an AccessControlException * When an operation is successful

Notably, this means audit log events will not be generated for exceptions besides AccessControlException.

Two recommendations for the mapreduce.jobhistory.loadedtasks.cache.size property: 1) For every 100k of cache size, set the heap size of the Job History Server to 1.2GB. For example, mapreduce.jobhistory.loadedtasks.cache.size=500000, heap size=6GB. 2) Make sure that the cache size is larger than the number of tasks required for the largest job run on the cluster. It might be a good idea to set the value slightly higher (say, 20%) in order to allow for job size growth.

Dependency on commons-httpclient::commons-httpclient was removed from hadoop-common. Downstream projects using commons-httpclient transitively provided by hadoop-common need to add explicit dependency to their pom. Since commons-httpclient is EOL, it is recommended to migrate to org.apache.httpcomponents:httpclient which is the successor.

This change contains the content of HADOOP-10115 which is an incompatible change.

HDFS-8791 introduces a new datanode layout format. This layout is identical to the previous block id based layout except it has a smaller 32x32 sub-directory structure in each data storage. On startup, the datanode will automatically upgrade it’s storages to this new layout. Currently, datanode layout changes support rolling upgrades, on the other hand downgrading is not supported between datanode layout changes and a rollback would be required.

Added new configuration options: dfs.webhdfs.socket.connect-timeout and dfs.webhdfs.socket.read-timeout both defaulting to 60s.

With the introduction of the markdown-formatted and automatically built changes file, the CHANGES.txt files have been eliminated.

This release adds a new feature called the DataNode Lifeline Protocol. If configured, then DataNodes can report that they are still alive to the NameNode via a fallback protocol, separate from the existing heartbeat messages. This can prevent the NameNode from incorrectly marking DataNodes as stale or dead in highly overloaded clusters where heartbeat processing is suffering delays. For more information, please refer to the hdfs-default.xml documentation for several new configuration properties: dfs.namenode.lifeline.rpc-address, dfs.namenode.lifeline.rpc-bind-host, dfs.datanode.lifeline.interval.seconds, dfs.namenode.lifeline.handler.ratio and dfs.namenode.lifeline.handler.count.

Fixed CgroupHandler’s creation and usage to avoid NodeManagers crashing when LinuxContainerExecutor is enabled.

Steps to reconfigure: 1. change value of the parameter in corresponding xml configuration file 2. to reconfigure, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> start 3. to check status of the most recent reconfigure operation, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> status 4. to query a list reconfigurable properties on NN, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> properties

Fix inconsistent value type ( String and Array ) of the “type” field for LeafQueueInfo in response of RM REST API

Makes the getFileChecksum API works with striped layout EC files. Checksum computation done by block level in the distributed fashion. The current API does not support to compare the checksum generated with normal file and the checksum generated for the same file but in striped layout.

DistCp in Hadoop 3.0 no longer supports -mapredSSLConf option. Use global ssl-client.xml configuration file for swebhdfs file systems instead.

Steps to reconfigure: 1. change value of the parameter in corresponding xml configuration file 2. to reconfigure, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> start 3. to check status of the most recent reconfigure operation, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> status 4. to query a list reconfigurable properties on NN, run hdfs dfsadmin -reconfig namenode <nn_addr>:<ipc_port> properties

On Unix platforms, HADOOP_PREFIX has been deprecated in favor of returning to HADOOP_HOME as in prior Apache Hadoop releases.

Removed FileUtil.copyMerge.

Backport the fix to 2.7 and 2.8

This new dfsadmin command, evictWriters, stops active block writing activities on a data node. The affected writes will continue without the node after a write pipeline recovery. This is useful when data node decommissioning is blocked by slow writers. If issued against a non-decommissioing data node, all current writers will be stopped, but new write requests will continue to be served.

Add new flag to allow supporting path style addressing for s3a

The default port for KMS service is now 9600. This is to avoid conflicts on the previous port 16000, which is also used by HMaster as the default port.

Skip blocks with size below dfs.balancer.getBlocks.min-block-size (default 10MB) when a balancer asks for a list of blocks.

Clusters cannot use FIFO policy as the defaultQueueSchedulingPolicy. Clusters with a single level of queues will have to explicitly set the policy to FIFO if that is desired.

The patch updates the HDFS default HTTP/RPC ports to non-ephemeral ports. The changes are listed below: Namenode ports: 50470 –> 9871, 50070 –> 9870, 8020 –> 9820 Secondary NN ports: 50091 –> 9869, 50090 –> 9868 Datanode ports: 50020 –> 9867, 50010 –> 9866, 50475 –> 9865, 50075 –> 9864

This patch will attempt to allocate all replicas to remote DataNodes, by adding local DataNode to the excluded DataNodes. If no sufficient replicas can be obtained, it will fall back to default block placement policy, which writes one replica to local DataNode.

This feature introduces a new command called “hadoop dtutil” which lets users request and download delegation tokens with certain attributes.

With this change, the .hadooprc file is now processed after Apache Hadoop has been fully bootstrapped. This allows for usage of the Apache Hadoop Shell API. A new file, .hadoop-env, now provides the ability for end users to override hadoop-env.sh.

LocalJobRunnerMetrics and ShuffleClientMetrics were updated to use Hadoop Metrics V2 framework.

Reserved space can be configured independently for different storage types for clusters with heterogeneous storage. The ‘dfs.datanode.du.reserved’ property name can be suffixed with a storage types (i.e. one of ssd, disk, archival or ram_disk). e.g. reserved space for RAM_DISK storage can be configured using the property ‘dfs.datanode.du.reserved.ram_disk’. If specific storage type reservation is not configured then the value specified by ‘dfs.datanode.du.reserved’ will be used for all volumes.

HDFS will create a “.Trash” subdirectory when creating a new encryption zone to support soft delete for files deleted within the encryption zone. A new “crypto -provisionTrash” command has been introduced to provision trash directories for encryption zones created with Apache Hadoop minor releases prior to 2.8.0.

The output of “hdfs oev -p stats” has changed. The option prints 0 instead of null for the count of the operations that have never been executed.

Remove invisible synchronization primitives from DataInputBuffer

S3A now includes the current Hadoop version in the User-Agent string passed through the AWS SDK to the S3 service. Users also may include optional additional information to identify their application. See the documentation of configuration property fs.s3a.user.agent.prefix for further details.

The minimum required JDK version for Hadoop has been increased from JDK7 to JDK8.

It is now possible to add or modify the behavior of existing subcommands in the hadoop, hdfs, mapred, and yarn scripts. See the Unix Shell Guide for more information.

If the user object returned by LDAP server has the user’s group object DN (supported by Active Directory), Hadoop can reduce LDAP group mapping latency by setting hadoop.security.group.mapping.ldap.search.attr.memberof to memberOf.

Users can integrate a custom credential provider with S3A. See documentation of configuration property fs.s3a.aws.credentials.provider for further details.

Before this fix, the files in .staging directory are always preserved when mapreduce.task.files.preserve.filepattern is set. After this fix, the files in .staging directory are preserved if the name of the directory matches the regex pattern specified by mapreduce.task.files.preserve.filepattern.

Introducing a new configuration “yarn.scheduler.fair.dynamic.max.assign” to dynamically determine the resources to assign per heartbeat when assignmultiple is turned on. When turned on, the scheduler allocates roughly half of the remaining resources overriding any max.assign settings configured. This is turned ON by default.

Exclude javadocs for proto-generated java classes.

This patch adds two new config keys for supporting timeouts in LDAP query operations. The property “hadoop.security.group.mapping.ldap.connection.timeout.ms” is the connection timeout (in milliseconds), within which period if the LDAP provider doesn’t establish a connection, it will abort the connect attempt. The property “hadoop.security.group.mapping.ldap.read.timeout.ms” is the read timeout (in milliseconds), within which period if the LDAP provider doesn’t get a LDAP response, it will abort the read attempt.

Enables renewal and cancellation of KMS delegation tokens. hadoop.security.key.provider.path needs to be configured to reach the key provider.

Adds support to S3AFileSystem for reading AWS credentials from environment variables.

Remove redundent TestMiniDFSCluster.testDualClusters to save time.

Two new configuration have been added “dfs.namenode.lease-recheck-interval-ms” and “dfs.namenode.max-lock-hold-to-release-lease-ms” to fine tune the duty cycle with which the Namenode recovers old leases.

S3A now supports read access to a public S3 bucket even if the client does not configure any AWS credentials. See the documentation of configuration property fs.s3a.aws.credentials.provider for further details.

S3A now supports use of AWS Security Token Service temporary credentials for authentication to S3. Refer to the documentation of configuration property fs.s3a.session.token for further details.

The hadoop-ant module in hadoop-tools has been removed.

Hadoop now supports integration with Azure Data Lake as an alternative Hadoop-compatible file system. Please refer to the Hadoop site documentation of Azure Data Lake for details on usage and configuration.

This rewrites the release process with a new dev-support/bin/create-release script. See http://wiki.apache.org/hadoop/HowToRelease for updated instructions on how to use it.

Allows userinfo component of URI authority to contain a slash (escaped as %2F). Especially useful for accessing AWS S3 with distcp or hadoop fs.

Adds support for Azure ActiveDirectory tokens using client ID and keys

Upgrading Jersey and its related libraries:

  1. Upgrading jersey from 1.9 to 1.19
  2. Adding jersey-servlet 1.19
  3. Upgrading grizzly-http-servlet from 2.1.2 to 2.2.21
  4. Adding grizzly-http 2.2.21
  5. Adding grizzly-http-server 2.2.21

After upgrading Jersey from 1.12 to 1.13, the root element whose content is empty collection is changed from null to empty object({}).

Add per-cache-pool default replication num configuration

S3A has added support for configurable input policies. Similar to fadvise, this configuration provides applications with a way to specify their expected access pattern (sequential or random) while reading a file. S3A then performs optimizations tailored to that access pattern. See site documentation of the fs.s3a.experimental.input.fadvise configuration property for more details. Please be advised that this feature is experimental and subject to backward-incompatible changes in future releases.

The Disk Balancer lets administrators rebalance data across multiple disks of a DataNode. It is useful to correct skewed data distribution often seen after adding or replacing disks. Disk Balancer can be enabled by setting dfs.disk.balancer.enabled to true in hdfs-site.xml. It can be invoked by running “hdfs diskbalancer”. See the “HDFS Diskbalancer” section in the HDFS Commands guide for detailed usage.

hadoop.security.groups.cache.background.reload can be set to true to enable background reload of expired groups cache entries. This setting can improve the performance of services that use Groups.java (e.g. the NameNode) when group lookups are slow. The setting is disabled by default.

DataNode Web UI has been improved with new HTML5 page, showing useful information.

The ‘slaves’ file has been deprecated in favor of the ‘workers’ file and, other than the deprecation warnings, all references to slavery have been removed from the source tree.

The default permissions of files and directories created via WebHDFS and HttpFS are now 644 and 755 respectively. See HDFS-10488 for related discussion.

The rcc command has been removed. See HADOOP-12485 where unused Hadoop Streaming classes were removed.

The s3 file system has been removed. The s3a file system should be used instead.

Upgrading following dependences: * Guice from 3.0 to 4.0 * cglib from 2.2 to 3.2.0 * asm from 3.2 to 5.0.4

This removes the configuration property {{dfs.client.use.legacy.blockreader}}, since the legacy remote block reader class has been removed from the codebase.

We are introducing an early preview (alpha 1) of a major revision of YARN Timeline Service: v.2. YARN Timeline Service v.2 addresses two major challenges: improving scalability and reliability of Timeline Service, and enhancing usability by introducing flows and aggregation.

YARN Timeline Service v.2 alpha 1 is provided so that users and developers can test it and provide feedback and suggestions for making it a ready replacement for Timeline Service v.1.x. It should be used only in a test capacity. Most importantly, security is not enabled. Do not set up or use Timeline Service v.2 until security is implemented if security is a critical requirement.

More details are available in the YARN Timeline Service v.2 documentation.

The time format of console logger and MapReduce job summary logger is ISO8601 by default to print milliseconds.

Dependencies on commons-httpclient have been removed. Projects with undeclared transitive dependencies on commons-httpclient, previously provided via hadoop-common or hadoop-client, may find this to be an incompatible change. Such project are also potentially exposed to the commons-httpclient CVE, and should be fixed for that reason as well.

The WASB FileSystem now uses version 4.2.0 of the Azure Storage SDK.

Add a configuration option to enable in-progress edit tailing and a related unit test

If the caller does not supply a permission, DFSClient#mkdirs and DFSClient#primitiveMkdir will create a new directory with the default directory permission 00777 now, instead of 00666.

Hdfs dfs chmod command will reset sticky bit permission on a file/directory when the leading sticky bit is omitted in the octal mode (like 644). So when a file/directory permission is applied using octal mode and sticky bit permission needs to be preserved, then it has to be explicitly mentioned in the permission bits (like 1644). This behavior is similar to many other filesystems on Linux/BSD.

WASB has added an optional capability to execute certain FileSystem operations in parallel on multiple threads for improved performance. Please refer to the Azure Blob Storage documentation page for more information on how to enable and control the feature.

It is now possible to specify multiple jar files for the libjars argument using a wildcard. For example, you can specify “-libjars ‘libs/*’” as a shorthand for all jars in the libs directory.

Added new plugin property yarn.nodemanager.disk-validator to allow the NodeManager to use an alternate class for checking whether a disk is good or not.

Previously, CallerContext was constructed by a builder. In this new pattern, the constructor is private so that caller context will always be constructed by a builder.

The output of hdfs fsck now also contains information about decommissioning replicas.

S3A has optimized the listFiles method by doing a bulk listing of all entries under a path in a single S3 operation instead of recursively walking the directory tree. The listLocatedStatus method has been optimized by fetching results from S3 lazily as the caller traverses the returned iterator instead of doing an eager fetch of all possible results.

S3A now supports configuration of multiple credential provider classes for authenticating to S3. These are loaded and queried in sequence for a valid set of credentials. For more details, refer to the description of the fs.s3a.aws.credentials.provider configuration property or the S3A documentation page.

Permissions are now checked when moving a file to Trash.

Unsupported FileSystem operations now throw an UnsupportedOperationException rather than an IOException.

Add a -x option for “hdfs -du” and “hdfs -count” commands to exclude snapshots from being calculated.

TrashPolicy#getInstance and initialize with Path were removed. Use the method without Path instead.

Prior to this fix, the NodeManager will ignore any non-zero exit code for any script in the yarn.nodemanager.health-checker.script.path property. With this change, any syntax errors in the health checking script will get flagged as an error in the same fashion (likely exit code 1) that the script detecting a health issue.

This change introduces a new configuration key used by RPC server to decide whether to send backoff signal to RPC Client when RPC call queue is full. When the feature is enabled, RPC server will no longer block on the processing of RPC requests when RPC call queue is full. It helps to improve quality of service when the service is under heavy load. The configuration key is in the format of “ipc.#port#.backoff.enable” where #port# is the port number that RPC server listens on. For example, if you want to enable the feature for the RPC server that listens on 8020, set ipc.8020.backoff.enable to true.

New fsck option “-upgradedomains” has been added to display upgrade domains of any block.

Add a new conf “dfs.balancer.max-size-to-move” so that Balancer.MAX_SIZE_TO_MOVE becomes configurable.

fsck does not print out dots for progress reporting by default. To print out dots, you should specify ‘-showprogress’ option.

This breaks rolling upgrades because it changes the major version of the NM state store schema. Therefore when a new NM comes up on an old state store it crashes.

The state store versions for this change have been updated in YARN-6798.