| name | value | description |
|---|---|---|
| dfs.namenode.logging.level | info | The logging level for the dfs namenode. Other values are "dir" (trace namespace mutations), "block" (trace block under/over replications and block creations/deletions), or "all". |
| dfs.namenode.rpc-address | | RPC address that handles all client requests. If empty, the value is taken from fs.default.name. The value of this property takes the form hdfs://nn-host1:rpc-port. |
| dfs.secondary.http.address | 0.0.0.0:50090 | The secondary namenode http server address and port. If the port is 0 then the server will start on a free port. |
| dfs.datanode.address | 0.0.0.0:50010 | The datanode server address and port for data transfer. If the port is 0 then the server will start on a free port. |
| dfs.datanode.http.address | 0.0.0.0:50075 | The datanode http server address and port. If the port is 0 then the server will start on a free port. |
| dfs.datanode.ipc.address | 0.0.0.0:50020 | The datanode ipc server address and port. If the port is 0 then the server will start on a free port. |
| dfs.datanode.handler.count | 3 | The number of server threads for the datanode. |
| dfs.http.address | 0.0.0.0:50070 | The address and base port on which the dfs namenode web UI listens. If the port is 0 then the server will start on a free port. |
| dfs.https.enable | false | Whether HTTPS (SSL) is supported on HDFS. |
| dfs.https.need.client.auth | false | Whether SSL client certificate authentication is required. |
| dfs.https.server.keystore.resource | ssl-server.xml | Resource file from which SSL server keystore information will be extracted. |
| dfs.https.client.keystore.resource | ssl-client.xml | Resource file from which SSL client keystore information will be extracted. |
| dfs.datanode.https.address | 0.0.0.0:50475 | |
| dfs.https.address | 0.0.0.0:50470 | |
| dfs.datanode.dns.interface | default | The name of the network interface from which a datanode should report its IP address. |
| dfs.datanode.dns.nameserver | default | The host name or IP address of the name server (DNS) which a datanode should use to determine the host name used by the namenode for communication and display purposes. |
| dfs.replication.considerLoad | true | Whether chooseTarget considers the target's load. |
| dfs.default.chunk.view.size | 32768 | The number of bytes of a file to view in the browser. |
| dfs.datanode.du.reserved | 0 | Reserved space in bytes per volume. Always leave this much space free for non-DFS use. |
| dfs.name.dir | ${hadoop.tmp.dir}/dfs/name | Determines where on the local filesystem the DFS namenode should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. |
| dfs.name.edits.dir | ${dfs.name.dir} | Determines where on the local filesystem the DFS namenode should store the transaction (edits) file. If this is a comma-delimited list of directories then the transaction file is replicated in all of the directories, for redundancy. The default value is the same as dfs.name.dir. |
| dfs.namenode.edits.toleration.length | 0 | The length in bytes of edit log corruption that the namenode is willing to tolerate. The edit log toleration feature checks the entire edit log. It computes the read length (the length of valid data), the corruption length and the padding length. If the corruption length is non-zero, the corruption is tolerated only if the corruption length is less than or equal to the toleration length. To disable the edit log toleration feature, set this property to -1. When the feature is disabled, the end of the edit log is not checked. In this case, the namenode will start up normally even if the end of the edit log is corrupted. |
| dfs.web.ugi | webuser,webgroup | The user account used by the web interface. Syntax: USERNAME,GROUP1,GROUP2, ... |
| dfs.permissions | true | If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories. |
| dfs.permissions.supergroup | supergroup | The name of the group of super-users. |
| dfs.block.access.token.enable | false | If "true", access tokens are used as capabilities for accessing datanodes. If "false", no access tokens are checked on accessing datanodes. |
| dfs.block.access.key.update.interval | 600 | Interval in minutes at which the namenode updates its access keys. |
| dfs.block.access.token.lifetime | 600 | The lifetime of access tokens in minutes. |
| dfs.data.dir | ${hadoop.tmp.dir}/dfs/data | Determines where on the local filesystem a DFS datanode should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored. |
| dfs.datanode.data.dir.perm | 755 | Permissions for the directories on the local filesystem where the DFS datanode stores its blocks. The permissions can be either octal or symbolic. |
| dfs.replication | 3 | Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time. |
| dfs.replication.max | 512 | Maximal block replication. |
| dfs.replication.min | 1 | Minimal block replication. |
| dfs.block.size | 67108864 | The default block size for new files, in bytes. |
| dfs.df.interval | 60000 | Disk usage statistics refresh interval in milliseconds. |
| dfs.client.block.write.retries | 3 | The number of retries for writing blocks to the datanodes before failure is signaled to the application. |
| dfs.blockreport.intervalMsec | 3600000 | Determines the block reporting interval in milliseconds. |
| dfs.blockreport.initialDelay | 0 | Delay for the first block report in seconds. |
| dfs.heartbeat.interval | 3 | Determines the datanode heartbeat interval in seconds. |
| dfs.namenode.handler.count | 10 | The number of server threads for the namenode. |
| dfs.safemode.threshold.pct | 0.999f | Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Values less than or equal to 0 mean not to wait for any particular percentage of blocks before exiting safe mode. Values greater than 1 will make safe mode permanent. |
| dfs.namenode.safemode.min.datanodes | 0 | Specifies the number of datanodes that must be considered alive before the namenode exits safe mode. Values less than or equal to 0 mean not to take the number of live datanodes into account when deciding whether to remain in safe mode during startup. Values greater than the number of datanodes in the cluster will make safe mode permanent. |
| dfs.safemode.extension | 30000 | Determines the extension of safe mode in milliseconds after the threshold level is reached. |
| dfs.balance.bandwidthPerSec | 1048576 | Specifies the maximum amount of bandwidth that each datanode can utilize for balancing, in terms of the number of bytes per second. |
| dfs.hosts | | Names a file that contains a list of hosts that are permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted. |
| dfs.hosts.exclude | | Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded. |
| dfs.max.objects | 0 | The maximum number of files, directories and blocks dfs supports. A value of zero indicates no limit to the number of objects that dfs supports. |
| dfs.namenode.decommission.interval | 30 | Interval in seconds at which the namenode checks whether decommission is complete. |
| dfs.namenode.decommission.nodes.per.interval | 5 | The number of nodes the namenode checks for decommission completion in each dfs.namenode.decommission.interval. |
| dfs.replication.interval | 3 | The interval in seconds at which the namenode computes replication work for datanodes. |
| dfs.access.time.precision | 3600000 | The access time for an HDFS file is precise up to this value, in milliseconds. The default value is 1 hour. Setting a value of 0 disables access times for HDFS. |
| dfs.support.append | | This option is no longer supported. HBase no longer requires that this option be enabled as sync is now enabled by default. See HADOOP-8230 for additional information. |
| dfs.namenode.delegation.key.update-interval | 86400000 | The update interval for the master key for delegation tokens in the namenode, in milliseconds. |
| dfs.namenode.delegation.token.max-lifetime | 604800000 | The maximum lifetime in milliseconds for which a delegation token is valid. |
| dfs.namenode.delegation.token.renew-interval | 86400000 | The renewal interval for delegation tokens in milliseconds. |
| dfs.datanode.failed.volumes.tolerated | 0 | The number of volumes that are allowed to fail before a datanode stops offering service. By default any volume failure will cause a datanode to shut down. |
| dfs.datanode.max.xcievers | 4096 | Specifies the maximum number of threads to use for transferring data in and out of the datanode. |
| dfs.datanode.readahead.bytes | 4193404 | While reading block files, if the Hadoop native libraries are available, the datanode can use the posix_fadvise system call to explicitly page data into the operating system buffer cache ahead of the current reader's position. This can improve performance, especially when disks are highly contended. This configuration specifies the number of bytes ahead of the current read position which the datanode will attempt to read ahead. This feature may be disabled by setting this property to 0. If the native libraries are not available, this configuration has no effect. |
| dfs.datanode.drop.cache.behind.reads | false | In some workloads, the data read from HDFS is known to be large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the datanode may be configured to automatically purge all data from the buffer cache after it is delivered to the client. This behavior is automatically disabled for workloads which read only short sections of a block (e.g. HBase random-IO workloads). This may improve performance for some workloads by freeing buffer cache space for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect. |
| dfs.datanode.drop.cache.behind.writes | false | In some workloads, the data written to HDFS is known to be large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the datanode may be configured to automatically purge all data from the buffer cache after it is written to disk. This may improve performance for some workloads by freeing buffer cache space for more cacheable data. If the Hadoop native libraries are not available, this configuration has no effect. |
| dfs.datanode.sync.behind.writes | false | If this configuration is enabled, the datanode will instruct the operating system to enqueue all written data to the disk immediately after it is written. This differs from the usual OS policy, which may wait for up to 30 seconds before triggering writeback. This may improve performance for some workloads by smoothing the IO profile for data written to disk. If the Hadoop native libraries are not available, this configuration has no effect. |
| dfs.client.use.datanode.hostname | false | Whether clients should use datanode hostnames when connecting to datanodes. |
| dfs.datanode.use.datanode.hostname | false | Whether datanodes should use datanode hostnames when connecting to other datanodes for data transfer. |
| dfs.client.local.interfaces | | A comma-separated list of network interface names to use for data transfer between the client and datanodes. When creating a connection to read from or write to a datanode, the client chooses one of the specified interfaces at random and binds its socket to the IP of that interface. Individual names may be specified as either an interface name (e.g. "eth0"), a subinterface name (e.g. "eth0:0"), or an IP address (which may be specified using CIDR notation to match a range of IPs). |
| dfs.image.transfer.bandwidthPerSec | 0 | Specifies the maximum amount of bandwidth that can be utilized for image transfer, in terms of the number of bytes per second. A default value of 0 indicates that throttling is disabled. |
| dfs.webhdfs.enabled | false | Enable WebHDFS (REST API) in namenodes and datanodes. |
| dfs.namenode.kerberos.internal.spnego.principal | ${dfs.web.authentication.kerberos.principal} | |
| dfs.secondary.namenode.kerberos.internal.spnego.principal | ${dfs.web.authentication.kerberos.principal} | |
| dfs.namenode.invalidate.work.pct.per.iteration | 0.32f | *Note*: Advanced property. Change with caution. This determines the percentage of block invalidations (deletes) to do over a single datanode heartbeat deletion command. The final deletion count is determined by applying this percentage to the number of live nodes in the system. The resulting number is the number of blocks from the deletion list chosen for invalidation over a single heartbeat of a single datanode. The value should be a positive, non-zero percentage in float notation (X.Yf), with 1.0f meaning 100%. |
| dfs.namenode.replication.work.multiplier.per.iteration | 2 | *Note*: Advanced property. Change with caution. This determines the total number of block transfers to begin in parallel at a datanode, for replication, when such a command list is being sent over a datanode heartbeat by the namenode. The actual number is obtained by multiplying this multiplier by the total number of live nodes in the cluster. The resulting number is the number of blocks for which to begin transfers immediately, per datanode heartbeat. This number can be any positive, non-zero integer. |
| dfs.namenode.avoid.read.stale.datanode | false | Indicates whether to avoid reading from "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Stale datanodes will be moved to the end of the node list returned for reading. See dfs.namenode.avoid.write.stale.datanode for a similar setting for writes. |
| dfs.namenode.avoid.write.stale.datanode | false | Indicates whether to avoid writing to "stale" datanodes whose heartbeat messages have not been received by the namenode for more than a specified time interval. Writes will avoid using stale datanodes unless more than a configured ratio (dfs.namenode.write.stale.datanode.ratio) of datanodes are marked as stale. See dfs.namenode.avoid.read.stale.datanode for a similar setting for reads. |
| dfs.namenode.stale.datanode.interval | 30000 | Default time interval, in milliseconds, for marking a datanode as "stale": if the namenode has not received a heartbeat message from a datanode for more than this interval, the datanode is marked and treated as "stale" by default. The stale interval cannot be too small, since that may cause stale states to change too frequently. A minimum stale interval is therefore enforced (the default is 3 times the heartbeat interval), and the stale interval cannot be less than this minimum. |
| dfs.namenode.write.stale.datanode.ratio | 0.5f | When the ratio of the number of stale datanodes to the total number of datanodes is greater than this ratio, stop avoiding writes to stale nodes so as to prevent causing hotspots. |
| dfs.datanode.plugins | | Comma-separated list of datanode plug-ins to be activated. |
| dfs.namenode.plugins | | Comma-separated list of namenode plug-ins to be activated. |
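
The defaults listed above are normally overridden in a site-specific hdfs-site.xml file, which consists of `<property>` elements with `<name>` and `<value>` children inside a `<configuration>` root. As a minimal sketch, the Python helper below builds such a file with the standard library. The function name `render_hdfs_site` is hypothetical (it is not part of Hadoop), and the override values are illustrative assumptions, not recommendations.

```python
import xml.etree.ElementTree as ET


def render_hdfs_site(overrides):
    """Return hdfs-site.xml text for a dict of property name -> value.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    root = ET.Element("configuration")
    # Sort by name so the generated file is stable across runs.
    for name, value in sorted(overrides.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative overrides only; these values are assumptions.
overrides = {
    "dfs.replication": 2,
    "dfs.datanode.du.reserved": 10 * 1024 ** 3,  # 10 GiB per volume, in bytes
    "dfs.namenode.handler.count": 20,
}

xml_text = render_hdfs_site(overrides)
print(xml_text)
```

Only properties that differ from the defaults need to appear in the generated file; anything omitted keeps the value shown in the table.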