This page explains how to quickly set up HttpFS with Pseudo authentication against a Hadoop cluster that also uses Pseudo authentication.
By default, HttpFS assumes that Hadoop configuration files (core-site.xml & hdfs-site.xml) are in the HttpFS configuration directory.
If this is not the case, set the httpfs.hadoop.config.dir property in the httpfs-site.xml file to the location of the Hadoop configuration directory.
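For example, if the Hadoop configuration files live outside the HttpFS configuration directory, httpfs-site.xml could contain a fragment like the following (the /etc/hadoop/conf path is illustrative, not a required location):

```xml
<!-- Illustrative: point HttpFS at an external Hadoop configuration directory. -->
<property>
  <name>httpfs.hadoop.config.dir</name>
  <value>/etc/hadoop/conf</value>
</property>
```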
Edit Hadoop core-site.xml and define the Unix user that will run the HttpFS server as a proxyuser. For example:
<property>
  <name>hadoop.proxyuser.#HTTPFSUSER#.hosts</name>
  <value>httpfs-host.foo.com</value>
</property>
<property>
  <name>hadoop.proxyuser.#HTTPFSUSER#.groups</name>
  <value>*</value>
</property>
IMPORTANT: Replace #HTTPFSUSER# with the Unix user that will start the HttpFS server.
To start/stop HttpFS use HttpFS’s sbin/httpfs.sh script. For example:
$ sbin/httpfs.sh start
NOTE: Invoking the script without any parameters lists all possible parameters (start, stop, run, etc.). The httpfs.sh script is a wrapper for Tomcat’s catalina.sh script that sets the environment variables and Java System properties required to run the HttpFS server.
$ curl -sS 'http://<HTTPFSHOSTNAME>:14000/webhdfs/v1?op=gethomedirectory&user.name=hdfs'
{"Path":"\/user\/hdfs"}
To configure the embedded Tomcat, edit the files in the tomcat/conf directory.
HttpFS preconfigures the HTTP and Admin ports in Tomcat’s server.xml to 14000 and 14001.
Tomcat logs are also preconfigured to go to HttpFS’s logs/ directory.
HttpFS’s default value for the maxHttpHeaderSize parameter in Tomcat’s server.xml is 65536.
The following environment variables (which can be set in HttpFS’s etc/hadoop/httpfs-env.sh script) can be used to alter those values:
HTTPFS_HTTP_PORT
HTTPFS_ADMIN_PORT
HADOOP_LOG_DIR
HTTPFS_MAX_HTTP_HEADER_SIZE
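For example, to override the ports and the maximum HTTP header size, httpfs-env.sh could include lines like the following (the values shown are illustrative, not recommendations):

```shell
# Illustrative overrides for httpfs-env.sh; adjust values for your deployment.
export HTTPFS_HTTP_PORT=14500
export HTTPFS_ADMIN_PORT=14501
export HTTPFS_MAX_HTTP_HEADER_SIZE=49152
```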
HttpFS supports the following configuration properties in HttpFS’s etc/hadoop/httpfs-site.xml configuration file.
To configure HttpFS to work over SSL, edit the httpfs-env.sh script in the configuration directory and set HTTPFS_SSL_ENABLED to true.
In addition, the following two properties may be defined (shown with their default values):
HTTPFS_SSL_KEYSTORE_FILE=$HOME/.keystore
HTTPFS_SSL_KEYSTORE_PASS=password
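Putting the SSL settings together, the relevant httpfs-env.sh lines could look like this (the keystore path and password below are placeholders, not values to use in production):

```shell
# Illustrative SSL settings for httpfs-env.sh; replace the placeholders.
export HTTPFS_SSL_ENABLED=true
export HTTPFS_SSL_KEYSTORE_FILE=${HOME}/.keystore
export HTTPFS_SSL_KEYSTORE_PASS=password
```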
In the HttpFS tomcat/conf directory, replace the server.xml file with the ssl-server.xml file.
You need to create an SSL certificate for the HttpFS server. As the httpfs Unix user, use the Java keytool command to create the SSL certificate:
$ keytool -genkey -alias tomcat -keyalg RSA
You will be asked a series of questions in an interactive prompt. The command creates the keystore file, named .keystore, in the httpfs user’s home directory.
The password you enter for “keystore password” must match the value of the HTTPFS_SSL_KEYSTORE_PASS environment variable set in the httpfs-env.sh script in the configuration directory.
The answer to “What is your first and last name?” (i.e. “CN”) must be the hostname of the machine where the HttpFS Server will be running.
Start HttpFS. It should work over HTTPS.
Using the Hadoop FileSystem API or the Hadoop FS shell, use the swebhdfs:// scheme. Make sure the JVM is picking up the truststore containing the public key of the SSL certificate if using a self-signed certificate.
Set environment variable HTTPFS_SSL_CLIENT_AUTH to change client authentication. The default is false. See clientAuth in Tomcat 6.0 SSL Support.
Set environment variable HTTPFS_SSL_ENABLED_PROTOCOLS to specify a list of enabled SSL protocols. The default list includes TLSv1, TLSv1.1, TLSv1.2, and SSLv2Hello. See sslEnabledProtocols in Tomcat 6.0 SSL Support.
In order to support some old SSL clients, the default encryption ciphers include a few relatively weak ciphers. Set environment variable HTTPFS_SSL_CIPHERS to override. The value is a comma-separated list of ciphers, as documented in the Tomcat Wiki.
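As an example, the client-authentication, protocol, and cipher settings could all be set in httpfs-env.sh as shown below (the protocol and cipher lists are illustrative, not security recommendations):

```shell
# Illustrative TLS settings for httpfs-env.sh.
export HTTPFS_SSL_CLIENT_AUTH=false
export HTTPFS_SSL_ENABLED_PROTOCOLS="TLSv1,TLSv1.1,TLSv1.2"
export HTTPFS_SSL_CIPHERS="TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
```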