Native Libraries Guide


This guide describes the native hadoop library and includes a small discussion about native shared libraries.

Note: Depending on your environment, the term “native libraries” could refer to all *.so’s you need to compile; and, the term “native compression” could refer to all *.so’s you need to compile that are specifically related to compression. Currently, however, this document only addresses the native hadoop library ( The document for libhdfs library ( is here.

Native Hadoop Library

Hadoop has native implementations of certain components for performance reasons and for non-availability of Java implementations. These components are available in a single, dynamically-linked native library called the native hadoop library. On the *nix platforms the library is named


It is fairly easy to use the native hadoop library:

  1. Review the components.
  2. Review the supported platforms.
  3. Either download a hadoop release, which will include a pre-built version of the native hadoop library, or build your own version of the native hadoop library. Whether you download or build, the name for the library is the same:
  4. Install the compression codec development packages (>zlib-1.2, >gzip-1.2):
    • If you download the library, install one or more development packages - whichever compression codecs you want to use with your deployment.
    • If you build the library, it is mandatory to install both development packages.
  5. Check the runtime log files.


The native hadoop library includes various components:

Supported Platforms

The native hadoop library is supported on *nix platforms only. The library does not to work with Cygwin or the Mac OS X platform.

The native hadoop library is mainly used on the GNU/Linus platform and has been tested on these distributions:

  • RHEL4/Fedora
  • Ubuntu
  • Gentoo

On all the above distributions a 32/64 bit native hadoop library will work with a respective 32/64 bit jvm.


The pre-built 32-bit i386-Linux native hadoop library is available as part of the hadoop distribution and is located in the lib/native directory. You can download the hadoop distribution from Hadoop Common Releases.

Be sure to install the zlib and/or gzip development packages - whichever compression codecs you want to use with your deployment.


The native hadoop library is written in ANSI C and is built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straight-forward to build the library on any platform with a standards-compliant C compiler and the GNU autotools-chain (see the supported platforms).

The packages you need to install on the target platform are:

  • C compiler (e.g. GNU C Compiler)
  • GNU Autools Chain: autoconf, automake, libtool
  • zlib-development package (stable version >= 1.2.0)
  • openssl-development package(e.g. libssl-dev)

Once you installed the prerequisite packages use the standard hadoop pom.xml file and pass along the native flag to build the native hadoop library:

   $ mvn package -Pdist,native -DskipTests -Dtar

You should see the newly-built library in:

   $ hadoop-dist/target/hadoop-2.7.1/lib/native

Please note the following:

  • It is mandatory to install both the zlib and gzip development packages on the target platform in order to build the native hadoop library; however, for deployment it is sufficient to install just one package if you wish to use only one codec.
  • It is necessary to have the correct 32/64 libraries for zlib, depending on the 32/64 bit jvm for the target platform, in order to build and deploy the native hadoop library.


The bin/hadoop script ensures that the native hadoop library is on the library path via the system property: -Djava.library.path=<path>

During runtime, check the hadoop log files for your MapReduce tasks.

  • If everything is all right, then: DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library... INFO util.NativeCodeLoader - Loaded the native-hadoop library
  • If something goes wrong, then: INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


NativeLibraryChecker is a tool to check whether native libraries are loaded correctly. You can launch NativeLibraryChecker as follows:

   $ hadoop checknative -a
   14/12/06 01:30:45 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
   14/12/06 01:30:45 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
   Native library checking:
   hadoop: true /home/ozawa/hadoop/lib/native/
   zlib:   true /lib/x86_64-linux-gnu/
   snappy: true /usr/lib/
   lz4:    true revision:99
   bzip2:  false

Native Shared Libraries

You can load any native shared library using DistributedCache for distributing and symlinking the library files.

This example shows you how to distribute a shared library,, and load it from a MapReduce task.

  1. First copy the library to the HDFS: bin/hadoop fs -copyFromLocal /libraries/
  2. The job launching program should contain the following: DistributedCache.createSymlink(conf); DistributedCache.addCacheFile("hdfs://host:port/libraries/", conf);
  3. The MapReduce task can contain: System.loadLibrary("");

Note: If you downloaded or built the native hadoop library, you don’t need to use DistibutedCache to make the library available to your MapReduce tasks.