A Dockerfile to run TensorFlow on YARN needs two parts:
Base libraries which TensorFlow depends on
1) OS base image, for example ubuntu:16.04
2) Libraries and packages that TensorFlow depends on, for example python and scipy. GPU support additionally needs cuda, cudnn, etc.
3) The TensorFlow package.
Libraries to access HDFS
1) JDK
2) Hadoop
Here’s an example of a base image (without GPU support) that installs TensorFlow:
FROM ubuntu:16.04

# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python \
        python-dev \
        rsync \
        software-properties-common \
        unzip \
        && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

RUN pip --no-cache-dir install \
        Pillow \
        h5py \
        ipykernel \
        jupyter \
        matplotlib \
        numpy \
        pandas \
        scipy \
        sklearn \
        && \
    python -m ipykernel.kernelspec

RUN pip --no-cache-dir install \
    http://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.8.0-cp27-none-linux_x86_64.whl
On top of the above image, add the files and install the packages needed to access HDFS:
RUN apt-get update && apt-get install -y openjdk-8-jdk wget
RUN wget http://apache.cs.utah.edu/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
RUN tar zxf hadoop-3.1.0.tar.gz
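Once an image with the JDK and Hadoop steps above has been built, a quick check run inside the container can confirm that both are where later configuration expects them. The following is a minimal sketch, not part of the provided examples. It assumes the default OpenJDK 8 location on amd64 Ubuntu (/usr/lib/jvm/java-8-openjdk-amd64) and that the tarball was unpacked in the image root (/hadoop-3.1.0), which is what happens when no WORKDIR is set. The check_exec helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Assumed install locations; override them if your image differs.
JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/java-8-openjdk-amd64}"
HADOOP_HOME="${HADOOP_HOME:-/hadoop-3.1.0}"

# Report whether the given path exists and is executable.
check_exec() {
    if [ -x "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Count the missing executables instead of aborting on the first one.
missing=0
check_exec "$JAVA_HOME/bin/java" || missing=$((missing + 1))
check_exec "$HADOOP_HOME/bin/hadoop" || missing=$((missing + 1))
echo "missing executables: $missing"
```

If either path is reported missing, adjust JAVA_HOME or HADOOP_HOME to match where the packages actually landed in your image.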
Build the image and push it to your own Docker registry: use docker build ... and docker push ... to finish this step.
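As an illustration of the build-and-push step, the sketch below composes a full image reference and then runs docker build and docker push against it. The registry host, image name, and tag are hypothetical placeholders, and the image_ref and run helpers are names introduced here for illustration. The script assumes the Dockerfile is in the current directory. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical values; replace with your own registry, image name and tag.
REGISTRY="${REGISTRY:-registry.example.com:5000}"
IMAGE_NAME="${IMAGE_NAME:-tf-1.8.0-cpu}"
TAG="${TAG:-0.0.1}"

# Compose the fully qualified image reference: <registry>/<name>:<tag>
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

IMAGE="$(image_ref "$REGISTRY" "$IMAGE_NAME" "$TAG")"

# Build from the Dockerfile in the current directory, then push the result.
run docker build -t "$IMAGE" -f Dockerfile .
run docker push "$IMAGE"
```

Tagging the image with the registry prefix at build time means docker push needs no separate docker tag step.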
We provide the following examples for building TensorFlow Docker images.
For TensorFlow 1.8.0 (precompiled for CUDA 9.x)
Under the docker/ directory, run build-all.sh to build the Docker images. It builds the following images: