You might want to use Dockerfile instructions to install packages directly to your environment. This is useful if your package installation takes a long time. Installations in a Domino environment are cached, so you won’t have to wait for the package to install every time. If you already have a Docker image you’d like to use, enter it in the Base Environment field. If you don’t set this, the Domino default environment is used as your base image.
Consult the official Docker documentation to learn more about Dockerfiles.
Do not start your Dockerfile instructions in Domino with a FROM line. Domino includes the FROM line for you, pointing to the base image specified when setting up the environment.
The most common Dockerfile instructions you’ll use are RUN, ARG, and ENV.
RUN commands execute lines of bash.
USER root
RUN wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
RUN tar xvzf spark-1.5.1-bin-hadoop2.6.tgz
RUN mv spark-1.5.1-bin-hadoop2.6 /opt
RUN rm spark-1.5.1-bin-hadoop2.6.tgz
USER ubuntu
ARG commands set build-time variables, and ENV commands set environment variables in the container. Variables set with ENV are accessible from runs that use this environment.
ENV SPARK_HOME=/opt/spark-1.5.1-bin-hadoop2.6
If you set variables in the Environment variables section of your definition, you can use an ARG statement to make them available in your Dockerfile instructions. A variable declared this way is available only for the build step. If you want the variable to be available in the final compute environment, you must also add an ENV statement referencing the argument name.
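A minimal sketch of this ARG and ENV pattern follows. The variable name LOG_DIR is hypothetical and is assumed to be defined in the Environment variables section of the environment definition; substitute the name you actually set there.

```dockerfile
# LOG_DIR is a hypothetical name, assumed to be defined in the
# Environment variables section of the environment definition.
# ARG makes the value visible during the build only.
ARG LOG_DIR

# ENV copies the build-time value into the container environment,
# so runs that use this environment can read it as well.
ENV LOG_DIR=${LOG_DIR}
```

Without the ENV line, the value would be usable by later build instructions but would not appear in the environment of a run.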
Examples: Package Installation
When you edit your environment, click R Package or Python Package to insert a line to install packages.
Enter the names of the packages.
You can also add the commands, as in the following examples:
R package installation, for example with the devtools package:
USER root
RUN R --no-save -e "install.packages('devtools')"
USER ubuntu
Python package installation with pip, for example with the numpy package:
USER root
RUN pip install numpy
USER ubuntu
Docker optimizes its build process by keeping track of commands it has run and aggressively caching the results. This means that if it sees the same set of commands as in a previous build, it will assume that it can use the cached version. A single new command will invalidate the caching of all subsequent commands.
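One way to work with this caching behavior is to order instructions from least to most frequently edited. The sketch below reuses the package examples from this page; the ordering is an illustrative assumption about which step you are likely to change more often, not a Domino requirement.

```dockerfile
USER root

# Slow, rarely edited step first: as long as this line and the lines
# before it are unchanged, Docker can reuse the cached result.
RUN R --no-save -e "install.packages('devtools')"

# Frequently edited step last: changing this line forces a rebuild
# of this step and everything after it, but not of the step above.
RUN pip install numpy

USER ubuntu
```

If the two RUN lines were swapped, editing the pip line would also force the R installation to run again.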
A Docker image can have up to 127 layers, and each command typically adds one. To work around this limit, you can use && to combine several commands into a single command.
RUN \
    wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz && \
    tar xvzf spark-1.5.1-bin-hadoop2.6.tgz && \
    mv spark-1.5.1-bin-hadoop2.6 /opt && \
    rm spark-1.5.1-bin-hadoop2.6.tgz
If you are installing multiple Python packages via pip, it’s almost always best to use a single pip install command. This ensures that dependencies and package versions are properly resolved. If you install via separate commands, you may inadvertently override a package with the wrong version because of a dependency specified by a later installation. For example:
RUN pip install luigi nolearn lasagne