Chapter 4. Working with Docker Images

Every Docker container is based on an image, which provides the basis for everything that you will ever deploy and run with Docker. To launch a container, you must either download a public image or create your own. Every Docker image consists of one or more filesystem layers that generally have a direct one-to-one mapping to each individual build step used to create that image.

For image management, Docker relies heavily on its storage backend, which communicates with the underlying Linux filesystem to build and manage the multiple layers that combine into a single usable image. The primary supported storage backends include AUFS, Btrfs, device-mapper, and OverlayFS. Each storage backend provides a fast copy-on-write (CoW) system for image management.

Anatomy of a Dockerfile

To create a custom Docker image with the default tools, you will need to become familiar with the Dockerfile. This file describes all the steps that are required to create an image and is usually kept in the root directory of the source code repository for your application.

A typical Dockerfile might look something like the one shown here, which will create a container for a Node.js-based application:

FROM node:0.10

MAINTAINER Anna Doe <anna@example.com>

LABEL "rating"="Five Stars" "class"="First Class"

USER root

ENV AP /data/app
ENV SCPATH /etc/supervisor/conf.d

RUN apt-get -y update

# The daemons
RUN apt-get -y install supervisor
RUN mkdir -p /var/log/supervisor

# Supervisor Configuration
ADD ./supervisord/conf.d/* $SCPATH/

# Application Code
ADD *.js* $AP/

WORKDIR $AP

RUN npm install

CMD ["supervisord", "-n"]

Dissecting this Dockerfile will provide some initial exposure to a number of the possible instructions that you can use to control how an image is assembled. Each instruction in a Dockerfile creates a new image layer that is stored by Docker. This means that when you build new images, Docker will only need to build the layers that deviate from previous builds.

Although you could build a Node instance from a plain base Linux image, you can also explore the Docker Registry for official Node images. The Node.js project maintains a series of Docker images and tags; a quick look at them shows that you can tell your image to inherit from node:0.10, which will pull the most recent Node.js 0.10.x image. If you want to lock the image to a specific version of Node, you could instead point it at node:0.10.33. The following base image will provide you with a Debian-based Linux image running Node 0.10.x:

FROM node:0.10

The MAINTAINER field provides contact information for the Dockerfile’s author, which populates the Author field in all resulting images’ metadata:

MAINTAINER Anna Doe <anna@example.com>

The ability to apply labels to images and containers was added to Docker in version 1.6. This means that you can now add metadata via key-value pairs that can later be used to search for and identify Docker images and containers. You can see the labels applied to any image using the docker inspect command:

LABEL "rating"="Five Stars" "class"="First Class"

By default, Docker runs all processes as root within the container, but you can use the USER instruction to change this:

USER root
Caution

Even though containers provide some isolation from the underlying operating system, they still run on the host kernel. Due to potential security risks, production containers should almost always be run under the context of a non-privileged user.

The ENV instruction allows you to set environment variables that can be used during the build process to simplify the Dockerfile and help keep it DRYer:1

ENV AP /data/app
ENV SCPATH /etc/supervisor/conf.d

In the following code, you’ll use a collection of RUN instructions to create the required file structure and install some required software dependencies. You’ll also start to use the build variables you defined in the previous section to save you a bit of work and help protect you from typos:

RUN apt-get -y update

# The daemons
RUN apt-get -y install supervisor
RUN mkdir -p /var/log/supervisor
Warning

It is generally considered a bad idea to run commands like apt-get -y update or yum -y update in your application Dockerfiles because it can significantly increase the time it takes for all of your builds to finish. Instead, consider basing your application image on another image that already has these updates applied to it.

Note

Remember that every instruction creates a new Docker image layer, so it often makes sense to combine a few logically grouped commands onto a single line. It is even possible to use the ADD instruction in combination with the RUN instruction to copy a complex script to your image and then execute that script with only two commands in the Dockerfile.
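For example, the two apt-get commands and the mkdir call from this Dockerfile could be collapsed into a single layer with something like the following; this is a sketch of the technique, not what the example repository actually does:

# One layer for the package update, install, and directory setup
RUN apt-get -y update && \
    apt-get -y install supervisor && \
    mkdir -p /var/log/supervisor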

The ADD instruction is used to copy files from the local filesystem into your image. Most often this will include your application code and any required support files:

# Supervisor Configuration
ADD ./supervisord/conf.d/* $SCPATH/

# Application Code
ADD *.js* $AP/
Note

ADD allows you to include files from the local filesystem into the image. However, once the image is built, you can use the image without having access to the original files because they have been copied into the image.

With the WORKDIR instruction, you change the working directory in the image for the remaining build instructions:

WORKDIR $AP

RUN npm install
Caution

The order of commands in a Dockerfile can have a very significant impact on ongoing build times. You should try to order commands so that things that change between every single build are closer to the bottom. This means that adding your code and similar steps should be held off until the end. When you rebuild an image, every single layer after the first introduced change will need to be rebuilt.
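As a sketch of how you might apply this, a Node.js Dockerfile could add only package.json before running npm install and copy the rest of the application code afterward, so that the dependency layer is reused from cache until package.json itself changes. This is a common pattern, not the layout used by the example application in this chapter:

# Dependencies change rarely, so install them first
ADD package.json $AP/
WORKDIR $AP
RUN npm install

# Application code changes often, so add it last
ADD *.js $AP/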

And finally you end with the CMD instruction, which defines the command that launches the process that you want to run within the container:

CMD ["supervisord", "-n"]
Note

It is generally considered best practice to only run a single process within a container, although there is debate about this within the community. The idea is that a container should provide a single function, so that it remains easy to horizontally scale individual functions within your architecture. In the example, you are using supervisord to manage the node application and ensure that it stays running within the container.

Building an Image

To build your first image, let’s go ahead and clone a git repo that contains an example application called docker-node-hello, as shown here:2

$ git clone https://github.com/spkane/docker-node-hello.git
Cloning into 'docker-node-hello'...
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 20 (delta 6), reused 20 (delta 6)
Unpacking objects: 100% (20/20), done.
Checking connectivity... done.
$ cd docker-node-hello
Note

Git is frequently installed on Linux and Mac OS X systems, but if you do not already have git available, you can download a simple installer from git-scm.com.

This will download a working Dockerfile and related source code files into a directory called docker-node-hello. If you look at the contents while ignoring the git repo directory, you should see the following:

$ tree -a -I .git
.
├── .dockerignore
├── .gitignore
├── Dockerfile
├── Makefile
├── README.md
├── Vagrantfile
├── index.js
├── package.json
└── supervisord
    └── conf.d
        ├── node.conf
        └── supervisord.conf

Let’s review the most relevant files in the repo.

The Dockerfile should be exactly the same as the one you just reviewed.

The .dockerignore file allows you to define files and directories that you do not want uploaded to the Docker host when you are building the image. In this instance, the .dockerignore file contains the following line:

.git

This instructs docker build to exclude the .git directory, which contains the whole source code repository. You do not need this directory to build the Docker image, and since it can grow quite large over time, you don’t want to waste time copying it every time you do a build.

Note

The .git directory contains git configuration data and every single change that you have ever made to your code. The rest of the files reflect the current state of your source code. This is why we can safely tell Docker to ignore the .git directory.
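If the project grew over time, you might extend .dockerignore with additional entries for anything else that is not needed inside the image. The extra entries below are hypothetical and are not in the example repository:

.git
node_modules
*.log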

  • package.json defines the Node.js application and lists any dependencies that it relies on.

  • index.js is the main source code for the application.

The supervisord directory contains the configuration files for supervisord that you will need to start and monitor the application.

Note

Using supervisord in this example to monitor the application is overkill, but it is intended to provide a bit of insight into some of the techniques you can use in a container to provide more control over your application and its running state.
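As a rough sketch of what one of these files might contain, a minimal node.conf could look something like the following; this is illustrative only, so refer to the actual files in the repository for the real configuration:

[program:node]
command=node index.js
directory=/data/app
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/node.log
stderr_logfile=/var/log/supervisor/node_err.log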

As we discussed in Chapter 3, you will need to have your Docker server running and your client properly set up to communicate with it before you can build a Docker image. Assuming that this is all working, you should be able to initiate a new build by running the command below, which will build and tag an image based on the files in the current directory.

Note

The first build that you run will take a few minutes because you have to download the base node image. Subsequent builds should be much faster unless a newer node 0.10 base image has been released.

Each step identified in the following output maps directly to a line in the Dockerfile, and each step creates a new image layer based on the previous step:

$ docker build -t example/docker-node-hello:latest .
Sending build context to Docker daemon 16.38 kB
Sending build context to Docker daemon
Step 0 : FROM node:0.10
node:0.10: The image you are pulling has been verified
511136ea3c5a: Pull complete
36fd425d7d8a: Pull complete
aaabd2b41e22: Pull complete
3c20e07c38ce: Pull complete
b6ef456c239c: Pull complete
b045b0cd49ad: Pull complete
210d9bc26f2f: Pull complete
27ecce8bd36c: Pull complete
fcac83abd52d: Pull complete
edc7d098628f: Pull complete
b5ac041b53f9: Pull complete
387247331d9c: Pull complete
Status: Downloaded newer image for node:0.10
 ---> 387247331d9c
Step 1 : MAINTAINER Anna Doe <anna@example.com>
 ---> Running in fd83efd2ecbd
 ---> a479befa0788
Removing intermediate container fd83efd2ecbd
Step 2 : LABEL "rating"="Five Stars" "class"="First Class"
 ---> Running in 30acbe0f1379
 ---> 3cbea27e857c
Removing intermediate container 30acbe0f1379
Step 3 : USER root
 ---> Running in 32dfbc0f0855
 ---> 9fada51b938d
Removing intermediate container 32dfbc0f0855
Step 4 : ENV AP /data/app
 ---> Running in 0e04f129d7f5
 ---> 818dafcc487a
Removing intermediate container 0e04f129d7f5
Step 5 : ENV SCPATH /etc/supervisor/conf.d
 ---> Running in f828cccc5038
 ---> b5f3a2dbc1a2
Removing intermediate container f828cccc5038
Step 6 : RUN apt-get -y update
 ---> Running in 51e0d361adfe
Get:1 http://security.debian.org jessie/updates InRelease [84.1 kB]
Get:2 http://http.debian.net jessie InRelease [191 kB]
Get:3 http://security.debian.org jessie/updates/main amd64 Packages [20 B]
Get:4 http://http.debian.net jessie-updates InRelease [117 kB]
Get:5 http://http.debian.net jessie/main amd64 Packages [9103 kB]
Get:6 http://http.debian.net jessie-updates/main amd64 Packages [20 B]
Fetched 9496 kB in 7s (1232 kB/s)
Reading package lists...
W: Size of file /var/lib/... is not what the server reported 9102955 9117278
 ---> 16c8639b44c9
Removing intermediate container 51e0d361adfe
Step 7 : RUN apt-get -y install supervisor
 ---> Running in fa79bc727362
Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
  python-meld3
The following NEW packages will be installed:
  python-meld3 supervisor
0 upgraded, 2 newly installed, 0 to remove and 96 not upgraded.
Need to get 304 kB of archives.
After this operation, 1483 kB of additional disk space will be used.
Get:1 http://.../debian/ jessie/main python-meld3 amd64 1.0.0-1 [37.0 kB]
Get:2 http://.../debian/ jessie/main supervisor all 3.0r1-1 [267 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 304 kB in 1s (232 kB/s)
Selecting previously unselected package python-meld3.
(Reading database ... 29248 files and directories currently installed.)
Preparing to unpack .../python-meld3_1.0.0-1_amd64.deb ...
Unpacking python-meld3 (1.0.0-1) ...
Selecting previously unselected package supervisor.
Preparing to unpack .../supervisor_3.0r1-1_all.deb ...
Unpacking supervisor (3.0r1-1) ...
Setting up python-meld3 (1.0.0-1) ...
Setting up supervisor (3.0r1-1) ...
invoke-rc.d: policy-rc.d denied execution of start.
 ---> eabf485da230
Removing intermediate container fa79bc727362
Step 8 : RUN mkdir -p /var/log/supervisor
 ---> Running in 0bf6264625dd
 ---> 4bcba91d84e1
Removing intermediate container 0bf6264625dd
Step 9 : ADD ./supervisord/conf.d/* $SCPATH/
 ---> df0d938b53a3
Removing intermediate container dcfa16d0fec2
Step 10 : ADD *.js* $AP/
 ---> b21779fe3194
Removing intermediate container 00d2f6d10444
Step 11 : WORKDIR $AP
 ---> Running in f412220027b5
 ---> 0f84bc7ac153
Removing intermediate container f412220027b5
Step 12 : RUN npm install
 ---> Running in 7340a9041404
npm WARN engine formidable@1.0.13: wanted:
    {"node":"<0.9.0"} (current: {"node":"0.10.33","npm":"2.1.8"})
express@3.2.4 node_modules/express
├── methods@0.0.1
├── fresh@0.1.0
├── range-parser@0.0.4
├── cookie-signature@1.0.1
├── buffer-crc32@0.2.1
├── cookie@0.0.5
├── commander@0.6.1
├── mkdirp@0.3.4
├── debug@2.1.0 (ms@0.6.2)
├── send@0.1.0 (mime@1.2.6)
└── connect@2.7.9 (pause@0.0.1, qs@0.6.4, bytes@0.2.0, formidable@1.0.13)
 ---> 84f3a4bc2336
Removing intermediate container 7340a9041404
Step 13 : CMD supervisord -n
 ---> Running in 23671c2f57b7
 ---> 96eab440b7c8
Removing intermediate container 23671c2f57b7
Successfully built 96eab440b7c8
Warning

To improve the speed of builds, Docker will use a local cache when it thinks it is safe. This can sometimes lead to unexpected issues. In the output above you will notice lines like ---> Running in 23671c2f57b7. If instead you see ---> Using cache, you know that Docker decided to use the cache. You can disable the cache for a build by using the --no-cache argument to the docker build command.
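For example, to force a completely fresh build of the image used in this chapter, you could run:

$ docker build --no-cache -t example/docker-node-hello:latest .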

If you are building your Docker images on a system that is used for other simultaneous processes, you can limit the resources available to your builds by utilizing many of the same cgroup methods that we will discuss later in Chapter 5. You can find detailed documentation on the docker build arguments in the official documentation.
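As a sketch, recent Docker versions accept cgroup-related flags on docker build, so a resource-constrained build might look something like the following; flag availability depends on your Docker version, and the values are purely illustrative:

$ docker build --memory=512m --cpu-shares=256 \
    -t example/docker-node-hello:latest .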

Running Your Image

Once you have successfully built the image, you can run it on your Docker host with the following command:

$ docker run -d -p 8080:8080 example/docker-node-hello:latest

The above command tells Docker to create a running container in the background from the image with the example/docker-node-hello:latest tag, and then map port 8080 in the container to port 8080 on the Docker host.

If everything goes as expected, the new Node.js application should be running in a container on the host. You can verify this by running docker ps.

To see the running application in action, you will need to open up a web browser and point it at port 8080 on the Docker host.

You can usually determine the Docker host IP address by simply printing out the value of the DOCKER_HOST environment variable unless you are only running Docker locally, in which case 127.0.0.1 should work. Docker Machine or Boot2Docker users can also simply use docker-machine ip or boot2docker ip, respectively:

$ echo $DOCKER_HOST
tcp://172.17.42.10:2376
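For example, Docker Machine users could run something like the following, assuming a machine named default; your machine name and IP address will differ:

$ docker-machine ip default
172.17.42.10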

Get the IP address and enter something like http://172.17.42.10:8080/ into your web browser address bar.

You should see the following text:

Hello World. Wish you were here.
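If you prefer the command line, you could also check the application with curl from any machine that can reach the Docker host; the IP address here assumes the DOCKER_HOST value shown above:

$ curl http://172.17.42.10:8080/
Hello World. Wish you were here.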

Environment Variables

If you read the index.js file, you will notice that part of the file refers to the WHO environment variable, which the application uses to determine who it is going to say Hello to:

var DEFAULT_WHO = "World";
var WHO = process.env.WHO || DEFAULT_WHO;

app.get('/', function (req, res) {
  res.send('Hello ' + WHO + '. Wish you were here.\n');
});

Let’s quickly learn how you can configure this application by passing in environment variables when you start it.

First you need to stop the existing container using two commands. The first command will provide you with the container ID, which you will need to use in the second command:

$ docker ps
CONTAINER ID  IMAGE                             STATUS       ...
b7145e06083f  example/docker-node-hello:latest  Up 4 minutes ...
Note

With the release of Docker 1.8, it is now possible to format the output of docker ps by utilizing a Go template, so that you only see the information that you care about. In the above example you might decide to run something like docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Status}}" to limit the output to the 3 fields you care about. Additionally, running docker ps --quiet with no format options will limit the output to only the container ID.

And then, using the container ID from the previous output, you can stop the running container by typing:

$ docker stop b7145e06083f
b7145e06083f

You can then restart the container by adding one argument to the previous docker run command:

$ docker run -d -p 8080:8080 -e WHO="Sean and Karl" \
example/docker-node-hello:latest

If you reload your web browser, you should see that the text on the web page now reads:

Hello Sean and Karl. Wish you were here.

Custom Base Images

Base images are the lowest-level images that other Docker images will build upon. Most often, these are based on minimal installs of Linux distributions like Ubuntu, Fedora, or CentOS, but they can also be much smaller, containing a single statically compiled binary. For most people, using the official base images for their favorite distribution or tool is a great option.

However, there are times when it is preferable to build your own base images that are not based on an image created by someone else. One reason to do this would be to maintain a consistent OS image across all your deployment methods for hardware, VMs, and containers. Another would be to get the image size down substantially. There is no need to ship around an entire Ubuntu distribution, for example, if your application is a statically built C or Go application. You might find that you only need the tools you regularly use for debugging and some other shell commands and binaries. Making the effort to build such an image could pay off in better deployment times and easier application distribution.

In the official Docker documentation, there is some good information about how you can build base images on the various Linux distributions.
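As a quick sketch of how small a base image can get, a fully statically linked binary could in principle be shipped on top of the empty scratch image with a Dockerfile like the following; the binary name is hypothetical, and this only works if the binary has no runtime dependencies on the rest of the filesystem:

FROM scratch
ADD hello-world-binary /hello-world-binary
CMD ["/hello-world-binary"]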

Storing Images

Now that you have created a Docker image that you’re happy with, you’ll want to store it somewhere so that it can be easily accessed by any Docker host that you want to deploy it to. This is also the clear hand-off point between building images and putting them somewhere to run. You don’t normally build the images on the server and then run them. Ordinarily, deployment is the process of pulling an image from a repository and running it on one or more Docker servers. There are a few ways you can go about storing your images into a central repository for easy retrieval.

Public Registries

Docker provides an image registry for public images that the community wants to share. These include official images for Linux distributions, ready-to-go WordPress containers, and much more.

If you have images that can be published to the Internet, the best place for them is a public registry, like Docker Hub. However, there are other options. When the core Docker tools were first gaining popularity, Docker Hub did not exist. To fill this obvious void in the community, Quay.io was created. Since then, Quay.io has been purchased by CoreOS and has been used to create the CoreOS Enterprise Registry product, which we will discuss in a moment.

Both Docker Hub and Quay.io provide centralized Docker image registries that can be accessed from anywhere on the Internet, and provide a method to store private images in addition to public ones. Both have nice user interfaces and the ability to separate team access permissions and manage users. Both also offer reasonable commercial options for private SaaS hosting of your images, much in the same way that GitHub sells private registries on their systems. This is probably the right first step if you’re getting serious about Docker but are not yet shipping enough code to need an internally hosted solution.

For companies that use Docker heavily, the biggest downside to these registries is that they are not local to the network on which the application is being deployed. This means that every layer of every deployment might need to be dragged across the Internet in order to deploy an application. Internet latencies have a very real impact on software deployments, and outages that affect these registries could have a very detrimental impact on a company’s ability to deploy smoothly and on schedule. This is mitigated by good image design where you make thin layers that are easy to move around the Internet.

Note

In December of 2015, Docker Hub dropped all support for version 1 of the image registry. If you still need to use Docker clients earlier than 1.6, you will need to run your own registry or use Quay.io.

Private Registries

The other option that many companies consider is to host some type of Docker image registry internally. Before the public registry existed for Docker, the Docker developers released the docker-registry project on GitHub. The docker-registry is a GUI-less Python daemon that can interact with the Docker client to support pushing, pulling, and searching images. The version 1 docker-registry has been deprecated and has now been replaced with the version 2 registry, called Docker Distribution.

Another strong contender in the private registry space is the CoreOS Enterprise Registry. When CoreOS bought Quay.io, it quickly took the codebase and made it available as an easily deployable Docker container. This product offers essentially all the same features as Quay.io, but can be deployed internally. It ships as a virtual machine that you run as an appliance, and supports the same UI and interfaces as the public Quay.io.

In April of 2015, Docker released the first version of the Docker Trusted Registry, which had earlier been referred to as Docker Hub Enterprise. The Trusted Registry allows organizations to have a Docker-supported on-premises image registry in their data center or cloud environment.

Authenticating to a Registry

Communicating with a registry that stores container images is part of daily life with Docker. For many registries, this means you’ll need to authenticate to gain access to images. But Docker also tries to make it easy to automate things so it can store your login information and use it on your behalf when you request things like pulling down a private image. By default, Docker assumes the registry will be Docker Hub, the public repository hosted by Docker, Inc.

Creating a Docker Hub account

For these examples, we will create an account on Docker Hub. You don’t need an account to use publicly shared images, but you will need one to upload your own public or private containers.

To create your account, use your web browser of choice to navigate to Docker Hub.

From there, you can either log in via an existing GitHub account or create a new login based on your email address. When you first log in to your new account, you will land on the Docker welcome page, which is where you can configure details about your account.

When you create your account, Docker Hub sends a verification email to the address that you provided during signup. You should immediately log in to your email account and click the verification link inside the email to finish the validation process.

At this point, you have an account on a public registry to which you can upload new images. The “Global settings” option in your account sidebar will allow you to change your registry into a private one if that is what you need.

Logging in to a registry

Now let’s log in to the Docker Hub registry using our account:

$ docker login
Username: someuser
Password: <not shown>
Email: someone@example.com
Login Succeeded

When we get “Login Succeeded” back from the server, we know we’re ready to pull images from the registry. But what happened under the covers? It turns out that Docker has written a dotfile for us in our home directory to cache this information. The permissions are set to 0600 as a security precaution against other users reading your credentials. You can inspect the file with something like:

$ ls -la ~/.dockercfg
-rw------- 1 someuser someuser 95 Mar  6 15:07 /home/someuser/.dockercfg
$ cat ~/.dockercfg
{"https://index.docker.io/v1/":{"auth":"cmVsaEXamPL3hElRmFCOUE=",
"email":"someone@example.com"}}

Here we can see the .dockercfg file, owned by someuser, and the stored credentials in JSON format. Note that this can support multiple registries at once. In this case, we just have one entry, for Docker Hub, but we could have more if we needed them. From now on, when the registry needs authentication, Docker will look in .dockercfg to see if we have credentials stored for this hostname. If so, it will supply them. You will notice that one value is completely lacking here: a timestamp. These credentials are cached forever, or until we tell Docker to remove them, whichever comes first.

Just like logging in, we can also log out of a registry if we no longer want to cache the credentials:

$ docker logout
Remove login credentials for https://index.docker.io/v1/
$ ls -la ~/.dockercfg
ls: cannot access /home/someuser/.dockercfg: No such file or directory

Here we removed our cached credentials and they are no longer stored. But something else happened: the file is gone. That’s because it was the only set of credentials that were cached, so Docker has simply removed the file.

If we were trying to log in to something other than the Docker Hub registry, we could supply the hostname on the command line:

$ docker login someregistry.example.com

This would then end up as just another line in our .dockercfg file.
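After logging in to a second registry, the file might look something like this, with the credentials shortened and purely illustrative:

{"https://index.docker.io/v1/":{"auth":"cmVsaEXamPL3hElRmFCOUE=",
"email":"someone@example.com"},
"someregistry.example.com":{"auth":"c29tZXVzZXIEXamPL3=",
"email":"someone@example.com"}}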

Pushing images into a repository

The first step required to push your image is to ensure that you are logged in to the Docker repository you intend to use. For this example we will focus on Docker Hub, so ensure that you are logged in to Docker Hub with your personal credentials:

$ docker login
Username: someuser
Password: <not shown>
Email: someone@example.com
Login Succeeded

Once you are logged in, you can upload an image. Earlier we used the command docker build -t example/docker-node-hello:latest . to build the docker-node-hello image.

The example portion of that command refers to a repository. When this is local, it can be anything that we want. However, when we are going to upload it to a real repository, we need that to match our login.

Tip

You will need to replace someuser in all the examples with the user that you created in Docker Hub (or whatever repository you decided to use).

We can easily edit the tags on the image that we already created by running the following command:

$ docker tag example/docker-node-hello:latest someuser/docker-node-hello:latest

If you need to rebuild the image with the new naming convention or simply want to give it a try, you can accomplish this by running the following command in the docker-node-hello working directory that was created when you performed the git clone earlier in the chapter:

$ docker build -t someuser/docker-node-hello:latest .
...
Note

If you rebuild the image, you may find that it is very fast. This is because most, if not all, of the layers already exist on your Docker server from the previous build.

We can quickly verify that our image is indeed on the server by running docker images:

$ docker images
REPOSITORY                 TAG    IMAGE ID     CREATED        VIRTUAL SIZE
someuser/docker-node-hello latest 69ddbcccd74f 31 minutes ago 649.2 MB
node                       0.10   38c02af29fa3 3 weeks ago    633.3 MB
Tip

With the introduction of Docker 1.10, it is now possible to format the output of docker images to make it more concise by using the --format argument, like this: docker images --format="table {{.ID }}\t {{.Repository }}".

At this point we can upload the image to our Docker repository by using the docker push command:

$ docker push someuser/docker-node-hello:latest
The push refers to a repository [docker.io/someuser/docker-node-hello] (len: 1)
6ad4dd20c832: Pushed
...
9ee13ca3b908: Pushed
latest: digest: sha256:55c0a161f8ec4af4e17a40ca057f97... size: 38338

If this image was uploaded to a public repository, it can now be easily downloaded by anyone in the world by running the docker pull command.

Tip

If you uploaded the image to a private repository, then users must log in to the private repository using the docker login command before they will be able to pull the image down to their local system.

$ docker pull someuser/docker-node-hello:latest
latest: Pulling from someuser/docker-node-hello
69ddbcccd74f: Pull complete
Digest: sha256:55c0a161f8ec4af4e17a40ca057f97...
Status: Downloaded newer image for someuser/docker-node-hello:latest

Mirroring a Registry

It is possible to set up a local registry in your network that will mirror images from the upstream public registry so that you don’t need to pull commonly used images all the way across the Internet every time you need them on a new host. This can even be useful on your development workstation so that you can keep a local stash of frequently used images that you might need to access offline.

Note

Currently you can only mirror a registry using the older version 1 registry. This functionality is expected to be added into Docker Distribution with the release of version 2.4.

If you are considering setting up your own registry, you should investigate the Docker Distribution GitHub page and the official documentation for Docker Registry 2.0.

Configuring the Docker daemon

The first thing that you need to do is relaunch your Docker daemon with the --registry-mirror command-line argument, replacing ${YOUR_REGISTRY-MIRROR-HOST} with your Docker server’s IP address and port number (e.g., 172.17.42.10:5000).

Note

If you plan to run the docker-registry container on your only Docker server, you can set ${YOUR_REGISTRY-MIRROR-HOST} to localhost:5000.

If you already have Docker running, you need to stop it first. This is distribution-specific. You should use the commands you normally use on your distribution, like initctl, service, or systemctl, to stop the daemon. Then we can invoke it manually with this registry mirroring option:

$ docker daemon --registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}

If you would like to ensure that your Docker daemon always starts with this setup, you will need to edit the appropriate configuration file for your Linux distribution.

Boot2Docker ISO

This setup is for anything using a boot2docker.iso, like Docker Machine or the boot2docker command line interface.

Create /var/lib/boot2docker/profile if it doesn’t already exist:

$ sudo touch /var/lib/boot2docker/profile

Then edit /var/lib/boot2docker/profile and append the argument to your EXTRA_ARGS:

EXTRA_ARGS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:

sudo /etc/init.d/docker restart

Ubuntu

Edit /etc/default/docker and append the argument to your DOCKER_OPTS:

DOCKER_OPTS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:

sudo service docker.io restart

Fedora

Edit /etc/sysconfig/docker and append the argument to your OPTIONS:

OPTIONS="--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}"

And then restart the docker daemon:

sudo systemctl daemon-reload
sudo systemctl restart docker

CoreOS

First copy the systemd unit file for Docker to a writeable filesystem:

$ sudo cp /usr/lib/systemd/system/docker.service /etc/systemd/system/

Then, as root, edit /etc/systemd/system/docker.service and append the argument to the end of the ExecStart line:

ExecStart=/usr/lib/coreos/dockerd --daemon --host=fd:// \
$DOCKER_OPTS $DOCKER_OPT_BIP $DOCKER_OPT_MTU $DOCKER_OPT_IPMASQ \
--registry-mirror=http://${YOUR_REGISTRY-MIRROR-HOST}

And then restart the docker daemon:

sudo systemctl daemon-reload
sudo systemctl restart docker

Launching the local registry mirror service

You will now need to launch a container on your Docker host that will run the registry mirror service and provide you with a local cache of Docker images. You can accomplish this by running the registry image as a container with a few important environment variables defined and a storage volume mounted.

On your Docker server, ensure that you have a directory for storing the images:

$ mkdir -p /var/lib/registry

Then you can launch the container, with the following options defined:

$ docker run -d -p 5000:5000 \
    -v /var/lib/registry:/tmp/registry \
    -e SETTINGS_FLAVOR=dev \
    -e STANDALONE=false \
    -e MIRROR_SOURCE=https://registry-1.docker.io \
    -e MIRROR_SOURCE_INDEX=https://index.docker.io \
    registry
Note

The registry supports a lot of different storage backends, including S3, Swift, Glance, Azure Blob Storage, Google Cloud Storage, and more.

Testing the local registry mirror service

Now that the registry is running as a mirror, we can test it. On a Unix-based system, you can time how long it takes to download the newest CentOS image, using the following command:

$ time docker pull centos:latest
Pulling repository centos
88f9454e60dd: Download complete
511136ea3c5a: Download complete
5b12ef8fd570: Download complete
Status: Downloaded newer image for centos:latest

real	1m25.406s
user	0m0.019s
sys	0m0.014s

In this case, it took 1 minute and 25 seconds to pull the whole image. If we then go ahead and delete the image from the Docker host and then re-time fetching the image again, we will see a significant difference:

$ docker rmi centos:latest
Untagged: centos:latest
$ time docker pull centos:latest
Pulling repository centos
88f9454e60dd: Download complete
511136ea3c5a: Download complete
5b12ef8fd570: Download complete
Status: Image is up to date for centos:latest

real	0m2.042s
user	0m0.004s
sys	0m0.005s

Both times that you pulled the centos:latest image, the Docker server connected to the local registry mirror service and asked for the image. In the first case, the mirror service did not have the image so it had to pull it from the official docker-registry first, add it to its own storage, and then deliver it to the Docker server. After you delete the image from the Docker server and then request it again, you’ll see that the time to pull the image will drop to be very low. In the previous code, it took only two seconds for the Docker server to receive the image. This is because the local registry mirror service had a copy of the image and could provide it directly to the server without pulling anything from the upstream public docker-registry.

Other Approaches to Image Delivery

Over the last two years, the community has explored many other approaches to managing Docker images and providing simple but reliable access to images when needed. Some of these projects, like dogestry, leverage the docker save and docker load commands to create and load images from cloud storage like Amazon S3. Other people are exploring the possibilities of using torrents to distribute Docker images, with projects like torrent-docker. Torrents seem like a natural fit because deployment is usually done to a group of servers on the same network all at the same time. Solomon Hykes recently committed that the Docker Distribution project will soon ship a command-line tool for importing and exporting image layers even without a Docker daemon. This will facilitate even more diverse methods of image distribution. As more and more companies and projects begin to use Docker seriously, even more robust solutions are likely to begin to appear to meet the needs of anyone’s unique workflow and requirements.

If you have a scenario in which you can’t use the off-the-shelf mechanisms, such as an isolated network for security concerns, you can leverage Docker’s built-in importing and exporting features to dump and load new images. Unless you have a specific reason to do otherwise, you should use one of the off-the-shelf solutions and only consider changing your approach when needed. The available options will work for almost everyone.
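As a minimal sketch of that built-in path, you could save the example image to a tarball on one host, move the file across the isolated network however you like, and then load it on another host:

$ docker save -o docker-node-hello.tar example/docker-node-hello:latest
$ docker load -i docker-node-hello.tar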

1 Don’t Repeat Yourself.

2 This code was forked from GitHub.

Get Docker: Up & Running now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.