Docker containers are on the rise as a best practice for deploying and managing cloud-native distributed systems. Containers are instances of Docker images. It turns out that there is a lot to know and understand about images.
Docker From the Ground Up: Building Images
In this two-part tutorial, I’m covering Docker images in depth. In part one I discussed the basic principles, design considerations, and inspecting image internals. In this part, I cover building your own images, troubleshooting, and working with image repositories.
When you come out on the other side, you’ll have a solid understanding of what Docker images are exactly and how to utilize them effectively in your own applications and systems.
There are two ways to build images. You can modify an existing container and then commit it as a new image, or you can write a Dockerfile and build it into an image. We’ll go over both and explain the pros and cons.
With manual builds, you treat your container like a regular computer. You install packages, you write files, and when it’s all said and done, you commit it and end up with a new image that you use as a template to create many more identical containers or even base other images on.
Let’s start with the alpine image, which is a very small and spartan image based on Alpine Linux. We can run it in interactive mode to get into a shell. Our goal is to add a file called “yeah” that contains the text “it works!” to the root directory and then create a new image from it called “yeah-alpine”.
Here we go. Nice, we’re already in the root dir. Let’s see what’s there.
    > docker run -it alpine /bin/sh
    / # ls
    bin dev etc home lib linuxrc media mnt proc root run sbin srv sys tmp usr var
What editor is available? No vim, no nano?
    / # vim
    /bin/sh: vim: not found
    / # nano
    /bin/sh: nano: not found
Oh, well. We just want to create a file:
    / # echo "it works!" > yeah
    / # cat yeah
    it works!
I exited from the interactive shell, and I can see the container named “vibrant_spence” with docker ps --all. The --all flag is important because the container is not running anymore.
    > docker ps --all
    CONTAINER ID  IMAGE   COMMAND    CREATED        STATUS  NAMES
    c8faeb05de5f  alpine  "/bin/sh"  6 minutes ago  Exited  vibrant_spence
Here, I create a new image from the “vibrant_spence” container. I added the commit message “mine, mine, mine” for good measure.
    > docker commit -m "mine, mine, mine" vibrant_spence yeah-alpine
    sha256:e3c98cd21f4d85a1428...e220da99995fd8bf6b49aa
Let’s check it out. Yep, there is a new image, and in its history you can see a new layer with the “mine, mine, mine” comment.
    > docker images
    REPOSITORY       TAG     IMAGE ID      SIZE
    yeah-alpine      latest  e3c98cd21f4d  4.8 MB
    python           latest  775dae9b960e  687 MB
    d4w/nsenter      latest  9e4f13a0901e  83.8 kB
    ubuntu-with-ssh  latest  87391dca396d  221 MB
    ubuntu           latest  bd3d4369aebc  127 MB
    hello-world      latest  c54a2cc56cbb  1.85 kB
    alpine           latest  4e38e38c8ce0  4.8 MB
    nsqio/nsq        latest  2a82c70fe5e3  70.7 MB

    > docker history yeah-alpine
    IMAGE         CREATED         SIZE    COMMENT
    e3c98cd21f4d  40 seconds ago  66 B    mine, mine, mine
    4e38e38c8ce0  7 months ago    4.8 MB
Now for the real test. Let’s delete the container and create a new container from the image. The expected result is that the “yeah” file will be present in the new container.
    > docker rm vibrant_spence
    vibrant_spence
    > docker run -it yeah-alpine /bin/sh
    / # cat yeah
    it works!
    / #
What can I say? Yeah, it works!
Using a Dockerfile
Creating images out of modified containers is cool, but there is no accountability. It’s hard to keep track of the changes and know what the specific modifications were. The disciplined way to create images is to build them using a Dockerfile.
The Dockerfile is a text file that resembles a shell script but uses its own small set of commands. Every command that modifies the file system creates a new layer. In part one we discussed the importance of dividing your image into layers properly. The Dockerfile is a big topic in and of itself.
Here, I’ll just demonstrate a couple of commands to create another image, “oh-yeah-alpine”, based on a Dockerfile. In addition to creating the infamous “yeah” file, let’s also install vim. The alpine Linux distribution uses a package management system called “apk”. Here is the Dockerfile:
    FROM alpine

    # Copy the "yeah" file from the host
    COPY yeah /yeah

    # Update and install vim using apk
    RUN apk update && apk add vim

    CMD cat /yeah
The base image is alpine. It copies the “yeah” file from the same host directory where the Dockerfile is (the build context path). Then, it runs apk update and installs vim. Finally, it sets the command that is executed when the container runs. In this case, it prints the content of the “yeah” file to the screen.
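To make the layering idea more concrete, here is a toy model in Python. It is only a sketch for intuition and not how Docker's storage drivers actually work: each layer is a dictionary of paths, and the image's file system is what you get by applying the layers in order. The names flatten, base, copy_layer and run_layer, and the placeholder file contents, are invented for this illustration.

```python
# Toy model of image layers. This is an illustration only; Docker's real
# storage drivers are far more involved.
from typing import Dict, List, Optional

# A layer maps a path to its content; None marks a path removed by that layer.
Layer = Dict[str, Optional[str]]


def flatten(layers: List[Layer]) -> Dict[str, str]:
    """Apply layers from bottom to top and return the resulting file system view."""
    view: Dict[str, str] = {}
    for layer in layers:
        for path, content in layer.items():
            if content is None:
                view.pop(path, None)   # a later layer hides the file
            else:
                view[path] = content   # a later layer adds or overrides the file
    return view


# Hypothetical stand-ins for the layers produced by the Dockerfile above.
base = {"/bin/sh": "<busybox>"}               # FROM alpine
copy_layer = {"/yeah": "it works!\n"}         # COPY yeah /yeah
run_layer = {"/usr/bin/vim": "<vim binary>"}  # RUN apk update && apk add vim
# CMD only changes image metadata, so it is not modeled as a file system layer.

image = flatten([base, copy_layer, run_layer])
print(sorted(image))
```

In this model, two images built on the same base simply share the first entries of their layer lists, which is the reuse that makes layered images cheap to store.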
OK. Now that we know what we’re getting into, let’s build this thing. The “-t” option sets the repository. I didn’t specify a tag, so it will be the default “latest”.
    > docker build -t oh-yeah-alpine .
    Sending build context to Docker daemon 3.072 kB
    Step 1/4 : FROM alpine
     ---> 4e38e38c8ce0
    Step 2/4 : COPY yeah /yeah
     ---> 1b2a228cc2a5
    Removing intermediate container a6221f725845
    Step 3/4 : RUN apk update && apk add vim
     ---> Running in e2c0524bd792
    fetch http://dl-cdn.alpinelinux.org/.../APKINDEX.tar.gz
    fetch http://dl-cdn.alpinelinux.org.../x86_64/APKINDEX.tar.gz
    v3.4.6-60-gc61f5bf [http://dl-cdn.alpinelinux.org/alpine/v3.4/main]
    v3.4.6-33-g38ef2d2 [http://dl-cdn.alpinelinux.org/.../v3.4/community]
    OK: 5977 distinct packages available
    (1/5) Installing lua5.2-libs (5.2.4-r2)
    (2/5) Installing ncurses-terminfo-base (6.0-r7)
    (3/5) Installing ncurses-terminfo (6.0-r7)
    (4/5) Installing ncurses-libs (6.0-r7)
    (5/5) Installing vim (7.4.1831-r2)
    Executing busybox-1.24.2-r9.trigger
    OK: 37 MiB in 16 packages
     ---> 7fa4cba6d14f
    Removing intermediate container e2c0524bd792
    Step 4/4 : CMD cat /yeah
     ---> Running in 351b4f1c1eb1
     ---> e124405f28f4
    Removing intermediate container 351b4f1c1eb1
    Successfully built e124405f28f4
Looks good. Let’s verify the image was created:
    > docker images | grep oh-yeah
    oh-yeah-alpine  latest  e124405f28f4  About a minute ago  30.5 MB
Note how installing vim and its dependencies bloated the size of the image from the 4.8MB of the base alpine image to a massive 30.5MB!
It’s all very nice. But does it work?
    > docker run oh-yeah-alpine
    it works!
Oh yeah, it works!
In case you’re still suspicious, let’s go into the container and examine the “yeah” file with our freshly installed vim.
    > docker run -it oh-yeah-alpine /bin/sh
    / # vim yeah
    it works!
    ~
    ~
    .
    .
    .
    ~
    "yeah" 1L, 10C
The Build Context and the .dockerignore file
I didn’t tell you, but originally when I tried to build the oh-yeah-alpine image, it just hung for several minutes. The issue was that I just put the Dockerfile in my home directory. When Docker builds an image, it first packs the whole directory where the Dockerfile is (including sub-directories) and makes it available for COPY commands in the Dockerfile.
Docker is not trying to be smart and analyze your COPY commands. It just packs the whole thing. Note that the build context will not end up in your image, but an unnecessarily large build context will slow down your build command.
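Since the whole directory gets packed, it can help to check how large a prospective build context is before you run the build. Here is a minimal sketch in Python that sums the file sizes under a directory. It ignores any .dockerignore rules and the overhead of the archive itself, so treat the result as a rough estimate. The function name context_size_bytes is made up for this example.

```python
import os
import tempfile


def context_size_bytes(context_dir: str) -> int:
    """Roughly estimate a build context's size by summing regular file sizes."""
    total = 0
    for root, _dirs, files in os.walk(context_dir):
        for name in files:
            path = os.path.join(root, name)
            # Count regular files only; skip symbolic links.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


# Demo on a throwaway directory that holds only what the build needs.
with tempfile.TemporaryDirectory() as ctx:
    with open(os.path.join(ctx, "Dockerfile"), "wb") as f:
        f.write(b"FROM alpine\nCOPY yeah /yeah\n")
    with open(os.path.join(ctx, "yeah"), "wb") as f:
        f.write(b"it works!\n")
    demo_size = context_size_bytes(ctx)

print(demo_size, "bytes")
```

Pointing the same function at a home directory would typically report a far larger number, which is the kind of context that makes a build appear to hang.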
In this case, I simply copied the Dockerfile and the “yeah” into a sub-directory and ran the docker build command in that sub-directory. But sometimes you have a complicated directory tree from which you want to copy specific sub-directories and files and ignore others. Enter the .dockerignore file.
This file lets you control exactly what goes into the build context. My favorite trick is to first exclude everything and then start including the bits and pieces I need. For example, in this case I could create the following .dockerignore file and keep the Dockerfile and the “yeah” file in my home directory:
    # Exclude EVERYTHING first
    *

    # Now selectively include stuff
    !yeah
There is no need to include the “Dockerfile” itself or the “.dockerignore” file in the build context.
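The exclude-then-include trick works because the last matching line in the .dockerignore file decides a file's fate, and a leading “!” turns a line into an exception. The Python sketch below imitates that rule for top-level names only. It uses Python's fnmatch as a stand-in for Docker's real matcher, which follows Go's path-matching rules and also supports “**”, so this is an approximation. The function name is_ignored and the example file names other than “yeah” are hypothetical.

```python
from fnmatch import fnmatch
from typing import List


def is_ignored(path: str, patterns: List[str]) -> bool:
    """Return True if path would be excluded; the last matching pattern wins."""
    ignored = False
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue                   # skip blank lines and comments
        negate = pattern.startswith("!")
        if negate:
            pattern = pattern[1:]      # "!yeah" re-includes "yeah"
        if fnmatch(path, pattern):
            ignored = not negate
    return ignored


patterns = [
    "# Exclude EVERYTHING first",
    "*",
    "# Now selectively include stuff",
    "!yeah",
]

# Hypothetical top-level entries of a home directory.
home = ["yeah", "notes.txt", "Downloads"]
kept = [name for name in home if not is_ignored(name, patterns)]
print(kept)
```

Order matters here: if the “*” line came after “!yeah”, it would be the last match for every file and nothing would be kept.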
Copying vs. Mounting
Copying files into the image is sometimes what you need, but in other cases you may want your containers to be more dynamic and work with files on the host. This is where volumes and mounts come into play.
Mounting host directories is a different ball game. The data is owned by the host and not by the container. The data can be modified when the container is stopped. The same container can be started with different host directories mounted.
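The article does not show a mount command, so here is a hedged sketch of what one could look like. It assumes the standard -v host_path:container_path option of docker run and builds the command line in Python without executing it. The helper name mount_run_argv and the “data” directory are invented for the example.

```python
import os
from typing import List


def mount_run_argv(host_dir: str, container_dir: str, image: str,
                   command: List[str]) -> List[str]:
    """Build a `docker run` command line that mounts host_dir at container_dir."""
    # Use an absolute host path; a bare name would be treated as a named volume.
    host_path = os.path.abspath(host_dir)
    return ["docker", "run", "--rm", "-v", f"{host_path}:{container_dir}",
            image, *command]


# Hypothetical example: list a host "data" directory from inside an alpine container.
argv = mount_run_argv("data", "/data", "alpine", ["ls", "/data"])
print(" ".join(argv))

# To actually run it (requires a running Docker daemon):
# import subprocess
# subprocess.run(argv, check=True)
```

Starting the same image again with a different host_dir gives the container a different set of files without rebuilding anything, which is the flexibility that copying cannot offer.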
Tagging images is very important if you develop a microservices-based system and generate a lot of images that must sometimes be associated with each other. You can add as many tags as you want to an image.
You’ve already seen the default “latest” tag. Sometimes, it makes sense to add other tags, like “tested”, “release-1.4”, or the git commit that corresponds to the image.
You can tag an image during a build or later. Here’s how to add a tag to an existing image. Note that while it’s called a tag, you can also assign a new repository.
    > docker tag oh-yeah-alpine oh-yeah-alpine:cool-tag
    > docker tag oh-yeah-alpine oh-yeah-alpine-2
    > docker images | grep oh-yeah
    oh-yeah-alpine-2  latest    e124405f28f4  30.5 MB
    oh-yeah-alpine    cool-tag  e124405f28f4  30.5 MB
    oh-yeah-alpine    latest    e124405f28f4  30.5 MB
You can also untag by removing an image by its tag name. This is a little scary because if you remove the last tag by accident, you lose the image. But if you build images from a Dockerfile, you can just rebuild the image.
    > docker rmi oh-yeah-alpine-2
    Untagged: oh-yeah-alpine-2:latest
    > docker rmi oh-yeah-alpine:cool-tag
    Untagged: oh-yeah-alpine:cool-tag
If I try to remove the last remaining tagged image, I get an error because it is used by a container.
    > docker rmi oh-yeah-alpine
    Error response from daemon: conflict: unable to remove repository reference "oh-yeah-alpine" (must force) - container a1443a7ca9d2 is using its referenced image e124405f28f4
But if I remove the container…
    > docker rmi oh-yeah-alpine
    Untagged: oh-yeah-alpine:latest
    Deleted: sha256:e124405f28f48e...441d774d9413139e22386c4820df
    Deleted: sha256:7fa4cba6d14fdf...d8940e6c50d30a157483de06fc59
    Deleted: sha256:283d461dadfa6c...dbff864c6557af23bc5aff9d66de
    Deleted: sha256:1b2a228cc2a5b4...23c80a41a41da4ff92fcac95101e
    Deleted: sha256:fe5fe2290c63a0...8af394bb4bf15841661f71c71e9a
    > docker images | grep oh-yeah
Yep. It’s gone. But don’t worry. We can rebuild it:
    > docker build -t oh-yeah-alpine .
    > docker images | grep oh-yeah
    oh-yeah-alpine  latest  1e831ce8afe1  1 minutes ago  30.5 MB
Yay, it’s back. Dockerfile for the win!
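The tagging behavior in this section can be summarized with a small toy model in Python: tags are just names that point at an image ID, and the image itself disappears once its last name is removed. This is a simplification for intuition only. It ignores containers that still use the image, and the class ToyImageStore and helper normalize are hypothetical names, not part of any Docker API.

```python
from typing import Dict, Set


def normalize(name: str) -> str:
    """Append the default "latest" tag when a name has no explicit tag."""
    last_part = name.rsplit("/", 1)[-1]
    return name if ":" in last_part else name + ":latest"


class ToyImageStore:
    """Maps "repository:tag" names to image IDs (illustration only)."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def tag(self, source: str, target: str) -> None:
        # A new name points at the same image ID; nothing is copied.
        self.names[normalize(target)] = self.names[normalize(source)]

    def untag(self, name: str) -> bool:
        """Remove a name; return True if the image just lost its last name."""
        image_id = self.names.pop(normalize(name))
        return image_id not in self.names.values()

    def image_ids(self) -> Set[str]:
        return set(self.names.values())


store = ToyImageStore()
store.names["oh-yeah-alpine:latest"] = "e124405f28f4"
store.tag("oh-yeah-alpine", "oh-yeah-alpine:cool-tag")
store.tag("oh-yeah-alpine", "oh-yeah-alpine-2")
print(sorted(store.names))
```

Calling store.untag on the two extra names leaves the image in place, while untagging the final name returns True, mirroring the switch from “Untagged” only to “Untagged” plus “Deleted” in the output above.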
Working With Image Registries
Images are very similar in some respects to git repositories. They are also built from an ordered set of commits. You can think of two images that use the same base images as branches (although there is no merging or rebasing in Docker). An image registry is the equivalent of a central git hosting service like GitHub. Guess what’s the name of the official Docker image registry? That’s right, Docker Hub.
When you run an image, if it doesn’t exist locally, Docker will try to pull it from one of your configured image registries. By default it goes to Docker Hub, but you can control it in your “~/.docker/config.json” file. If you use a different registry, you can follow its instructions, which typically involve logging in with your credentials for that registry.
Let’s delete the “hello-world” image and pull it again using the docker pull command.
    > docker images | grep hello-world
    hello-world  latest  c54a2cc56cbb  7 months ago  1.85 kB
    > docker rmi hello-world
    hello-world
It’s gone. Let’s pull now.
    > docker pull hello-world
    Using default tag: latest
    latest: Pulling from library/hello-world
    78445dd45222: Pull complete
    Digest: sha256:c5515758d4c5e1e...07e6f927b07d05f6d12a1ac8d7
    Status: Downloaded newer image for hello-world:latest
    > docker images | grep hello-world
    hello-world  latest  48b5124b2768  2 weeks ago  1.84 kB
The latest hello-world was replaced with a newer version.
Pushing images is a little more involved. First you need to create an account on Docker Hub (or other registry). Next, you log in. Then you need to tag the image you want to push according to your account name (“g1g1” in my case).
    > docker login -u g1g1 -p <password>
    Login Succeeded
    > docker tag hello-world g1g1/hello-world
    > docker images | grep hello
    g1g1/hello-world  latest  48b5124b2768  2 weeks ago  1.84 kB
    hello-world       latest  48b5124b2768  2 weeks ago  1.84 kB
Now, I can push the g1g1/hello-world tagged image.
    > docker push g1g1/hello-world
    The push refers to a repository [docker.io/g1g1/hello-world]
    98c944e98de8: Mounted from library/hello-world
    latest: digest: sha256:c5515758d4c5e...f6d12a1ac8d7 size: 524
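The output above shows short names being expanded: “hello-world” is pulled from “library/hello-world”, and “g1g1/hello-world” is pushed to “docker.io/g1g1/hello-world”. The Python sketch below approximates that expansion. It assumes the usual convention that the first path component is a registry only if it contains a dot or a colon or is “localhost”, and it does not handle digests. The names ImageRef and parse_ref are invented for this example, and “localhost:5000/app:1.0” is a hypothetical private registry reference.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_ref(name: str) -> ImageRef:
    """Expand a short image name into registry, repository and tag (approximation)."""
    # A colon after the last slash separates the tag from the repository.
    head, slash, last = name.rpartition("/")
    repo_part, colon, tag = last.partition(":")
    tag = tag if colon else "latest"
    path = head + slash + repo_part

    # Treat the first component as a registry host only if it looks like one.
    first, sep, rest = path.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = "docker.io", path

    # Official Docker Hub images live under the "library" namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return ImageRef(registry, repository, tag)


for example in ["hello-world", "g1g1/hello-world", "localhost:5000/app:1.0"]:
    print(example, "->", parse_ref(example))
```

This is why tagging the image with the account name first is required before pushing: the repository part of the name tells Docker which account's repository the push is meant for.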
Docker images are the templates for your containers. They are designed to be efficient and offer maximum reuse by using a layering file system storage driver.
Docker provides a lot of tools for listing, inspecting, building and tagging images. You can pull and push images to image registries like Docker Hub to easily manage and share your images.