
Images: Tagging vs Digests

Issue

When running docker stack deploy, the image pulled on each node is listed without a tag (shown as <none>). This happens whether or not the stack file specifies a tag.

Running docker pull <image> successfully updates the locally cached image with the expected tag.

Short Answer

This is the expected behavior; services resolve the tag you specified to a digest, and pull that image by digest (if it doesn't exist locally). Doing so guarantees that every instance of the service (on any node) runs exactly the same version of the image.

However, there is not a 1:1 relation of digests to tags, so when pulling an image by digest, only the digest is known. If you happen to have an image pulled (manually) with a tag that matches that digest, the tag is shown, but not otherwise.

If you issue docker service inspect for the service, you can see the digest of the image that's used; running docker images --digests should show the same digest for one of the images with tag <none>.
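As a rough illustration of that comparison, the sketch below extracts the digest from a pinned image reference. The helper name image_digest is hypothetical (it is not a Docker command), and the sketch assumes the service spec records the image in the form repo:tag@sha256:…, which is what a service deployed with the default settings normally contains.

```shell
# Hypothetical helper (not part of the Docker CLI): print the digest part of
# an image reference such as repo:tag@sha256:<hex>. Returns non-zero and
# prints nothing when the reference carries no digest.
image_digest() {
  case "$1" in
    *@*) printf '%s\n' "${1##*@}" ;;  # keep everything after the last "@"
    *)   return 1 ;;
  esac
}

# Assumed usage on a swarm manager (mystack_app is an example service name):
#   ref=$(docker service inspect \
#           --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}' mystack_app)
#   docker images --digests | grep "$(image_digest "$ref")"
```

If the grep prints a row, the untagged image on that node is the one the service is pinned to.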

Long Answer

After several runs of docker stack deploy, you may notice a growing number of images tagged <none> on each node. This is to be expected.

The Docker daemon does not automatically garbage collect unused images. You can remove unused images using the docker image prune (and the related docker system prune) commands. The docker image prune command by default only removes images that are untagged, but using the --all option allows you to remove all unused images.

Caution: Running a cron job on each node to periodically garbage collect can be an option for crafting some degree of automation. Please be advised that this should be implemented only after careful consideration, as the default behavior carries the risk of unintentional deletion of images critical to the operation of UCP and DTR.
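One cautious way to approach such automation is sketched below. The wrapper name prune_dangling and the 168h (7-day) threshold are assumptions chosen for illustration. By default the wrapper only lists untagged images, and it deletes nothing unless called with --apply. Review the listing on each node before enabling deletion, especially on nodes that run UCP or DTR.

```shell
# Hypothetical wrapper: list untagged images by default, and only delete
# when called with "--apply". The 168h age threshold is an example value.
prune_dangling() {
  if [ "${1:-}" = "--apply" ]; then
    # Removes unused, untagged images created more than 168 hours ago.
    docker image prune --force --filter "until=168h"
  else
    # Listing only; nothing is removed. This list is not age-filtered, so it
    # can show more images than the prune above would actually delete.
    docker image ls --filter "dangling=true"
  fi
}
```

A cron entry would then call prune_dangling --apply, ideally only after the listing mode has been checked by hand on that node.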

If you pull the latest tag each time with docker pull, the tag always moves to the newly pulled image.

The same happens with any other tag: pulling someimage:sometag removes the tag from the local image that currently holds it and applies it to the version that was just pulled.

The old image is still there, though. It no longer has a tag, so it may not be obvious in the image list.

docker pull Does Not Remove the Previous Image

Below are some steps to illustrate this.

Spin up a local registry, just for the example (analogous to Docker EE's DTR):

$ docker run -d -p 5000:5000 registry:2 

Build and push an image:

$ docker image build -t localhost:5000/myimage:latest -<<EOF
FROM busybox
RUN echo this is the latest image in the registry > /version.txt
EOF
$ docker image push localhost:5000/myimage:latest

The push refers to repository [localhost:5000/myimage]
4007f96df2a4: Pushed 
0271b8eebde3: Pushed 
latest: digest: sha256:47bfdb88c3ae13e488167607973b7688f69d9e8c142c2045af343ec199649c09 size: 734

After pushing, remove the local image to start fresh:

$ docker image remove localhost:5000/myimage:latest

Untagged: localhost:5000/myimage:latest
Untagged: localhost:5000/myimage@sha256:47bfdb88c3ae13e488167607973b7688f69d9e8c142c2045af343ec199649c09
Deleted: sha256:d39d5dc009d238ac3e69156987e9d31d5ea33aa3e7cdc3e106d6b5cb757ca296
Deleted: sha256:c1b185e2edb28e97f8c987c86c8ca8c42598f5085f95a06bcf888fee4700c576

Build another image with the same tag, but don't push the image (this is to demonstrate what happens with an image that was pulled previously):

$ docker image build -t localhost:5000/myimage:latest -<<EOF
FROM busybox
RUN echo this the image that was pulled previously > /version.txt
EOF

Check the output of docker images:

REPOSITORY                                               TAG                 IMAGE ID            CREATED                  SIZE
localhost:5000/myimage                                   latest              ff6cb44de960        Less than a second ago   1.13MB

Now, pull the image built earlier:

$ docker image pull localhost:5000/myimage:latest

latest: Pulling from myimage
0ffadd58f2a6: Already exists 
c45452c1463d: Pull complete 
Digest: sha256:47bfdb88c3ae13e488167607973b7688f69d9e8c142c2045af343ec199649c09
Status: Downloaded newer image for localhost:5000/myimage:latest

After that, check the images that are locally present:

$ docker images -a
REPOSITORY                                               TAG                 IMAGE ID            CREATED              SIZE
<none>                                                   <none>              ff6cb44de960        About a minute ago   1.13MB
localhost:5000/myimage                                   latest              d39d5dc009d2        2 minutes ago        1.13MB

The previous image is still there, but is now "untagged":

$ docker run --rm ff6cb44de960 sh -c 'cat /version.txt'
this the image that was pulled previously

$ docker run --rm localhost:5000/myimage sh -c 'cat /version.txt'
this is the latest image in the registry

Deploy Stacks with --resolve-image=never

While this is an option for docker stack deploy, it is not a recommended one.

Doing so reverts to the old behavior, where each node simply pulls the image by tag. This used to cause quite a few issues and was intended as a stopgap solution at the time (until pinning by digest was implemented). This section illustrates some of the problems with this approach.

Assume this is what the docker-compose.yml looks like:

version: '3'
services:
  app:
    image: 'localhost:5000/myimage:latest'
    tty: true

Deploy the stack using the --resolve-image=never option:

$ docker stack deploy -c docker-compose.yml --resolve-image=never mystack

Creating network mystack_default
Creating service mystack_app

Inspect the service's definition, and check what image is used:

$ docker service inspect --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}' mystack_app
localhost:5000/myimage:latest

Also check the container (task) that was deployed:

$ docker container inspect --format '{{.Image}}' mystack_app.1.9zyy8dopgwl1djl2uasfc7ebx
sha256:d39d5dc009d238ac3e69156987e9d31d5ea33aa3e7cdc3e106d6b5cb757ca296

While all this was going on, someone pushed a new version of the image to the registry.

To mimic this change in situation, build an image, push it, then remove the local image:

$ docker image build -t localhost:5000/myimage:latest -<<EOF
FROM busybox
RUN echo this is the really-really latestest version > /version.txt
EOF
$ docker image push localhost:5000/myimage:latest

The push refers to repository [localhost:5000/myimage]
a6bcce9fc2e9: Pushed 
0271b8eebde3: Layer already exists 
latest: digest: sha256:54f12e6b6ce1a2504b2ab5a8c4caae81ee5701605fe5ce25a8e1a92f3af375e8 size: 734

$ docker image rm localhost:5000/myimage:latest

Untagged: localhost:5000/myimage:latest
Untagged: localhost:5000/myimage@sha256:54f12e6b6ce1a2504b2ab5a8c4caae81ee5701605fe5ce25a8e1a92f3af375e8
Deleted: sha256:148612cb66d9966f5970e4b4f286d9e536cfb70c8704cdfe5dce5ebe5b6345b5
Deleted: sha256:7b241042d832d0265ac4e8ed64b4726f350151ff21a109b5c985120b7294608b

(The image is removed locally to force Docker to pull the "latest" image from the registry in the next step.)

Now, scale the service to add another replica:

$ docker service scale mystack_app=2

Scaling a service is just one reason for Docker to deploy a new task. There are many reasons why new tasks can be deployed, such as:

  • a task failed, and docker starts a new task to replace it
  • a node failed, and docker starts a new task on a different node to replace it
  • a constraint is no longer met (e.g. --constraint=node.labels.foobar==baz, and the node label was updated)
  • the service is scaled
  • … etc …

After scaling, inspect both instances / tasks of the service:

$ docker container ls -q --filter label=com.docker.swarm.service.name=mystack_app | xargs docker container inspect --format '{{.Image}}'

sha256:148612cb66d9966f5970e4b4f286d9e536cfb70c8704cdfe5dce5ebe5b6345b5
sha256:d39d5dc009d238ac3e69156987e9d31d5ea33aa3e7cdc3e106d6b5cb757ca296

The two instances are now running different versions:

$ docker container ls -q --filter label=com.docker.swarm.service.name=mystack_app | xargs -I % docker exec % sh -c 'cat /version.txt'
this is the really-really latestest version
this is the latest image in the registry

This is a problem, because different instances of the same service now run different versions of the application. This can lead to hard-to-find issues, such as:

  • Depending on which node (or instance) visitors end up on, they may be served different content
  • A security update was made to the service, but some instances still run an old version
  • A bug was fixed, but for "some reason", some nodes still expose the bug
  • The latest update contains a bug, but it goes unnoticed, because most instances are still running the previous version
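Drift of this kind can be detected with a small check like the sketch below. The helper name check_same_image is hypothetical. It reads one image ID per line and reports whether they are all identical. It only sees whatever IDs are piped into it, so on a multi-node swarm the IDs would have to be collected from every node first.

```shell
# Hypothetical helper: read one image ID per line on stdin and report
# whether all of them are identical. Empty input counts as consistent.
check_same_image() {
  # Count distinct non-empty lines; "|| true" keeps the assignment from
  # failing when grep finds no lines at all.
  count=$(sort -u | grep -c . || true)
  if [ "$count" -le 1 ]; then
    echo "consistent"
  else
    echo "drift: $count different images"
    return 1
  fi
}

# Assumed usage on a single node, reusing the filter from the example above:
#   docker container ls -q --filter label=com.docker.swarm.service.name=mystack_app \
#     | xargs docker container inspect --format '{{.Image}}' | check_same_image
```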

Best Practices

What are the best practices in this situation? This is not an easy question to answer, because it really depends on your use case and expectations.

If this is a production situation, and security and stability are important, then just "convenience" is likely not the best deciding factor (any more than leaving your house unlocked all the time might be "convenient").

But as mentioned above, it really depends on the situation:

  • You could consider each push to a registry a release - in some form (after all, you're publishing a new version of your code, and making it accessible to others).
  • :latest is comparable to the master branch in a Git repository. Is each push to master considered ready to go into production?
  • Releases will (usually) go through a verification process (CI/QA/acceptance/etc). Should changes in master be verified first, and only after verification be (tagged, and) deployed to production?
  • Releases carry a version; this can either be an explicit version (a tag), or implicit (immutable tag: the image's digest).

Tagging

Tagging is a best practice because it:

  • Provides human-readable versioning of your images
  • Makes it easier to run a specific version of an image
  • Can support the release process (a release is tagged once it is verified/approved for production)

Pinning-by-Digest

Pinning-by-digest is a best practice because:

  • Tags are mutable, so there is no guarantee a tag will never change.
  • It guarantees that every instance of the service is running exactly the same code.
  • It allows you to put an image through QA/testing and verify that that version of the image is approved to go into production.
  • You can use Docker Content Trust and sign specific versions of the image.
  • You can roll back to an earlier version of the image, even if that version was not tagged (or "no longer tagged").
  • Digests also prevent race conditions. Without them, if a new image is pushed while a deploy is in progress, nodes that pull at different times can end up with different images: some have the new one, some the old one.

Manually specifying a digest is a bit involved. That's why services automatically resolve tags to digests to make this easy to use.
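For cases where a digest does need to be written by hand, the sketch below shows roughly what is involved. The helper name pin_reference is hypothetical. It takes an image reference and a digest (for example, the one printed by docker image push) and produces a reference pinned to that digest, dropping the mutable tag.

```shell
# Hypothetical helper: turn "repo[:tag]" plus a digest into "repo@digest".
# A colon in the registry host (e.g. localhost:5000) is left untouched,
# because only the last path component is checked for a tag.
pin_reference() {
  ref=${1%%@*}             # drop any digest already present
  last=${ref##*/}          # last path component, e.g. "myimage:latest"
  case "$last" in
    *:*) ref=${ref%:*} ;;  # strip the tag
  esac
  printf '%s@%s\n' "$ref" "$2"
}

# Example, using the digest from the first push above:
#   pin_reference localhost:5000/myimage:latest \
#     sha256:47bfdb88c3ae13e488167607973b7688f69d9e8c142c2045af343ec199649c09
```

The resulting string can be used anywhere an image reference is accepted, including the image: field of the compose file.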
