Docker image tagging scheme

In this short article I'll cover two things: why you should be careful when using the latest tag in Docker, and how to come up with your own tagging scheme for your Docker images.

Docker latest tag

Docker allows you to omit the tag when building, pulling, or running images. An image reference without a tag gets the default tag 'latest'. For example:

$ docker pull alpine
Using default tag: latest
...

Notice how Docker automatically falls back to the default tag. It is definitely easier to just use the image name, without looking up the tags. However, using latest can get you in trouble.

The biggest misconception about the latest tag is that it means 'the latest image that was pushed'. It doesn't. It is just the tag that gets applied to an image if no other tag is provided.

Let's say you have the following images in your Docker registry:

myimage:0.1.0
myimage:0.2.0
myimage:latest

If we go off the versions in tags, the image tagged with 0.2.0 is the image with the highest version number. However, a common mistake is to think that myimage:latest is the same image as 0.2.0 because it's named latest. That's not true. Someone could have re-tagged the 0.2.0 image with latest and pushed it, but you can't assume that was done. Pushing 0.2.0 does not move latest: unless latest is tagged and pushed explicitly, it keeps pointing at whatever image it was last attached to.
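To make this concrete, here is a minimal sketch of how you could check what latest currently points to, and what it takes to move it. The helper names same_image and retag_as_latest are made up for this example. Note that docker image inspect only sees images that are present locally, so pull both tags first.

```shell
# Hypothetical helpers, not part of Docker itself.

# Succeeds if two image references resolve to the same local image ID.
same_image() {
  local first second
  first=$(docker image inspect --format '{{.Id}}' "$1") || return 2
  second=$(docker image inspect --format '{{.Id}}' "$2") || return 2
  [ "$first" = "$second" ]
}

# Explicitly point latest at a given version and push it.
# Nothing moves latest unless someone runs something like this.
retag_as_latest() {
  docker tag "$1:$2" "$1:latest" && docker push "$1:latest"
}
```

For example, same_image myimage:0.2.0 myimage:latest tells you whether latest really is 0.2.0 on your machine, and retag_as_latest myimage 0.2.0 is the explicit step that would make it so.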

The other issue with latest is that it doesn't tell you much. Since it's not necessarily the latest image pushed, you have no clue what it actually contains. Imagine using the latest tag in a team of people: it's a mess, as no one actually knows what it is. Note that you can make it the latest image, but then whenever an image with a specific version is pushed, you have to make sure that same image is tagged and pushed as latest as well. However, I would caution against that. Stick with specific version numbers or Git commits, as then you'll at least know which source version was used to build the image.

A related problem is environment names in tags. I have sometimes seen images tagged with environment names, for example myimage:dev or myimage:prod. This is an anti-pattern. One of the benefits of Docker images should be reusability: I can take the same image, apply configuration and environment settings to it (e.g. prod, dev, test configuration) and run it. Tagging images with environment names goes against that.
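As a sketch of what that reuse can look like, the same image tag is started with a different environment file per environment. The config/<env>.env paths and the run_in_env helper are assumptions for illustration; --env-file is a regular docker run option.

```shell
# Hypothetical helper: run one unchanged image with per-environment settings.
# Assumes files such as config/dev.env and config/prod.env exist.
run_in_env() {
  local env_name="$1" image="$2"
  docker run --rm --env-file "config/${env_name}.env" "$image"
}

# run_in_env dev  myimage:0.2.0
# run_in_env prod myimage:0.2.0   # same image, different configuration
```

The tag identifies what was built, while the environment file decides how it runs.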

Semantic versioning is a set of rules and requirements on how version numbers get assigned and incremented. It is good practice to follow it. If you have your own versioning system already in place, feel free to follow that.
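If you want to enforce this in a build script, a small check along these lines could reject tags that don't look like MAJOR.MINOR.PATCH. This is a simplified sketch: the is_semver_tag name is made up, and it deliberately ignores the pre-release and build-metadata parts of semantic versioning.

```shell
# Succeeds if the argument looks like MAJOR.MINOR.PATCH with no leading zeros.
# Simplified: pre-release and build-metadata suffixes are not handled.
is_semver_tag() {
  printf '%s\n' "$1" |
    grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}

# is_semver_tag 0.2.0  && echo "ok"
# is_semver_tag latest || echo "not a version"
```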

In addition to the version tag there are a couple of other options you can consider:

Date/time stamps

One of the pros of using a date/time stamp is that you know exactly when the image was built. The downside is that you don't really know which source was used to build the image. If you decide on using timestamps, make sure you settle on a single time zone and understand how to correlate the image back to your source code.
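Here is one possible way to handle both concerns, as a sketch rather than a recommendation: generate the timestamp in UTC, and record the Git commit in an image label so the image can still be traced back to its source. The helper names are made up, and the label key org.opencontainers.image.revision is my choice for this example. The timestamp uses a compact format because ':' is not allowed in Docker tags.

```shell
# UTC timestamp in a tag-safe form, e.g. 20240131T093000Z (no ':' characters).
timestamp_tag() {
  date -u +%Y%m%dT%H%M%SZ
}

# Hypothetical helper: build with a timestamp tag and store the current
# Git commit in an image label for traceability.
build_timestamped() {
  local image="$1" tag revision
  tag=$(timestamp_tag)
  revision=$(git rev-parse HEAD) || return 1
  docker build \
    --label "org.opencontainers.image.revision=${revision}" \
    -t "${image}:${tag}" .
}
```

Running build_timestamped myimage from the root of your repository would produce an image tagged with the build time that still carries the commit it was built from.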

Build numbers

I am assuming you're using a build system that already uses build numbers that get automatically incremented with each build. This seems like a better choice for tagging your images as you'll know the exact build that was used to create the image.
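A minimal sketch of this, assuming your build system exposes the number in an environment variable. BUILD_NUMBER is an assumed name here, so substitute whatever your system provides; the helper name is made up as well.

```shell
# Print the build number to use as a tag, taken from the first argument or
# from the (assumed) BUILD_NUMBER variable. Fails if it is missing or not numeric.
build_number_tag() {
  local build="${1:-${BUILD_NUMBER:-}}"
  case "$build" in
    ''|*[!0-9]*)
      echo "build number must be a non-empty integer" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$build"
}

# tag=$(build_number_tag) && docker build -t "myimage:${tag}" .
```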

Git commits/SHA

A Git commit gives you good traceability. However, you can get into a situation where multiple builds are kicked off against the same Git commit, and in that case the tag is not unique anymore.
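One way around the uniqueness problem, shown here only as a sketch, is to combine the short commit SHA with a build number. The commit_tag helper and the idea of passing the build number as an argument are assumptions for this example.

```shell
# Short commit SHA of the current checkout, optionally suffixed with a build
# number so that two builds of the same commit still get distinct tags.
commit_tag() {
  local sha
  sha=$(git rev-parse --short HEAD) || return 1
  if [ -n "${1:-}" ]; then
    printf '%s-%s\n' "$sha" "$1"
  else
    printf '%s\n' "$sha"
  fi
}

# tag=$(commit_tag 57) && docker build -t "myimage:${tag}" .
```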

If you're building base images or images other people will depend on, you will need to come up with a tagging scheme. You will need to think about how you are going to ship updates to your base images without breaking the image consumers.

For example, let's say you ship version 1.0. You'd tag and push the following images:

Release 1.0
myimage:1
myimage:1.0

If you release a minor update to the image (1.1), you'd build the new image, tag it with 1.1, and move the major version tag to it as well, meaning that 1 now points to the same image as 1.1. You'd push the following tags:

Release 1.0	Release 1.1
myimage:1	myimage:1
myimage:1.0	myimage:1.1

Then, if you release 2.0, you add the tags 2 and 2.0, while the tag 1 keeps representing the latest of the 1.x versions.

Release 1.0	Release 1.1	Release 2.0
myimage:1	myimage:1	myimage:2
myimage:1.0	myimage:1.1	myimage:2.0

Following this tagging scheme allows your users to control the images they are using. If they always want to be on the latest 1.x image, they can pull the myimage:1 image. If they need to lock down to a specific version, they can use e.g. myimage:1.1.
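On the publishing side, the scheme above could be scripted roughly like this. It is a sketch under a few assumptions: versions have the form MAJOR.MINOR as in the tables, the Dockerfile is in the current directory, and publish_release is a made-up helper name.

```shell
# Hypothetical helper: build one image for a MAJOR.MINOR release and push it
# under both the exact tag (e.g. 1.1) and the rolling major tag (e.g. 1).
publish_release() {
  local image="$1" version="$2" major
  printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+$' || {
    echo "expected MAJOR.MINOR, got '${version}'" >&2
    return 1
  }
  major="${version%%.*}"   # 1.1 -> 1
  docker build -t "${image}:${version}" -t "${image}:${major}" . || return 1
  docker push "${image}:${version}" && docker push "${image}:${major}"
}

# publish_release myimage 1.1
```

Because both tags are applied to the same build, myimage:1 and myimage:1.1 are guaranteed to be the same image right after the release, and myimage:1 simply moves forward with the next 1.x release.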