Six-Stage Docker Builds

May 15, 2024

Overview

As a Docker-based project matures, it inevitably ends up with three distinct images:

release: the image that is tested by QA and eventually released to production. To keep it lean and hardened, this image should install only the requirements it needs to run.
test: the image used by the CI build to run unit tests. This image should match production as closely as possible, but must still contain the code and dependencies needed to run all tests.
development: the image used in local development. This is especially important when development is done inside Docker (e.g. VS Code Remote Containers, or PyCharm's Docker interpreter). This container requires all source code, along with special tools for everyday development.

Classical Approach

For the sake of simplicity, let's assume the project uses a scripting language (e.g. Python, Ruby). Similar approaches can be taken for compiled languages, but the technique is different since the release image contains no source code.

The standard design for these three images is to create three Docker stages. The release image installs minimal requirements, the test stage installs more dependencies on top of release, and development installs more dependencies on top of release.

This design fails to account for how source code and test code must carry over from one stage to the next. To include code in this design, you have two less-than-ideal options:

Add the source code in the release stage, and the test code in the test stage. This design is flawed because any source code change triggers a full rebuild of the test and development stages, which can take a long time to complete.
Install the relevant dependencies and source code separately for every stage. This leads to a lot of duplicated code and to significant inconsistencies between production and the CI build.
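
To make the first option concrete, here is a minimal sketch of a classic three-stage Dockerfile. It assumes a Python project with a hypothetical layout (requirements.txt, requirements-test.txt, requirements-dev.txt, src/ and tests/); none of these names come from the original project. Because the source code is copied in the release stage, every stage built on top of release loses its cache whenever a source file changes.

```dockerfile
# Classic three-stage layout (option 1). File and directory names are assumptions.
FROM python:3.12-slim AS release
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Source code lives in the base stage, so any edit here invalidates
# the cache of every stage that starts FROM release.
COPY src/ src/

FROM release AS test
COPY requirements-test.txt .
# Re-runs after every source change, because the parent stage changed.
RUN pip install --no-cache-dir -r requirements-test.txt
COPY tests/ tests/

FROM release AS development
COPY requirements-dev.txt .
# Same problem: development tools are re-installed after every source change.
RUN pip install --no-cache-dir -r requirements-dev.txt
```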

Battle-Tested Approach

First, let's frame the issue. The ideal solution to this problem should:

Be easy to rebuild when source code is changed during development.
Provide as much consistency as possible between release, test and development, so that production issues are detected early.
Be DRY where possible.

To accomplish these goals, three Docker stages are not enough. Instead, it is very useful to separate your dependencies and source code into two sets of stages, for six stages in total:

release-dependencies: installs the dependencies that every environment requires.
test-dependencies: builds upon release-dependencies and installs more tools for CI and local development.
development-dependencies: builds upon test-dependencies and installs tools for local development only.
release: builds upon release-dependencies and adds source code required for all environments.
test: builds upon test-dependencies, copies code from release, and adds test code.
development: builds upon development-dependencies, copies code from test, and adds development code (if needed).
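
The six stages above can be sketched as a single Dockerfile. This is only an illustration under stated assumptions: a Python project, the python:3.12-slim base image, and a hypothetical layout with requirements.txt, requirements-test.txt, requirements-dev.txt, src/ and tests/. The linked repository may be organized differently.

```dockerfile
# --- Dependency stages: rebuilt only when a requirements file changes ---
# All file and directory names below are assumptions for illustration.
FROM python:3.12-slim AS release-dependencies
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM release-dependencies AS test-dependencies
COPY requirements-test.txt .
RUN pip install --no-cache-dir -r requirements-test.txt

FROM test-dependencies AS development-dependencies
COPY requirements-dev.txt .
RUN pip install --no-cache-dir -r requirements-dev.txt

# --- Code stages: cheap COPY steps on top of cached dependencies ---
FROM release-dependencies AS release
COPY src/ src/

FROM test-dependencies AS test
# Reuse exactly the source tree that ships in the release image.
COPY --from=release /app/src/ src/
COPY tests/ tests/

FROM development-dependencies AS development
# Pull source and test code from the test stage; add development-only files here if needed.
COPY --from=test /app/src/ src/
COPY --from=test /app/tests/ tests/
```

Each image can then be built by selecting its stage with the --target flag, for example `docker build --target test -t myapp:test .` (the myapp tag is a placeholder). A source edit only re-runs the COPY steps in the three code stages, while the pip install layers stay cached.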

Though slightly more complex, this six-stage design has distinct advantages over the classic three-stage approach:

When source code is changed during development, it does not affect any of the -dependencies stages. This means that dependencies do not need to be re-installed, leading to fast rebuild times.
When source code is added to the release image, it is automatically added to test and development.
It is very easy to add a dependency to all stages without any code duplication.

Further Learning

This six-stage Docker build approach has been demonstrated in this GitHub repo. In the future I might make a similar example with a compiled language, as the technique for six-stage builds is slightly different. In any case, I encourage you to try this six-stage build approach in your own Docker build!
