Enable Dockerfiles to build, and tag, multiple images #5726
Labels
area/builder
area/distribution
kind/feature
Problem
Dockerfiles are almost ideal for building code from source into binaries in a deterministic containerized environment. (In this sense, they serve as a more flexible version of a buildpack.)
However, the result of such a build is currently always a Docker image which descends linearly from the source image, and which thus contains the toolchain used to compile it. In a few cases, this is acceptable: when there is no difference between the toolchain required to operate on the source and the runtime required to execute the binaries, the toolchain image can "specialize" into the runtime image without any heavy removals.
But this case is rare. More often, your runtime will be a tiny VM or library, but the toolchain will depend on the ability to compile dependencies written in other languages (e.g. C), which themselves require -devel versions of libraries, and so forth. None of this is necessary at runtime, but for compilation to succeed it must be part of the toolchain, and thus the toolchain must be a multi-gigabyte image that would be ridiculous to pull down to your production cloud instances and the like.
People currently sidestep this problem in a number of ways:
`docker squash` command--to allow the runtime image to actually end up smaller than the toolchain image, even though it descends from the toolchain image.)

None of these workarounds obey the spirit of Dockerfile builds: deterministically turning a source image, plus a context, into a destination image.
Proposed solution
I propose an alternative, which would look something like the following:
This presumes one additional Dockerfile stanza, and one change-in-behavior of a current Dockerfile stanza:
- The additional stanza, `BINDCONTEXT`, would be as discussed in Proposal: Dockerfile add BIND_CONTEXT #3056, but specifically giving us read-write access to the ephemeral, uploaded context. The point of this is not optimization, but rather to give intermediate layers a "scratch volume" to work with, whose contents won't end up in the container, but which can be acted on by, and referred to from, other commands.
- The change in behavior of the `FROM` stanza, permitting it to appear more than once in a Dockerfile, would be as follows: when a `FROM` statement is encountered, the layer pointer which the next-created layer will parent upon is reset from the last-created layer to the newly-specified image. This is a generalization of the previous behavior of `FROM`; all current `FROM` stanzas could be considered to be resetting the layer pointer from a null layer. Importantly, `FROM` unmounts any previously-specified `BINDCONTEXT` mount, but the contents of the context persist from their previous state, and will be in that state if they are mounted again.

Together, these two alterations allow you to have a Dockerfile which creates multiple images, keeping state from the creation of one to the next. If you ran this Dockerfile using `docker build -t foo .`, it would be the final image--the terminal position of the layer pointer--that would end up being tagged as "foo." The other would remain a stack of untagged layers, which could be reused in builds, or flushed away at need.

Going further
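As a rough model of the multiple-`FROM` and `BINDCONTEXT` behavior proposed so far, here is a minimal sketch in Python. It is not Docker's builder: the `Build` class, its method names, and the image names `toolchain` and `runtime` are all hypothetical, chosen only to illustrate the layer pointer being reset while the context persists.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Build:
    """Toy model of the proposed builder state (all names are illustrative)."""
    context: Dict[str, str] = field(default_factory=dict)  # uploaded context; survives FROM
    pointer: Optional[str] = None   # layer that the next-created layer will parent upon
    mount: Optional[str] = None     # current BINDCONTEXT mount point, if any
    layers: Dict[str, Optional[str]] = field(default_factory=dict)  # layer id -> parent id
    counter: int = 0

    def from_(self, image: str) -> None:
        # Reset the layer pointer to the named image and unmount the context.
        self.layers.setdefault(image, None)
        self.pointer = image
        self.mount = None

    def bindcontext(self, path: str) -> None:
        # Mount the read-write context; its contents never become part of a layer.
        self.mount = path

    def run(self, writes_context: Optional[Dict[str, str]] = None) -> str:
        # Create a new layer on top of the current pointer.
        if self.pointer is None:
            raise RuntimeError("no FROM encountered yet")
        if writes_context:
            if self.mount is None:
                raise RuntimeError("context is not mounted")
            self.context.update(writes_context)
        self.counter += 1
        layer = f"layer{self.counter}"
        self.layers[layer] = self.pointer
        self.pointer = layer
        return layer

    def ancestry(self, layer: Optional[str]) -> List[str]:
        # Walk parent links from a layer back to its base image.
        chain = []
        while layer is not None:
            chain.append(layer)
            layer = self.layers[layer]
        return chain


b = Build(context={"src/main.c": "int main(void) { return 0; }"})
b.from_("toolchain")                          # hypothetical multi-gigabyte build image
b.bindcontext("/ctx")
b.run(writes_context={"out/app": "binary"})   # compile into the scratch context
toolchain_tip = b.pointer

b.from_("runtime")                            # pointer reset; context unmounted
unmounted_after_from = b.mount is None
b.bindcontext("/ctx")                         # remount: earlier writes are still there
b.run()                                       # e.g. copy /ctx/out/app into the image
final = b.pointer

print(b.ancestry(final))          # ['layer2', 'runtime']
print(b.ancestry(toolchain_tip))  # ['layer1', 'toolchain']
```

In this model the final image never has the toolchain in its ancestry, yet the second stage can still read `out/app`, because the file lives in the context and not in any layer.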
Besides the potential workflow presented in the Dockerfile above, a few more possibilities open up by allowing for a third stanza:
- `TAG`: like in `docker tag`, this gives the layer resulting from the previous stanza a name that can be used to refer to it. Unlike in `docker tag`, the name will only persist for the duration of the build. All such "local" tags look like `nearestglobalparenttag+localname`, or just `+localname` when the nearest global parent tag is unambiguous.

Then you could do something like this:
This stanza would require a slight change in the behavior of `docker build -t`, adding a `docker build -t global:local` switch to create global tags from local tags. For example, given the above Dockerfile:

For backwards compatibility, `-t global` would be short for `-t global:@END`, where `@END` would explicitly refer to the last layer created in the Dockerfile.

Postscript
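The `-t global:local` rule described in the previous section, including the `@END` default, can be sketched in Python as follows. This models only the proposed syntax; it ignores the existing `name:tag` meaning of the colon in `docker build -t`. The function names, the `local_tags` mapping, and the assumption that the local name is written without its `+` prefix are all hypothetical.

```python
from typing import Dict, Tuple

END = "@END"  # proposed name for the last layer created in the Dockerfile


def parse_tag_option(value: str) -> Tuple[str, str]:
    """Split a proposed `-t global:local` value; a bare `global` means `global:@END`."""
    global_name, sep, local_name = value.partition(":")
    return global_name, (local_name if sep else END)


def resolve(value: str, local_tags: Dict[str, str], last_layer: str) -> Tuple[str, str]:
    """Return (global tag, layer id) for one `-t` option.

    `local_tags` maps names created by TAG stanzas to layer ids.
    """
    global_name, local_name = parse_tag_option(value)
    if local_name == END:
        return global_name, last_layer
    if local_name not in local_tags:
        raise KeyError(f"no local tag named {local_name!r} in this build")
    return global_name, local_tags[local_name]


# Hypothetical build state: two TAG stanzas, with layer3 created last.
local_tags = {"build": "layer1", "runtime": "layer3"}

print(resolve("myapp-build:build", local_tags, last_layer="layer3"))
print(resolve("myapp", local_tags, last_layer="layer3"))
```

Under these assumptions a plain `-t myapp` keeps its current behavior of tagging whatever the build produced last, while `-t myapp-build:build` promotes an intermediate local tag to a global one.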
Given multiple `FROM`, `BINDCONTEXT`, and `TAG` stanzas, an alternate syntax for Dockerfiles might be considered: