Dockerfiles should have a way to perform multiple build actions in one commit #2439
Comments
I hadn't seen #1799, but I've seen that option and do not find it overly appealing; I don't like run-on commands (using &&), and the only reason I see to have line continuations is for a single command invocation with a lot of arguments (such as apt-get install with a massive list of packages). I might seem to be coming off as something of a shell-script prude here, but I like things to be "clean". I have seen #332, or something like it, as I recall having seen shykes' message about exporting to tarball and re-importing, which is exactly what I'm doing currently. Unfortunately, by exporting and reimporting at every step, the base image becomes increasingly massive, to the point that I've got a 1.5GB base image for the final set of changes to build from, and that now seems to be having a checksum mismatch issue when pushing to a local private registry. So I have a couple of options. I can rearrange the commands and liberally apply && and \ line continuations in an effort to decrease the number of layers, which just seems hacky and prone to error to me. Or, I can look at improving the system itself (assuming others see it as an improvement?). I have posted to the mailing list here: https://groups.google.com/forum/?fromgroups#!topic/docker-dev/CbmK1KUS8Sk |
Ah, sorry, I forgot to mention: the problem should go away with Docker 0.7 since AUFS will be replaced by another backend which doesn't have that limitation! That's why you don't see tremendous efforts to work around it. The other options (explicit COMMIT, squashing image histories...) are still nice in the long run; but we have a short-to-mid-run option coming fast that will alleviate the issue :-) |
I'll chime in here to say that in my opinion, AUFS's limit is not the problem. Storing multiple layers which have no intrinsic value is the problem. Let's say I run one command that uses up a tremendous number of file descriptors for temporary files, perhaps something like apt-get or make. If a layer is captured after the completion of that command, there is never an opportunity to remove those file descriptors from the layer. Doing so on the next command simply hides them rather than actually removing them. This leads to unnecessary consumption of disk space in that now-preserved layer, along with the performance penalty of having to union that filesystem index when searching for a file. Simply increasing the available layers without the ability to expunge useless ones is just going to make the penalties more prevalent. There should be a good way to a) indicate to the dockerfile that you don't want intermediary layers and b) squash useless layers after the fact without blowing your inheritance from other images. Continuation lines in dockerfiles aren't a good solution either, as they just result in syntactic eccentricities like this: https://gist.github.com/SamSaffron/7208665. My $.02. |
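The point about files being hidden rather than removed can be illustrated with a toy model. This is only a sketch: the dictionary-based layers, the `None` whiteout marker, and the function names are assumptions made for illustration, not how Docker or AUFS actually store data.

```python
# Toy model of a union filesystem: each layer maps path -> size in bytes,
# and None marks a "whiteout" (the file is hidden from that layer upward).
# Illustration only; real storage backends work differently.

def visible_files(layers):
    """Merge layers bottom-to-top and return the files a container would see."""
    view = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                view.pop(path, None)  # a whiteout hides the file
            else:
                view[path] = size
    return view

def stored_bytes(layers):
    """Bytes kept on disk: every layer's content counts, hidden or not."""
    return sum(size for layer in layers
               for size in layer.values() if size is not None)

def squash(layers):
    """Collapse all layers into one that holds only what is still visible."""
    return [visible_files(layers)]

layers = [
    {"/usr/bin/app": 10_000_000},        # base image
    {"/tmp/build.tar.gz": 500_000_000},  # a build step leaves a temp file
    {"/tmp/build.tar.gz": None},         # a later step deletes it: whiteout only
]

print(visible_files(layers))         # the temp file is gone from the view
print(stored_bytes(layers))          # but its bytes are still stored
print(stored_bytes(squash(layers)))  # squashing drops them
```

In this model the deletion in the third layer changes what the container sees but not what is stored, which is why either grouping steps into one commit or squashing after the fact is needed to reclaim the space.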
I agree there is a final fix (don't require a commit for each build step), and there are intermediary fixes (for example 0.7.2 will raise the layers limit to 127). I'm tentatively scheduling this for 0.8. |
I believe that, perhaps, extra commands in the docker vocabulary would be apt? Perhaps even allowing the use of git-style add/commit/merge/rebase/stash/cherry-pick/etc commands? The more I see docker, the more I think of version control for app environments. But, being that I love git for version control, I then find myself yearning for git-style level of control with git-porcelain/git-plumbing extensibility. This makes me wonder if I can set up a git repository with multiple branches, and on each branch, attempt different docker pull image commands? Would that work? |
I am very fond of this idea. I've written out a detailed proposal below. Abstract: Add a Example
The last command Motivation: There are two clear reasons why a person would want to combine multiple steps into one layer:
Current work around: Proposal proper: A new You can use this keyword to save space:
Or to use some private file that you don't want in the resulting image:
This is used to sign an access list with your private gpg key without leaving your gpg key in the resulting image. Implementation sketch: Many of the things we need to implement this are here: #4232 I haven't looked at the parsing code. |
With a 'layer' that is opaque, and a method to move commands into and out of the layer, there would be a compelling workflow for going from images to a Dockerfile and back. The biggest hurdle I've seen to even using Docker is that, even if I could hide private info, once I decide on an image or a Dockerfile format, I'm kind of stuck :( With layer and import/export, there could be greater reuse and easier deployment. This doesn't touch upon the load-time settings APIs that Docker should have, but it's a start :) |
@timthelion I agree with this with only one exception: we should use |
@timthelion Also, why the no-op argument? If we want to have an argument, make docker tag the resultant layer with the given argument. If that is not intended functionality, then don't include the argument at all. There is no Dockerfile directive that has no-op arguments (AFAIK), why add one now? Users expect that arguments to a directive actually affect the directive. Comments are used for no-ops. |
@cyphar I'd suggest that BEGIN and END would be similarly reasonable tokens for denoting the beginning and end of a block. I only suggest this over {} because the rest of the Dockerfile syntax thus far has been word-based. |
@bwilkins I say
|
@cyphar in that case, Coming from ruby myself, I tend to synonymise The benefits of |
What's the status of this? I would love this feature and could try writing it if it's wanted but not being implemented. |
@erikh can comment on if his new Dockerfile parser can deal with recursively defined grammars (assuming we want to allow for nested As an aside, I'd think that the best way of going about this is by calling each instruction "atomic", and that blocks allow users to run multiple instructions such that the whole thing is "atomic". However, I think that changes to the builder would be quite drastic, because while it may seem that the only difference for the instructions in blocks is that they don't commit the changes until the block is exited -- there is a problem. Docker is oriented around images and running multiple instructions on a single container would prove to be an annoying thing to implement. |
I think this is really needed for blocks like Because web-server.tar.gz can be a big file, and it will be stored in some of the layers
Hello! Mainly:
Then from there, patches/features like this can be re-thought. Hope you can understand. |
I'm using multiple Dockerfiles to build up an environment (with the aim of a set of environments). Unfortunately I blow through AUFS's 42-layer limit with this. I would like to be able to collapse a set of actions into a single commit.
I envision this being done as perhaps BEGIN and COMMIT commands (akin to SQL's transaction commands). I may look at doing this myself if I can find the time, but if this doesn't fit into the ideal of the app then I don't want to put too much effort into it.
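To make the transaction idea concrete, here is a minimal sketch of how a builder might group instructions into commit units. BEGIN and COMMIT are the hypothetical keywords suggested above; they are not real Dockerfile instructions, and `commit_units` is an illustrative helper, not part of Docker.

```python
# Hypothetical BEGIN/COMMIT block syntax; neither keyword exists in Dockerfiles.

def commit_units(lines):
    """Group instructions into the units a builder would commit as one layer."""
    units, block = [], None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword = line.split()[0].upper()
        if keyword == "BEGIN":
            if block is not None:
                raise ValueError("nested BEGIN is not supported in this sketch")
            block = []
        elif keyword == "COMMIT":
            if block is None:
                raise ValueError("COMMIT without a matching BEGIN")
            if block:
                units.append(block)  # the whole block becomes one commit
            block = None
        elif block is not None:
            block.append(line)       # inside a block: defer the commit
        else:
            units.append([line])     # outside a block: one commit per instruction
    if block is not None:
        raise ValueError("BEGIN without a matching COMMIT")
    return units

example = """
FROM ubuntu
BEGIN
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get clean
COMMIT
RUN echo done
""".splitlines()

for unit in commit_units(example):
    print(len(unit), unit)
```

The sketch only covers parsing. It says nothing about the harder part raised elsewhere in the thread, namely running several instructions in a single container before committing.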
My work-around in the meantime is to have a bash script that builds each Dockerfile then exports to tar and reimports into the same image tag, before beginning the next build.
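The workaround described above might look roughly like the following. This is a sketch, not the original bash script: the tag names, directory names, tarball name, and temporary container name are all made up for illustration, and it uses `docker create`, which may not be available in older Docker releases.

```python
import subprocess

def flatten_steps(tag, context_dir, tarball="flattened.tar", container="flatten-tmp"):
    """Return the docker commands that build one Dockerfile and flatten the result."""
    return [
        ["docker", "build", "-t", tag, context_dir],
        # A command is supplied so create works even if the image defines no CMD;
        # the container is never started.
        ["docker", "create", "--name", container, tag, "true"],
        ["docker", "export", "-o", tarball, container],
        ["docker", "rm", container],
        # Re-import under the same tag, replacing the layered image with a flat one.
        ["docker", "import", tarball, tag],
    ]

def build_all(stages, run=subprocess.check_call):
    """stages: list of (tag, context_dir) pairs, built in order."""
    for tag, context_dir in stages:
        for cmd in flatten_steps(tag, context_dir):
            run(cmd)

if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    build_all([("env/base", "./base"), ("env/app", "./app")],
              run=lambda cmd: print(" ".join(cmd)))
```

Each later Dockerfile would need to start FROM the re-imported tag of the previous stage. Note that an imported image is a single layer without the original image's configuration (such as ENV or CMD), so those settings would have to be declared again.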