Proposal: System-level containers #18724

cpuguy83 · 2015-12-16T20:37:20Z

Problem

A lot of people want to run system-level services, and even Docker plugins, in containers.
This means these services need to be started before any other container.

An example of this is RancherOS, which currently runs 2 daemons, one for system services and one for user services. This is to ensure startup order and to protect "system" containers from accidental removal.

There are also cases like the swarm-agent, where, when the agent is run in a container, Swarm hides it from the default ps output.

Proposed solution

In the past we've talked about having a plugin loading API, which would essentially be fancy containers... this may be a bit drastic at this point.
What I propose is adding a new bool flag --system (or HostConfig.System in the API) which does exactly 2 things:

1. Makes sure the container will start up before non-system containers.
2. Hides the container from docker ps output, though it will still be visible in docker ps -a. Along with this there can also be a filter to show only system containers.
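
To make the two behaviors concrete, here is a minimal Go sketch. It is not Docker's implementation: the Container type and the startOrder and visible helpers are hypothetical names invented for illustration, and the System field only stands in for the proposed HostConfig.System, which does not exist in Docker.

```go
package main

import (
	"fmt"
	"sort"
)

// Container is a hypothetical, simplified stand-in for a daemon's container record.
type Container struct {
	Name   string
	System bool // assumption: mirrors the proposed HostConfig.System flag
}

// startOrder returns a copy of cs with system containers first. The stable
// sort keeps the original relative order inside each group.
func startOrder(cs []Container) []Container {
	out := make([]Container, len(cs))
	copy(out, cs)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].System && !out[j].System
	})
	return out
}

// visible returns what a listing would show: system containers are hidden
// unless all is true (the "docker ps -a" case in the proposal).
func visible(cs []Container, all bool) []Container {
	var out []Container
	for _, c := range cs {
		if all || !c.System {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cs := []Container{{"web", false}, {"volume-plugin", true}, {"db", false}}
	for _, c := range startOrder(cs) {
		fmt.Println("start", c.Name)
	}
	fmt.Println(len(visible(cs, false)), "shown by default;", len(visible(cs, true)), "shown with -a")
}
```

Note that this only orders the start attempts; it says nothing about whether a system container is ready by the time the others start.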

It is important to limit the scope to just these two things and nothing fancy or magical.
To start a system service you'd use the exact same UX/API as a normal container: if you want a volume, set a volume; if you want API access, mount the socket; if you need privileged, use --privileged.
A containerized volume plugin might use this like so:
docker run -d --system -v /var/myPlugin:/var/data:shared cpuguy83/myAwesomeVolumePlugin
Again, nothing fancy.

_Note that the name of the flag is the best I can come up with; I'm not married to it. It also does not necessarily have to be a boolean, but I think that helps to limit the scope._


cpuguy83 · 2015-12-16T20:38:01Z

ping @ibuildthecloud

duglin · 2015-12-16T21:41:51Z

Does this need to apply to other things too - like volumes, networks, etc? I would think a "system" volume might want to be hidden from normal users in the same way.

It's too bad that we don't have the notion of users, as this is starting to head down the path of scoping the visibility of resources based on something, and something like a user's ACL would be the most natural thing to use here.

If all we want to solve is this one problem then I like this solution, it's nice and easy, but I think it might be worth exploring what people will naturally want to do next and make sure this one boolean flag is really enough, otherwise it might be deprecated pretty quickly.

LK4D4 · 2015-12-16T22:12:11Z

Yes, a problem exists. But your flag is trying to solve two problems at once: ordering and accidental removal. Fixing them separately or, at least, having the flexibility to turn one on without the other would be a great improvement too.

cpuguy83 · 2015-12-16T22:16:59Z

I could see modifying this to not deal with automatic filtering. As I was typing it out I was thinking it may be too much.

abronan · 2015-12-17T06:47:57Z

I tend to agree with duglin on this one, I think user ACL is the most natural way to deal with this. For example in Swarm, there would be an admin/maintenance account to deal with the Swarm containers and deploy more nodes/launch more agents and managers. This would ensure that other regular users cannot even list those containers with docker ps -a and/or delete them by any means.

This does not deal with the issue of starting containers in a specific order but with this basic construct then we can go on and have for example a system user for system containers that would have the priority at daemon startup.

The flag does solve the simple case of a set of containers to start before regular containers but what if I want to have a dependency graph and also start system containers in a specific order? ACL could solve that by giving priorities at daemon startup to users/system services (even though this solution is far from being ideal from a usability perspective, just the first thing that popped in my mind 😄)

bboreham · 2015-12-17T15:40:42Z

What should be the result if a system container cannot start at all, or repeatedly starts and fails?

(I have discussed this at moby/libnetwork#813)

If the answer is "Docker should continue to attempt starting the system container and not let you do anything else before it succeeds", then there needs to be some way of defeating this, since you need to use Docker to change the configuration.

cpuguy83 · 2015-12-17T15:48:39Z

@bboreham In that case for a first step I'd just log the error as we do today and keep going.

bboreham · 2015-12-17T16:02:56Z

@cpuguy83 This is not what happens for network plugins: the operation is retried several times, often enough to time-out the startup and leave you with nothing.

cpuguy83 · 2015-12-17T16:06:05Z

@bboreham That is how plugins work, the above proposal doesn't change that.
Container startup != plugin communication, and something that is --system would not necessarily be a plugin.

bboreham · 2015-12-17T16:31:53Z

@cpuguy83 ok, thanks for clarifying.

bboreham · 2015-12-21T10:33:32Z

> Makes sure the container will startup before non-system containers

Can I just clarify some more: your proposal is to change the order in which the fork/exec operations are done, or to do something extra to establish that the system container has actually started, perhaps even to know that it is "ready"?

cpuguy83 · 2015-12-21T12:05:29Z

@bboreham neither... It just attempts to start them before normal containers.

bboreham · 2015-12-21T12:10:40Z

I think your "attempts to start" is what I meant by "fork/exec"; the important part is that it doesn't give you any guarantees about the order in which those processes will then be scheduled by the kernel, and particularly no guarantee that the system containers will be ready before the non-system ones need them.

Right?

cpuguy83 · 2015-12-21T12:18:38Z

Docker provides the ordering.
In terms of "readiness", that's up to the services which consume them, not Docker.

jainvipin · 2015-12-30T08:41:20Z

+1

- docker rm behavior: accidental removal of these containers can be damaging. Say I want to remove all app containers, for which I am used to doing docker rm -f $(docker ps -aq): it might also remove all non-app (system/infra) containers. Perhaps the long-term answer would be to have ACLs as suggested by @duglin. It would be best to not show these containers in docker ps output altogether unless explicitly asked for.
- Starting an infra container later: it is best if the ordering behavior is decoupled from whether the container shows up in docker ps, etc., because I can think of starting a system/infra service afterwards (like how I can do it with systemd).

cpuguy83 · 2016-01-06T14:35:03Z

So, in summary -- what we really need is a way to specify that a container should start before any other containers.
What this also implies is that these "system" containers cannot be linked to non-"system" containers.

duglin · 2016-01-06T14:39:57Z

One thing to keep in mind is that the entire notion of ordering is one that we should probably discourage people from even worrying about. At any time any component can go down, and users of that component need to be prepared to deal with it, whether it's pausing, retrying, or something else. So, to me, dealing with a dependent component going down/restarting isn't much different than dealing with it being started after me. Providing a way to specify ordering could just be giving people a false sense of security.

bboreham · 2016-01-06T14:48:53Z

I tried to get this across earlier but I think I failed:

The order in which you start some processes gives no guarantee over the order in which they will subsequently get CPU time in order to achieve anything. That is up to the OS scheduler, and it may have other things to think about.

It seems to me unwise to add a feature which usually does what you want but sometimes doesn't.

cpuguy83 · 2016-01-06T14:58:39Z

I agree, and in cases where these are Docker plugins, that plugin loading is handled correctly, i.e. containers that rely on a plugin resource are not delayed anyway until the plugin is available.

I'm going to close this as I think we can handle this in other ways.

bboreham · 2016-01-18T10:25:31Z

As mentioned at moby/libnetwork#882, there is a mirror-image problem at shutdown - you shouldn't shut down a container that is implementing some "system" feature before shutting down a container using that feature.
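
A minimal Go sketch of that mirror-image idea, assuming the daemon kept a record of the order in which containers were started (it is not claimed that Docker does this; stopOrder and the container names are hypothetical):

```go
package main

import "fmt"

// stopOrder returns the reverse of the given start order, so containers that
// were started first (for example those providing a "system" feature) are
// stopped last.
func stopOrder(startOrder []string) []string {
	out := make([]string, len(startOrder))
	for i, name := range startOrder {
		out[len(startOrder)-1-i] = name
	}
	return out
}

func main() {
	started := []string{"network-plugin", "web", "db"}
	for _, name := range stopOrder(started) {
		fmt.Println("stop", name)
	}
}
```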

cpuguy83 added the kind/enhancement label Dec 16, 2015

thaJeztah mentioned this issue Dec 28, 2015

pkg: authorization: do not lazy load #18824

Closed

cpuguy83 closed this as completed Jan 6, 2016

bboreham mentioned this issue Feb 18, 2016

Better Plugin Infrastructure #20363

Closed

bboreham mentioned this issue Apr 22, 2016

Running Docker network driver plugin as a Docker container #22239

Closed

anusha-ragunathan mentioned this issue Mar 31, 2017

[Proposal] System-level filtering on docker containers #32245

Open
