Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phase 1 implementation of user namespaces as a remapped container root #12648

Merged
merged 6 commits into from Oct 10, 2015

Conversation

estesp
Copy link
Contributor

@estesp estesp commented Apr 22, 2015

This pull request is an initial implementation of user namespace support in the Docker daemon that we are labeling an initial "phase 1" milestone with limited scope/capability; which hopefully will be available for use in Docker 1.7.

The code is designed to support full uid and gid maps, but this implementation limits the scope of usage to a remap of just the root uid/gid to a non-privileged user on the host. This remapping is scoped at the Docker daemon level, so all containers running on a Docker daemon will have the same remapped uid/gid as root. See PR #11253 for an initial discussion on the design.

Discussion of future, possibly more complex, phases should be separate from specific design/code review of this "phase 1" implementation--see the above-mentioned PR for discussions on more advanced use cases such as mapping complete uid/gid ranges per tenant in a multi-tenant environment.

@mrunalp
Copy link
Contributor

mrunalp commented Apr 22, 2015

@estesp Nice! :)

@cpuguy83
Copy link
Member

Test fail.

@estesp
Copy link
Contributor Author

estesp commented Apr 22, 2015

@cpuguy83 I know how to let people down right out of the gate! :)

@estesp
Copy link
Contributor Author

estesp commented Apr 24, 2015

User namespace Patch TODOs:

  • fix VOLUME / -v permissions (2 integration-cli tests failing) - fixed in commit cf75a29
  • fix import of libcontainer/configs having non-cross buildable (e.g. Windows) syscalls referenced in namespaces.go - fixed in Split namespace syscall content for building on non-Linux docker-archive/libcontainer#554, as well as use of idtools.IDMap across Docker packages, mapped to configs.IDMap in native execdriver
  • need to mark --privileged as a conflict when Daemon is in user-namespaced mode - fixed in commit f24448c
  • fix pipe permissions errors in non-tty mode for stdout/stderr (see https://gist.github.com/estesp/e9d05f695ff2f45cefea)
  • make --root option and selection of lxc driver conflict - fixed in commit f24448c
  • add documentation from original PR - fixed with commit 03172ac
  • handle default option to --root: create default dockroot user and group if !exist, and use for root uid/gid remap values

context tarsum.TarSum // the context is a tarball that is uploaded by the client
contextPath string // the path of the temporary directory the local context is unpacked to (server side)
noBaseImage bool // indicates that this build does not start from any base image, but is being built from an empty file system.
defaultArchiver *archive.Archiver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add one, thanks!

@diogomonica
Copy link
Contributor

This looks good, but I'm not familiar enough with the code to provide a deep review.

What do you think about using a nobody uid (or similar) if someone provides --root with no arguments?

@estesp
Copy link
Contributor Author

estesp commented Apr 25, 2015

@diogomonica thanks for the initial review! I believe I have answered all questions you posed; let me know if I generated any more questions with my answers :)

@diogomonica
Copy link
Contributor

I think the only one that remains unanswered is: should we support --root with no arguments? And if so, this flag probably should be renamed to root-remap

@diogomonica
Copy link
Contributor

@estesp ping! :)

@cpuguy83
Copy link
Member

I don't think we can support with no args unless it is a bool flag, also better to be explicit, I think.

@estesp
Copy link
Contributor Author

estesp commented Apr 27, 2015

@diogomonica this was discussed briefly on IRC and in the original "docs-only proposal" PR; @tiborvass had some thoughts there as well in case he wants to add them here.

In general I don't mind if Docker daemon has some "auto-remap" default, but as @cpuguy83 said I'm not sure the flag parsing code has the capability to handle no-arg if it isn't boolean, so we had discussed a "special value" like "dockroot", and if you specify that, it will create that uid and group if necessary and then use it as the root remapped value. E.g. --root=dockroot would get you a user on the host with a new UID value, add a group to the host, generating a new GID value, and then root in containers would map to this new UID:GID pair. If it already exists, then of course the pre-existing values would be used (e.g. on docker daemon restart with --root=dockroot provided again.)

@diogomonica
Copy link
Contributor

Actually, remapping by default to a random UID sounds better than explicitly having to set the --root.

Are we doing first release support for --root, second release make it the default?

@estesp
Copy link
Contributor Author

estesp commented Apr 27, 2015

@diogomonica I think we need broader design decision from @docker/maintainers on how this feature is exposed in 1.7 versus future releases. To me, if default == on: (a) requires we choose a special user/group as default root without user input, and (b) requires that we perform significant testing (prior to release) of "most popular" images from DockerHub with the kinds of images users will expect to "just work" out of the box.

I was leaning towards --root=uid:gid existing as an option because unless we have a strong opinion that our auto-selection is bulletproof for all situations, it never hurts to make it configurable for special cases we didn't think about and therefore would make this feature unavailable to end users simply because we don't allow it to be configured via user specification.

@jessfraz
Copy link
Contributor

ping @LK4D4 @icecrime

@estesp
Copy link
Contributor Author

estesp commented Oct 10, 2015

Never been quite as happy to see green in my life :)

@icecrime
Copy link
Contributor

Such green 💚 📗 🍏!

@thaJeztah
Copy link
Member

@thaJeztah
Copy link
Member

Thanks @estesp, amazing work! (And @jfrazelle for testing), "LGTM" for the docs (per #12648 (comment))

@icecrime
Copy link
Contributor

I would have loved giving @LK4D4 the honor, but let's not take the risk of an additional rebase.

LGTM! Congratulations and thanks for all the work @estesp ;-)

icecrime pushed a commit that referenced this pull request Oct 10, 2015
Phase 1 implementation of user namespaces as a remapped container root
@icecrime icecrime merged commit ed9434c into moby:master Oct 10, 2015
@vdemeester
Copy link
Member

🎉 congrats @estesp !!

@mavenugo
Copy link
Contributor

congrats @estesp 🎉

@ashahab-altiscale
Copy link
Contributor

Nice!
On Oct 10, 2015 11:39 AM, "Arnaud Porterie" notifications@github.com
wrote:

Merged #12648 #12648.


Reply to this email directly or view it on GitHub
#12648 (comment).

@bgruening
Copy link

Awesome!

@runcom
Copy link
Member

runcom commented Oct 10, 2015

Woaaaaa!!!!!!!

@mrunalp
Copy link
Contributor

mrunalp commented Oct 10, 2015

Awesome! :)

Sent from my iPhone

On Oct 10, 2015, at 11:39 AM, Arnaud Porterie notifications@github.com wrote:

Merged #12648.


Reply to this email directly or view it on GitHub.

@estesp estesp deleted the userns-impl branch October 10, 2015 22:35
@coolljt0725
Copy link
Contributor

Awesome! congrats @estesp

@ggmm2008
Copy link

ggmm2008 commented Jan 5, 2016

good

@innocentme1
Copy link

innocentme1 commented Jul 1, 2016

@thaJeztah @estesp

I am going through the docs of usernamespaces in which it says that one of the restriction is

A --readonly container filesystem (this is a Linux kernel restriction against remounting with modified flags of a currently mounted filesystem when inside a user namespace)

  1. I am confused here. Are you saying that I cannot host a volume from host machine in readonly mode to container with this user namespaces enabled (or) just read only container filesystem cannot be mounted?
  2. If its read-only container filesystem. Can you explain me little more elaborated the reason mentioned in the brackets why this is not possible?
  3. CIS Docker benchmark suggests mounting container's filesystem as read-only but if this is not possible. What's the workaround or are there any future plans in this?
  4. Also, when ever there is a change in container, it writes to a writable image layer on top of image layers right? where does the concept of read only container file system coming then (In this case, all changes are to be written in that top layer only right?) ?
    (Ref: https://docs.docker.com/engine/userguide/storagedriver/images/container-layers.jpg)

@innocentme1
Copy link

@thaJeztah Ping. Please let me know if you want me to elaborate anything.

@estesp
Copy link
Contributor Author

estesp commented Jul 2, 2016

@innocentme1 if you could send me a quick email at estesp @ gmail.com (so I have your email address) I'll be happy to write up answers to your questions after my vacation. This closed PR is probably not the best place to document, and if it turns out we can document these restrictions better, we can create some new PRs to do so or add it to some other useful location. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet