new security-opt: privileged-without-host-devices (for safe DinD with Kata) #39702

Open
wants to merge 1 commit into
base: master

AkihiroSuda
Member

AkihiroSuda commented Aug 8, 2019

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>

- What I did

docker run --runtime=kata --privileged is insecure despite Kata's
VM isolation because host devices are visible to the container. kata-containers/runtime#1568

This commit adds a new security-opt privileged-without-host-devices to
allow privileged mode without mounting host devices.
The daemon returns an error if the opt is specified but privileged is
not specified.
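
The rule described above can be sketched as a client-side pre-flight check in shell. This is an illustration only, not the daemon's Go implementation: the function name `validate_run_flags` and the error text are hypothetical, and the scan is naive (it does not stop at the image name). Only the `--privileged` and `--security-opt privileged-without-host-devices` spellings come from this PR.

```shell
# Hypothetical pre-flight check mirroring the rule stated in the PR:
# privileged-without-host-devices is only valid together with --privileged.
validate_run_flags() {
    privileged=0
    no_host_devices=0
    prev=""
    for arg in "$@"; do
        case "$arg" in
            --privileged|--privileged=true)
                privileged=1 ;;
            --security-opt=privileged-without-host-devices)
                no_host_devices=1 ;;
            privileged-without-host-devices)
                # Only counts when it is the value of a preceding --security-opt.
                if [ "$prev" = "--security-opt" ]; then
                    no_host_devices=1
                fi ;;
        esac
        prev="$arg"
    done
    if [ "$no_host_devices" -eq 1 ] && [ "$privileged" -eq 0 ]; then
        # Error text is an assumption, not the daemon's actual message.
        echo "privileged-without-host-devices requires --privileged" >&2
        return 1
    fi
    return 0
}
```

Calling `validate_run_flags --runtime=kata --security-opt privileged-without-host-devices alpine` returns non-zero under this sketch, while adding `--privileged` makes it succeed.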

A common use-case of this is to run Docker-in-Docker securely with Kata.

Fixes #39697
Relates to containerd/cri#1225 cri-o/cri-o#2730

- How I did it

Added a new security-opt

- How to verify it

CLI: docker/cli#2037

Without privileged-without-host-devices:

$ docker run -it --rm --runtime=kata --privileged alpine
/ # ls -l /dev/sda
brw-rw----    1 root     disk        8, 128 Aug  8 18:15 /dev/sda
/ # hexdump -C /dev/sda
(host disk is leaked)

With privileged-without-host-devices:

$ docker run -it --rm --runtime=kata --privileged --security-opt privileged-without-host-devices alpine
/ # ls -l /dev
total 0
crw--w----    1 root     tty       136,   0 Aug  8 18:26 console
lrwxrwxrwx    1 root     root            13 Aug  8 18:26 fd -> /proc/self/fd
crw-rw-rw-    1 root     root        1,   7 Aug  8 18:26 full
drwxrwxrwt    2 root     root            40 Aug  8 18:26 mqueue
crw-rw-rw-    1 root     root        1,   3 Aug  8 18:26 null
lrwxrwxrwx    1 root     root             8 Aug  8 18:26 ptmx -> pts/ptmx
drwxr-xr-x    2 root     root             0 Aug  8 18:26 pts
crw-rw-rw-    1 root     root        1,   8 Aug  8 18:26 random
drwxrwxrwt    2 root     root            40 Aug  8 18:26 shm
lrwxrwxrwx    1 root     root            15 Aug  8 18:26 stderr -> /proc/self/fd/2
lrwxrwxrwx    1 root     root            15 Aug  8 18:26 stdin -> /proc/self/fd/0
lrwxrwxrwx    1 root     root            15 Aug  8 18:26 stdout -> /proc/self/fd/1
crw-rw-rw-    1 root     root        5,   0 Aug  8 18:26 tty
crw-rw-rw-    1 root     root        1,   9 Aug  8 18:26 urandom
crw-rw-rw-    1 root     root        1,   5 Aug  8 18:26 zero
/ # ls -l /dev/sda
ls: /dev/sda: No such file or directory
/ # mknod /dev/sda b 8 128
/ # hexdump -C /dev/sda
hexdump: /dev/sda: No such device or address
hexdump: /dev/sda: Bad file descriptor

Verified with Kata 1.8.0
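
The manual check in the transcript above could be approximated with a short script run inside the container. This is a minimal sketch under assumptions: the function names `list_block_devices` and `check_no_block_devices` are hypothetical, and it relies only on `find -type b`. As the `mknod` step in the transcript shows, a missing node is not the whole story, since a privileged container can still create one; what matters there is that opening it fails.

```shell
# List block device nodes under a directory (default: /dev).
list_block_devices() {
    dir="${1:-/dev}"
    find "$dir" -type b 2>/dev/null
}

# Succeed only if no block device nodes are visible under the directory.
check_no_block_devices() {
    found="$(list_block_devices "${1:-/dev}")"
    if [ -n "$found" ]; then
        echo "block devices visible:" >&2
        echo "$found" >&2
        return 1
    fi
    echo "no block devices found"
}
```

Running `check_no_block_devices` with no argument inspects `/dev`; a directory can be passed to inspect another path.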

- Description for the changelog

new security-opt: privileged-without-host-devices

- A picture of a cute animal (not mandatory but encouraged)

@AkihiroSuda
Member Author

@bergwolf PTAL?

Contributor

akhilerm left a comment

A query on where we are restricting the /dev mount?

@akhilerm
Contributor

@AkihiroSuda One more question. When a new device is attached to the host and an entry is created in the host's /dev, we should also prevent that new device from being created inside the container, right?

I can see there is already a bug in Docker where the container's /dev directory is not updated when new devices are attached. Not sure whether that has an impact here?

@AkihiroSuda
Member Author

For Kata, new devices won't be enabled because there is no mount for the device.

For non-kata, the flag should not be considered as a security boundary.

@akhilerm
Contributor

For non-kata, the flag should not be considered as a security boundary.

Okay, got it. But can this same functionality be extended to non-Kata? Or is it used only to get past the VM isolation issue in Kata?

@AkihiroSuda
Member Author

But can this same functionality be extended to non-Kata

It doesn't make sense. A privileged non-Kata container can execute arbitrary commands on the host anyway, and so can access any device.

@akhilerm
Contributor

Okay! 👍 Thanks for explaining.

@bergwolf
Contributor

LGTM! Thanks @AkihiroSuda !

@AkihiroSuda
Member Author

@justincormack @cpuguy83 @thaJeztah PTAL?

@AkihiroSuda
Member Author

cc @tianon WDYT?

AkihiroSuda changed the title from "new security-opt: privileged-without-host-devices" to "new security-opt: privileged-without-host-devices (for safe DinD with Kata)" on Sep 11, 2019
@cpuguy83
Member

The option seems ok, but I'm not sure why someone would use privileged and also expect it to be secure?

@AkihiroSuda
Member Author

@cpuguy83

DinD with Kata needs --privileged and is expected to be secure.

@thaJeztah
Member

Isn't this combination possible with #36644, or is that tweaking something else? (haven't looked in depth).

@AkihiroSuda
Member Author

Unrelated; this one aims to prevent Kata from mounting the host /dev at all.

@tianon
Member

tianon commented Sep 11, 2019

I get conceptually that this solves a problem Kata has, but I don't think I understand why this particular solution was chosen. We've long regarded the --privileged flag as somewhat of a mistake (or at least, in hindsight, one that could have had a better name), especially given the vast number of users who view it as sudo for containers (à la "if it doesn't work, try it again with --privileged"). The number of users I've personally watched do exactly that blindly makes me very sad on a regular basis, so I'm confused why we'd be looking to create a second option, with slightly less "remove all the protections" applied, just for this one use case?

@thaJeztah
Member

thaJeztah commented Sep 11, 2019

Yes, I was thinking; what privileges does --privileged provide that cannot be given in another way? (Wanted to make a table for that to point out the missing feature(s)).

If those missing options can be added, then instead of using --privileged ("all the things!") and a --but-not-fully-privileged flag, users could instead do, e.g.

--cap-add=all
--security-opt apparmor=unconfined
--security-opt seccomp=unconfined
--security-opt systempaths=unconfined 
--security-opt host-devices=unconfined

(start with "default", and add what's needed)
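
The "start with default, and add what's needed" idea can be sketched as a small shell helper that emits the granular flags listed above. This is a hypothetical illustration: the function name `granular_privileged_flags` is invented here, and `host-devices=unconfined` is only the option proposed in this comment, not one the thread confirms exists. Leaving it out corresponds to the "everything except host devices" case this PR targets.

```shell
# Print the granular flags from the comment above, one per line.
# Pass "with-host-devices" to also emit the host-devices=unconfined
# option, which is a proposal from this thread and an assumption here.
granular_privileged_flags() {
    printf '%s\n' \
        "--cap-add=all" \
        "--security-opt=apparmor=unconfined" \
        "--security-opt=seccomp=unconfined" \
        "--security-opt=systempaths=unconfined"
    if [ "${1:-}" = "with-host-devices" ]; then
        printf '%s\n' "--security-opt=host-devices=unconfined"
    fi
}
```

The output could then be spliced into a command line, for example `docker run $(granular_privileged_flags) ...`, relying on the shell's word splitting.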

@AkihiroSuda
Member Author

AkihiroSuda commented Sep 12, 2019

$ docker run -d --name dind --runtime=kata --security-opt apparmor=unconfined --security-opt seccomp=unconfined --security-opt systempaths=unconfined --cap-add all docker:19.03-dind
$ docker exec -it dind docker run -it --rm alpine
...
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:281: applying
cgroup configuration for process caused \"mkdir /sys/fs/cgroup/cpuset/docker: read-only file system\"": unknown.
...

(Kata 1.8.0 with Moby e20b732)

After remounting cgroup as read-write before starting dockerd-entrypoint:

$ docker exec -it dind docker run -it --rm alpine
docker: Error response from daemon: cgroups: cannot find cgroup mount destination: unknown.
ERRO[0002] error waiting for container: context canceled

AkihiroSuda added a commit to AkihiroSuda/docker-library-docker that referenced this pull request Sep 12, 2019
…leged)

Docker-in-Kata had required `--privileged` but it ruins the benefit of
Kata because it mounts `/dev` from the host.

Now Docker-in-Kata can be launched without `--privileged`:

  $ docker run --runtime kata -e DOCKER_REMOUNT_SYS_RW=1 --cap-add all --security-opt seccomp=unconfined --security-opt systempaths=unconfined docker:dind

Tested with Kata Containers 1.8.0
(1.8.1 is broken: kata-containers/runtime#2047)

Alternative to moby/moby#39702

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Member Author

For the DinD use case, this DinD PR seems to work: docker-library/docker#191

$ docker run --runtime kata -e DOCKER_REMOUNT_SYS_RW=1 --cap-add all --security-opt seccomp=unconfined --security-opt systempaths=unconfined

I can close this PR unless there is still demand from Kata maintainers.

AkihiroSuda added a commit to AkihiroSuda/docker-library-docker that referenced this pull request Sep 12, 2019
…leged)

Docker-in-Kata can be launched with `--privileged` but it ruins the benefit of
Kata because it mounts `/dev` from the host.

Now Docker-in-Kata can be launched without `--privileged`:

  $ docker run --runtime kata -e DOCKER_REMOUNT_SYS_RW=1 --cap-add all --security-opt seccomp=unconfined --security-opt systempaths=unconfined docker:dind

Tested with Kata Containers 1.8.0
(1.8.1 is broken: kata-containers/runtime#2047)

Alternative to moby/moby#39702

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Member Author

I'm keeping this PR open, as discussed in Kata ML: http://lists.katacontainers.io/pipermail/kata-dev/2019-September/001029.html

@thaJeztah
Member

@dmcgowan @justincormack ptal

@AkihiroSuda
Member Author

rebased

@AkihiroSuda
Member Author

CRI-O has adopted an equivalent of this, as has containerd/CRI: cri-o/cri-o#2730

@AkihiroSuda
Member Author

Aside from Kata, this PR also turned out to be useful for protecting the host console from getty running in a container: AkihiroSuda/containerized-systemd#5 (comment)

cc @cpuguy83

@cpuguy83
Member

For the (mostly privileged) systemd case, instead of doing --privileged, I can also just remount proc as writable in the container and not have to deal with host /dev issues.

I do think it's slightly unfortunate that --privileged also mounts the host's /dev instead of leaving that to the client to decide if they actually want that.

But you can gradually add what you need instead of relying on --privileged, and that seems good enough.

`docker run --runtime=kata --privileged` is insecure despite Kata's
VM isolation because host devices are visible to the container.

This commit adds a new security-opt `privileged-without-host-devices` to
allow privileged mode without mounting host devices.
The daemon returns an error if the opt is specified but privileged is
not specified.

A common use-case of this is to run Docker-in-Docker securely.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Member Author

rebased

Contributor

tao12345666333 left a comment

LGTM

@tianon
Member

tianon commented Oct 6, 2021

I still think this is probably not the right UX here -- how long before someone asks for another privileged-without-X option because they have a different use case?

I agree with @thaJeztah that it would be really useful if we could make a table of features that --privileged can enable that can't be enabled any other way, but it'd definitely be a great exercise to at least start with the things Kata needs and work our way up from there.

@AkihiroSuda
Member Author

Kata v2 no longer supports runc-style CLI, and Moby does not support non-runc runtimes, so closing.

AkihiroSuda closed this on Feb 7, 2022
@Vlad1mir-D

@AkihiroSuda Could you please reopen this PR?
The current version of Kata works perfectly fine with the current version of Moby, but #39697 prevents the use of privileged mode.
ctr, nerdctl, and other tools already allow running Kata privileged without host devices.
Thanks!

AkihiroSuda reopened this on Apr 29, 2023
@tianon
Member

tianon commented May 3, 2023

Re #39702 (comment), do you have a sense of which things Kata needs that are enabled by --privileged but cannot be enabled in any other way?

Successfully merging this pull request may close these issues.

new security-opt: privileged-without-host-devices
10 participants