Configuring Container Capabilities with Kubernetes
In this blog post, I discuss how to configure capabilities for Docker containers running on Kubernetes.
Even though Linux containers run in their own isolated bubble, they aren’t just allowed to do anything they like. Some actions, such as altering the network routing table, loading a kernel module or tracing what a process is doing, require that the administrator grant special permission to perform them.
You may be familiar with the idea of a privileged container, which can do anything it likes on the machine. But did you know that Linux has a whole set of finer-grained permissions? These are called capabilities, and they let you fine-tune the permissions granted so that you don’t inadvertently give total control to an unfamiliar image. Linux and Docker container capabilities can be enabled or disabled for each container – for the complete backstory, take a look at an Overview of Linux Capabilities.
For example, suppose you want to control the network routing table inside your container. For that scenario you need the NET_ADMIN capability. Capabilities are specified with the --cap-add option on the Docker command line, as follows:
docker run --cap-add=NET_ADMIN ubuntu:14.04
You can view a complete list of the supported capabilities in Docker containers and what they mean at Runtime privilege and Linux capabilities from Docker’s run Reference Documentation.
In a Kubernetes pod, the capability names are the same, but they have to be declared in the pod specification. Each container lists the capabilities it needs in an array under capabilities, inside its securityContext field.
Here’s a complete example of a Pod specification that has NET_ADMIN capability:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: myshell
    image: "ubuntu:14.04"
    command:
    - /bin/sleep
    - "300"
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
Copy this into a file called mypod.yaml and then ask Kubernetes to create the pod:
$ kubectl create -f mypod.yaml
pod "mypod" created
We’ve used the ‘sleep’ command to keep the pod running while we run commands inside it. Here, capsh --print lists the capabilities the container currently holds:
$ kubectl exec mypod -- capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
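Not every image ships capsh. As a fallback, the kernel exposes the same information as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status. The following is a minimal sketch, not part of the original walkthrough: mask_has_cap is a hypothetical helper name, and the bit number 12 for CAP_NET_ADMIN comes from the kernel's linux/capability.h header.

```shell
# Hypothetical helper: succeed if capability number $2 is set in hex mask $1.
# $1 is the mask without a 0x prefix, as printed on the CapEff line.
# $2 is the capability number from <linux/capability.h> (CAP_NET_ADMIN is 12).
mask_has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}
```

One way to use it, assuming the image has grep and awk, is to read the mask with kubectl exec mypod -- grep CapEff /proc/1/status, take the second field, and call mask_has_cap with that value and 12.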
If you delete the pod, remove NET_ADMIN from the spec, and then re-create the pod, you will see that this extra capability is gone:
$ kubectl exec mypod -- capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+eip
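Spotting the one missing name in two long lists by eye is error-prone. The sketch below compares two "Current:" lines and prints whatever appears only in the first. The function names caps_to_lines and caps_only_in_first are made up for illustration, and the parsing assumes the "Current: = name,name+eip" layout shown in this post; other capsh versions may print a different layout.

```shell
# Hypothetical helper: turn a "Current: = cap_a,cap_b+eip" line (read from stdin)
# into one capability name per line, sorted.
caps_to_lines() {
  sed -e 's/^Current:[[:space:]]*=[[:space:]]*//' -e 's/+[a-z]*$//' \
    | tr ',' '\n' \
    | tr -d ' ' \
    | grep -v '^$' \
    | sort
}

# Hypothetical helper: print capabilities found in line $1 but not in line $2.
caps_only_in_first() {
  second=$(printf '%s\n' "$2" | caps_to_lines)
  printf '%s\n' "$1" | caps_to_lines | while IFS= read -r cap; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$second" | grep -qxF "$cap" || printf '%s\n' "$cap"
  done
}
```

Saving the "Current:" line from each run in a variable and passing both to caps_only_in_first should print just the capability that was removed from the spec.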
Too Many Capabilities?
If you look at that list and think “that’s a lot of capabilities”, then you may decide to cut your runtime environment down to the bare minimum it needs. This is understandable, and in that case you can use drop instead of add in the pod specification:
securityContext:
  capabilities:
    drop:
    - CHOWN
    - NET_RAW
    - SETPCAP
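After re-creating the pod with a drop list, it is worth confirming that the dropped capabilities really are gone. Here is a rough sketch of such a check. The helper name caps_absent is hypothetical, and it assumes that a spec name such as NET_RAW shows up in capsh output as cap_net_raw, which is the pattern visible in the listings earlier in this post.

```shell
# Hypothetical helper: succeed only if none of the named capabilities appear
# in a capsh "Current:" line.
# $1 is the line; the remaining arguments are names as written in the pod spec.
caps_absent() {
  line=$1
  shift
  for name in "$@"; do
    # NET_RAW in the spec is assumed to appear as cap_net_raw in capsh output.
    cap="cap_$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')"
    # Match the name as a whole list item, not as part of a longer name.
    if printf '%s\n' "$line" | grep -Eq "(^|[ ,=])${cap}([,+ ]|\$)"; then
      printf 'still present: %s\n' "$cap"
      return 1
    fi
  done
  return 0
}
```

For the drop list above, the call would be caps_absent "$line" CHOWN NET_RAW SETPCAP, where $line holds the "Current:" line printed by kubectl exec mypod -- capsh --print.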
In this post, we have shown you how to add and drop Linux capabilities for containers running in Kubernetes, and how to check the result with capsh.
(The image at the top of this post is of Compton Verney, a beautiful park landscaped by Lancelot “Capability” Brown)