Kubernetes RBAC 101
For an overview, implementation information, and specific details regarding Kubernetes RBAC, check out this blog post!
RBAC is a security design that restricts access to valuable resources based on the role the user holds, hence the name role-based. To understand the importance and the need of having RBAC policies in place, let’s consider a system that doesn’t use it. Let’s say that you have an HR management solution, but the only security access measure used is that users must authenticate themselves through a username and a password. Having provided their credentials, users gain full access to every module in the system (recruitment, training, staff performance, salaries, etc.). A slightly more secure system will differentiate between regular user access and “admin” access, with the latter providing potentially destructive privileges. For example, ordinary users cannot delete a module from the system, whereas an administrator can. But still, users without admin access can read and modify the module’s data regardless of whether their current job entails doing this.
If you have worked as a Linux administrator for any length of time, you appreciate the importance of a security system that implements a matrix of access and authority. In the old days of Linux and UNIX, you could either be a “normal” user with minimal access to the system resources, or you could have “root” access. Root access gives you virtually full control over the machine, to the point that you can accidentally bring the whole system down. Needless to say, if an intruder gains access to the root account, your entire system is at high risk. RBAC systems were introduced to address this.
In a system that uses RBAC, there is minimal mention of the “superuser” or the administrator who has access to everything. Instead, there’s more reference to the access level, the role, and the privilege. Even administrators can be categorized based on their job requirements. So, backup administrators should have full access to the tools that they use to do full, incremental, and differential backups. But they shouldn’t be able to stop the webserver or change the system’s date and time, for example.
Kubernetes Implementation Of RBAC
Now that you know what RBAC is and why it should be used in any computer system that has valuable resources, let’s see how Kubernetes implements it. But before we can discuss RBAC in Kubernetes, we need to create a user for testing.
Creating a Kubernetes User Account Using X509 Client Certificate
If you have a Kubernetes cluster in place for testing purposes, you’re most probably the cluster owner or the administrator. Let’s create a user account for our labs. Kubernetes supports several user authentication methods, and more than one can be enabled at the same time. When several are enabled, the API server tries each of them in turn, and the request is authenticated as soon as one of them succeeds. In this example, we’ll use only one authentication method, the X509 client certificate, to create a user account called magalix.
First, we need to create the client key (you’ll need the openssl command installed on your system):
openssl genrsa -out magalix.key 2048
Then, we need to create a certificate signing request:
openssl req -new -key magalix.key -out magalix.csr -subj "/CN=magalix"
Notice that the CN part of the subject must contain the username.
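To make the role of the subject string concrete, here is a minimal Python sketch of how a username could be read out of it. The helper parse_subject is hypothetical and is not Kubernetes code. It also collects any O= entries as groups, which is how Kubernetes interprets the organization field of a client certificate; the group name dev in the second call is an assumption added for illustration. The parser is deliberately naive and does not handle escaped slashes.

```python
def parse_subject(subject: str) -> dict:
    """Split an OpenSSL-style subject such as "/CN=magalix" into a username and groups."""
    username = None
    groups = []
    for part in subject.strip("/").split("/"):
        if not part:
            continue
        key, _, value = part.partition("=")
        if key == "CN":
            username = value      # CN becomes the Kubernetes username
        elif key == "O":
            groups.append(value)  # each O entry becomes a group
    return {"username": username, "groups": groups}


# The subject used in the signing request above.
print(parse_subject("/CN=magalix"))
# A hypothetical subject that also places the user in a group called "dev".
print(parse_subject("/CN=magalix/O=dev"))
```

The username extracted here is the name that later appears in RoleBinding subjects and in the "forbidden" error messages.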
Now that we have the user key and the signing request, we can use the cluster CA certificate and its key to sign the request:
openssl x509 -req -in magalix.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out magalix.crt -days 300
Note For Local And Self-Hosted Clusters Users
For the above command to run, you need to have the cluster certificate (ca.crt) and the certificate key (ca.key). If you’re using minikube, you’ll find those files under ~/.minikube/. In my case, I was using Docker for Desktop on macOS. I had to use the following commands to get the certificate and the key:
kubectl cp kube-apiserver-docker-for-desktop:run/config/pki/ca.crt -n kube-system ca.crt
kubectl cp kube-apiserver-docker-for-desktop:run/config/pki/ca.key -n kube-system ca.key
For other self-hosted clusters, you can find the path to the certificate and the key under /etc/kubernetes/manifests/kube-apiserver.manifest.
Let’s add the user’s credentials to our kubeconfig file:
kubectl config set-credentials magalix --client-certificate=magalix.crt --client-key=magalix.key
We’ll also create a context for this user and associate it with our cluster:
kubectl config set-context magalix-context --cluster=docker-for-desktop-cluster --user=magalix
If you have a look at the ~/.kube/config file you should see the new user data added to the file.
Let’s test what level of privilege this user has:
$ kubectl --user=magalix get pods
Error from server (Forbidden): pods is forbidden: User "magalix" cannot list pods in the namespace "default"
If you try other kubectl commands, such as get deployments or get jobs, you will find that the user does not have any privileges. We need to start giving it the necessary authorization. A good security practice is to grant users only the access level that they need to do their jobs.
Authorization Using Kubernetes Roles
A Role in Kubernetes is like a Group in other RBAC implementations. Instead of defining different authorization rules for each user, you attach those rules to a group and add users to it. When users resign, for example, you only need to remove them from one place. Similarly, when a new user joins the company or gets transferred to another department, you only need to change the roles they’re associated with.
Let’s create a role that enables our user to execute the get pods command:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: get-pods
rules:
- apiGroups: ["*"]
  resources: ["pods"]
  verbs: ["list"]
Let’s apply this definition using kubectl command:
kubectl apply -f role.yaml
Now, we have a role that enables its users to list the pods in the default namespace. But for the magalix user to be able to run get pods, it must be bound to this role.
Kubernetes offers the RoleBinding resource to link roles with their subjects (for example, users). Let’s modify the role.yaml file to look as follows:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: get-pods
rules:
- apiGroups: ["*"]
  resources: ["pods"]
  verbs: ["list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: magalix-get-pods
subjects:
- apiGroup: ""
  kind: User
  name: magalix
roleRef:
  apiGroup: ""
  kind: Role
  name: get-pods
Although it’s perfectly possible to have the RoleBinding resource in a different file than the Role, it is considered a good Kubernetes practice to group similar or interdependent resources in the same file for easier management. Let’s apply this definition now:
$ kubectl apply -f role.yaml
role.rbac.authorization.k8s.io/get-pods created
rolebinding.rbac.authorization.k8s.io/magalix-get-pods created
Now, let’s see if magalix is able to list the pods on the cluster:
$ kubectl --user=magalix get pods
NAME          READY   STATUS             RESTARTS   AGE
hostpath-pd   1/1     Running            0          2d
mysqlclient   0/1     CrashLoopBackOff   75         6h
Looking at the above output, we can see that one pod, mysqlclient, apparently has issues. Assuming that we no longer need this pod, let’s delete it:
$ kubectl --user=magalix delete pods mysqlclient
Error from server (Forbidden): pods "mysqlclient" is forbidden: User "magalix" cannot delete pods in the namespace "default"
As you can see, the user is not able to delete the pods, yet it was able to list them. To understand why this behavior happened, let’s have a look at the get-pods Role rules:
- The apiGroups field is an array that contains the API groups that this rule applies to. For example, a Pod definition uses apiVersion: v1, which belongs to the core API group, written as the empty string "". In our case, we chose ["*"], which matches any API group.
- The resources is an array that defines which resources this rule applies to. For example, we could give this user access to pods, jobs, and deployments.
- The verbs field is an array that contains the allowed verbs. A verb in Kubernetes defines the type of action you want to perform on the resource. For example, the list verb is used against collections while get is used against a single resource. So, given the current access level granted to magalix, a command like kubectl --user=magalix get pods hostpath-pd will fail while kubectl --user=magalix get pods will be accepted. The reason is that the first command uses the get verb because it requests information about a single pod. For more information about the different verbs used by Kubernetes, check the official documentation.
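The three fields described above can be modeled in a few lines of Python. This is only a sketch of the matching logic under simplifying assumptions, not the Kubernetes authorizer itself; the names Rule, rule_allows, and role_allows are hypothetical. It treats "*" as a wildcard and allows a request when at least one rule matches the API group, the resource, and the verb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    api_groups: tuple
    resources: tuple
    verbs: tuple


def _matches(allowed, value):
    # "*" acts as a wildcard; otherwise the value must be listed explicitly.
    return "*" in allowed or value in allowed


def rule_allows(rule, api_group, resource, verb):
    return (
        _matches(rule.api_groups, api_group)
        and _matches(rule.resources, resource)
        and _matches(rule.verbs, verb)
    )


def role_allows(rules, api_group, resource, verb):
    # Rules are additive: one matching rule is enough to allow the request.
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


# The get-pods Role from the example: any API group, pods only, list only.
get_pods = [Rule(api_groups=("*",), resources=("pods",), verbs=("list",))]

print(role_allows(get_pods, "", "pods", "list"))    # kubectl get pods -> True
print(role_allows(get_pods, "", "pods", "get"))     # kubectl get pods hostpath-pd -> False
print(role_allows(get_pods, "", "pods", "delete"))  # kubectl delete pods mysqlclient -> False
```

Because nothing is allowed unless a rule grants it, the delete request is rejected simply because no rule lists the delete verb.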
Let’s assume that we need magalix to have read-only access to the pods, both as a collection and as a single resource (get and list verbs). But we don’t want it to delete Pods directly. Instead, we grant it access to the Deployment resource and, through Deployments, it can delete and recreate pods (for example, through rolling updates). A policy to achieve this may look as follows:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: get-pods
rules:
- apiGroups: ["*"]
  resources: ["pods"]
  verbs: ["list","get","watch"]
- apiGroups: ["extensions","apps"]
  resources: ["deployments"]
  verbs: ["get","list","watch","create","update","patch","delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: magalix-get-pods
subjects:
- apiGroup: ""
  kind: User
  name: magalix
roleRef:
  apiGroup: ""
  kind: Role
  name: get-pods
We made two changes here:
- Added the get and watch to the allowed verbs against Pods.
- Created a new rule that targets Deployments and specified the necessary verbs to give the user full permissions.
Now, let’s test the different actions that our user is allowed or not allowed to do:
First, we create a simple Nginx deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
The deployment creates three Pods, each hosting one container running the Nginx web server.
$ kubectl --user=magalix apply -f deployment.yaml
deployment.apps/nginx-deployment created
Once the deployment is created, let’s list the Pods:
$ kubectl --user=magalix get pods
NAME                                READY   STATUS             RESTARTS   AGE
hostpath-pd                         1/1     Running            0          2d
mysqlclient                         0/1     CrashLoopBackOff   85         6h
nginx-deployment-75675f5897-hh6d5   1/1     Running            0          7s
nginx-deployment-75675f5897-mh8xl   1/1     Running            0          7s
nginx-deployment-75675f5897-skqjm   1/1     Running            0          7s
Get a single Pod:
$ kubectl --user=magalix get pods nginx-deployment-75675f5897-hh6d5
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-75675f5897-hh6d5   1/1     Running   0          34s
What about deleting this pod?
$ kubectl --user=magalix delete pods nginx-deployment-75675f5897-hh6d5
Error from server (Forbidden): pods "nginx-deployment-75675f5897-hh6d5" is forbidden: User "magalix" cannot delete pods in the namespace "default"
OK, so we cannot directly delete Pods. But we should be able to replace them by modifying the Deployment definition to look like the following:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx   # unchanged: the selector of an apps/v1 Deployment is immutable
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: apache
        image: httpd
        ports:
        - containerPort: 80
The change we made was replacing the nginx image with httpd (Apache) and renaming the container accordingly. The Pod labels and the Pod selector stay as they were, because the selector of an apps/v1 Deployment cannot be changed after the Deployment is created. Changing the Pod template triggers a rolling update, which effectively deletes the existing Pods that use nginx and replaces them with new ones hosting containers running Apache:
$ kubectl --user=magalix apply -f deployment.yaml
deployment.apps/nginx-deployment configured
$ kubectl --user=magalix get pods
NAME                                READY   STATUS             RESTARTS   AGE
hostpath-pd                         1/1     Running            0          2d
mysqlclient                         0/1     CrashLoopBackOff   85         6h
nginx-deployment-75675f5897-mh8xl   1/1     Terminating        0          3m
nginx-deployment-75675f5897-skqjm   1/1     Running            0          3m
nginx-deployment-c4f6cdd64-8w92k    1/1     Running            0          6s
nginx-deployment-c4f6cdd64-brnmd    0/1     Pending            0          1s
nginx-deployment-c4f6cdd64-dj9wr    1/1     Running            0          3s
As you can see, the new deployment definition is actively replacing Pods.
Notice that in all the preceding examples, we didn’t specify a namespace, so our Role is applied to the default namespace. A Role is bound to the namespace defined in its configuration. So, if we changed the metadata of our Role to look like this:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: get-pods
  namespace: web
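then the magalix user wouldn’t have access to the Pods or the Deployments unless working in the web namespace. The RoleBinding would need to be created in the web namespace as well, because a RoleBinding can only reference a Role in its own namespace.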
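The namespace scoping described here can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not how the API server is implemented: the function is_allowed and the dictionary layout are hypothetical, and each binding carries its rules inline instead of referencing a separate Role.

```python
def is_allowed(role_bindings, user, namespace, resource, verb):
    """Return True if a binding in the request's namespace grants the verb."""
    for binding in role_bindings:
        if binding["namespace"] != namespace:
            continue  # a binding never applies outside its own namespace
        if user not in binding["users"]:
            continue
        if verb in binding["rules"].get(resource, set()):
            return True
    return False


# The get-pods permissions from the example, moved to the "web" namespace.
web_binding = {
    "namespace": "web",
    "users": {"magalix"},
    "rules": {
        "pods": {"get", "list", "watch"},
        "deployments": {"get", "list", "watch", "create", "update", "patch", "delete"},
    },
}

print(is_allowed([web_binding], "magalix", "web", "pods", "list"))      # True
print(is_allowed([web_binding], "magalix", "default", "pods", "list"))  # False
```

The same user and the same verb give different answers depending only on the namespace of the request, which is the behavior a namespaced Role produces.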
But, sometimes you need to specify Roles that are not bound to a specific namespace but rather to the cluster as a whole. That’s when the ClusterRole comes into play.
Cluster-Wide Authorization Using ClusterRoles
ClusterRoles work the same as Roles, but they are applied to the cluster as a whole. They are typically used with service accounts (accounts used and managed internally by the cluster). For example, the Kubernetes External DNS Incubator project uses a ClusterRole to gain the necessary permissions it needs to work. The External DNS Incubator can be used to utilize external DNS servers for Kubernetes service discovery. The application needs read-only access to Services and Ingresses on all namespaces, but it shouldn't be granted any further privileges (like modifying or deleting resources). The ClusterRole for such an account should look as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get","watch","list"]
# Ingresses are served from networking.k8s.io on current clusters
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: default   # required for ServiceAccount subjects; assumes the account lives in "default"
The above manifest contains three definitions:
- A service account to use with the container running the application.
- A ClusterRole that grants the read-only verbs to the Service and Ingress resources.
- A ClusterRoleBinding, which works the same as a RoleBinding but with ClusterRoles. The subject here is a ServiceAccount rather than a User, and its name is external-dns.
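As a rough illustration of how a cluster-wide grant differs from a namespaced one, the following Python sketch models the external-dns permissions. It is a simplified model, not Kubernetes code; cluster_allows and service_account_username are hypothetical helpers, and placing the service account in the default namespace is an assumption. The system:serviceaccount:<namespace>:<name> form is the username Kubernetes assigns to service accounts.

```python
READ_ONLY = frozenset({"get", "watch", "list"})


def service_account_username(namespace, name):
    # Service accounts authenticate under this username format.
    return f"system:serviceaccount:{namespace}:{name}"


# One cluster-wide binding: read-only access to Services and Ingresses.
cluster_role_bindings = [
    {
        "subject": service_account_username("default", "external-dns"),
        "rules": {"services": READ_ONLY, "ingresses": READ_ONLY},
    }
]


def cluster_allows(bindings, subject, namespace, resource, verb):
    # The namespace of the request is deliberately ignored:
    # a cluster-wide grant applies in every namespace.
    return any(
        binding["subject"] == subject and verb in binding["rules"].get(resource, ())
        for binding in bindings
    )


external_dns = service_account_username("default", "external-dns")
print(cluster_allows(cluster_role_bindings, external_dns, "kube-system", "services", "list"))  # True
print(cluster_allows(cluster_role_bindings, external_dns, "web", "ingresses", "watch"))        # True
print(cluster_allows(cluster_role_bindings, external_dns, "web", "services", "delete"))        # False
```

The account can read Services and Ingresses in any namespace, but a modifying verb such as delete is rejected everywhere because no rule grants it.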
Another everyday use case with ClusterRoles is granting cluster administrators different privileges depending on their roles. For example, a junior cluster operator should have read-only access to resources to get acquainted; then more access can be granted later on.
To summarize:
- Kubernetes uses RBAC to control different access levels to its resources depending on the rules set in Roles or ClusterRoles.
- Roles and ClusterRoles use API groups, verbs, and resources to secure access.
- Roles and ClusterRoles are ineffective unless they are linked to a subject (User, ServiceAccount, and so on) through a RoleBinding or ClusterRoleBinding.
- Roles work within the constraints of a namespace. If none is specified, the Role is created in the namespace of the current kubectl context, which is “default” unless you changed it.
- ClusterRoles are not bound to a specific namespace as they apply to the cluster as a whole.