Creating Custom Kubernetes Operators

December 01, 2019

Continue reading to learn how to create, build, and launch a Kubernetes Operator.


An Operator Or A Custom Controller?

When I first approached this topic, I had some confusion about whether an object that extends Kubernetes functionality should be referred to as an Operator or a Custom Controller. Digging deeper into the subject, I learned that both terms broadly refer to the same concept: using the modular and atomic nature of Kubernetes to create a controller that responds to specific events and executes custom actions in response.

In another article, we created a controller of our own that responds to changes in a ConfigMap and restarts any Pods that use it. That custom controller determines which Pods to restart by examining the annotations part of the ConfigMap, which contains the Pod label. In more advanced cases, we need our controller to respond to custom resources that are not native to Kubernetes. Thus, while a custom controller acts on and listens to native Kubernetes resources and events, an Operator works with Custom Resource Definitions (CRDs): resources that the user creates to solve complex business problems. An Operator can be thought of as a replacement for the human operator: someone who knows both the technical details of the application and the business requirements, and acts accordingly.

What Does An Operator Bring To The Table?

An Operator is a program that runs in a Pod and remains in constant connection with the API server. As mentioned earlier, an Operator deals with CRDs. So, what does a typical Operator do, and when should it be used instead of a Custom Controller? Let’s look at a quick example.

Assume that your job as a Kubernetes administrator entails creating multiple MySQL database clusters. Naturally, there are minor differences between one cluster and the other like the number of nodes, the storage capacity, or even the MySQL image version. But in the end, you still need to create native resources like StatefulSets and Services and still configure them ‘roughly’ the same way.

Now, what if you have a resource that is called MySQLCluster? It uses its very own definition file that may look like this:

apiVersion: cr.mysqloperator.grtl.github.com/v1
kind: MySQLCluster
metadata:
  name: "my-cluster"
spec:
  secret: "my-secret"
  fromBackup: "my-backup-2017-12-14-01-22"
  port: 3306
  replicas: 2
  storage: "1Gi"
  image: "mysql:latest"

Now, how does Kubernetes know how to handle MySQLCluster? It’s not a native resource like Deployment, Service, or Pod. To instruct Kubernetes about how to interpret and work with this resource, we need to deploy an Operator that extends the API. Continuing with the above example, the Operator can be deployed using the following command:

kubectl run mysql-operator --image=grtl/mysql-operator:latest

Yes, it’s just a program running in a Pod. But, when it runs, it automatically registers a Custom Resource Definition (CRD) with the API server. This is how Kubernetes recognizes and knows how to handle the MySQLCluster resource.
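Once the Operator Pod is running, you can confirm that the new resource type was registered. The exact names below depend on the CRD the Operator creates, but the commands are the usual ones:

$ kubectl get crd
$ kubectl api-resources | grep -i mysqlcluster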

When you apply the above definition, it works even though there is no native Kubernetes resource called MySQLCluster. The resource gets created because a custom controller (the Operator) is watching the API for objects of type MySQLCluster; once it receives an event of interest, it triggers the appropriate actions in response. It’s worth noting that the above example is taken from the GRTL mysql-operator. After finishing this article, I highly recommend that you have a look at how this operator was built at https://github.com/grtl/mysql-operator.

So, in our example, once this definition is applied, the operator creates several native Kubernetes resources like a StatefulSet, a service, a sidecar container, health and readiness probes, and so on. Perhaps by now, you may ask:

Isn’t That What Helm Charts Do?

Another source of confusion is when to use Helm Charts and when to use Operators, since both of them automate Kubernetes object creation. In fact, Operators and Helm Charts are not alternatives to each other; rather, they complement each other. Back to our MySQL cluster example: the custom controller and the CRDs that make up the cluster can be deployed as part of a Helm Chart. An Operator does not just build native Kubernetes resources, it also has the necessary domain-specific knowledge to operate those resources in a way that enables the application to work properly. So, if we were to deploy our MySQL cluster through a Helm Chart (no Operator involved), we’d be deploying a StatefulSet, a Persistent Volume, a Secret, a Service, etc. and that’s it. Helm is not aware of (and cannot handle) how those components interact with each other to form a functioning database cluster.

Operators are typically used with stateful applications. Stateful applications (like databases, message queuing systems, caching systems, etc.) need a specific way of handling how they start, how they scale, and how they shut down. The Operator replaces the human operator who would do those tasks manually, while a Helm Chart makes it easier to deploy the Operator alongside the other components. Let’s have a look at the following diagram:

Creating_Custom_Kubernetes_Operators_1.png

Here, a Helm chart was used to deploy an environment. It created native Kubernetes resources like Deployments, Jobs, and Secrets, together with the Docker image they use. Assuming that this is the frontend part of the application, the backend requires a database cluster with tightly-coupled resources that work together as one unit. An Operator is responsible for creating this custom resource and managing it. It may also create multiple instances of this custom resource (think of multiple database clusters that interact with each other). The Operator here is part of the overall infrastructure, and it is deployed through the Helm chart.
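To make that relationship concrete, here is roughly what deploying an Operator through a chart might look like (Helm 3 syntax; the chart path and release name are purely illustrative and not taken from the example above):

helm install mysql-operator ./charts/mysql-operator

The chart would template the Operator’s Deployment, its RBAC objects, and possibly the CRD, while the running Operator takes over the day-2 tasks that Helm alone cannot handle.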

LAB: Building A Custom Kubernetes Operator

In this lab, we are going to build a custom resource that links a ConfigMap to the labels of the Pods that use it. To do so, we are creating a Custom Resource Definition (CRD) and an Operator that acts on it. It is worth noting that we built a similar example using a Custom Controller in our article, Extending the Kubernetes Controller. However, there are some subtle differences between the Custom Controller implementation of this functionality and our current implementation using an Operator. Let’s pause a little and explain how the two methods differ.

Creating_Custom_Kubernetes_Operators_2.png

As you can see from the above illustration, the Custom Controller is continuously watching the API server for changes in a specific ConfigMap. Once it detects one, it extracts the Pod selector from the annotations part of the ConfigMap and uses it to restart the Pods that match the selection to reflect the changes.

When using an Operator on the other hand, you can create a Custom Resource Definition (CRD) that references a ConfigMap as well as the Pod selector for the affected Pods. This way, you are decoupling the ConfigMap from its Pods since you no longer need to add the labels to its annotation. Hence, you can use different ConfigMaps with different applications. This can be illustrated as follows:

Creating_Custom_Kubernetes_Operators_3.png

Since this is a rather long lab, let’s break it into a number of phases, starting with the architecture.

The Architecture

Our Operator needs to connect to the API server to watch for changes in the CRD and, thus, determine which line of action it should take. There are two main ways to connect to the Kubernetes API server: through a client library or through kubectl proxy. Many modern programming languages offer SDKs for connecting to and dealing with Kubernetes. Some of them are officially supported by Kubernetes (needless to say, Golang has the richest library), and many others are community-supported. If your requirements are not complex, you may be better off using the kubectl proxy method. With a simple kubectl command, you open a reverse proxy that listens on a port of your choice; any HTTP requests arriving at that port are automatically relayed to the API server. While kubectl proxy gives you less control, it saves you the work of writing the connection and authentication code yourself.
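To make the difference concrete, here is a minimal sketch of both approaches in Python; the first assumes the official kubernetes client package is installed, and the second assumes kubectl proxy is already listening on port 8001:

# Option 1: the official Python client library (pip install kubernetes)
from kubernetes import client, config

config.load_incluster_config()  # use config.load_kube_config() outside the cluster
pods = client.CoreV1Api().list_namespaced_pod("default")
print([p.metadata.name for p in pods.items])

# Option 2: plain HTTP relayed through kubectl proxy --port=8001
import requests

resp = requests.get("http://127.0.0.1:8001/api/v1/namespaces/default/pods")
print([p["metadata"]["name"] for p in resp.json()["items"]])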

In this lab, we’ll be using the kubectl proxy method. Since we’re deploying the Operator through a regular deployment (in the end, the Operator is just a program), we’ll have a Pod with two containers:

  • The Operator container: it contains the program that watches the API and detects the changes of interest.
  • The Ambassador container: it just runs kubectl proxy. The operator container uses it to connect to the API server. For more information about the Ambassador pattern, please refer to our article, The Ambassador Pattern.

The following diagram depicts this architecture:

Creating_Custom_Kubernetes_Operators_4.png

The Custom Resource Definition (CRD)

A CRD is created using a definition file of its own, just like any other Kubernetes resource. Real-world Operators may bundle creating and registering the CRD within the Operator’s code. For example, the mysql-operator described earlier registers its CRD when it starts. Having the Operator handle CRD creation saves time and abstracts the process so that you can focus on the business needs. In this lab, we are creating the CRD manually to learn how it works. The definition file for our CRD may look like the following:

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: configmonitors.magalix.com
spec:
  scope: Namespaced
  group: magalix.com
  version: v1
  names:
    kind: ConfigMonitor
    singular: configmonitor
    plural: configmonitors
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            configmap:
              type: string
              description: "Name of the ConfigMap to watch for changes"
            podSelector:
              type: object
              description: "Label selector used for selecting Pods"
              additionalProperties:
                type: string
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-monitor-crd
rules:
- apiGroups:
  - magalix.com
  resources:
  - configmonitors
  - configmonitors/finalizers
  verbs: [ get, list, create, update, delete, deletecollection, watch ]

Notice that the above snippet contains two definitions: the CRD and the RBAC role that it requires to be able to extend the API server. Let’s highlight the important aspects of the CRD definition:

Lines 2 and 3: the API version that enables you to add CRDs. CRDs are added to the cluster through a resource of their own; the resource type is CustomResourceDefinition.

Line 7: the scope defines whether this resource is available to the entire cluster or only within the namespace where it lives. If a namespace is deleted, all the custom resources associated with it are deleted as well.

Line 8 and 9: the group and version define how the REST endpoint would be called. In our example, this is /apis/magalix.com/v1.

Lines 10 to 13: specify how we are going to call our new resource. A resource name is used in three places:

  • The kind is what you put in the resource’s definition file (for example, kind: ConfigMonitor).
  • The plural name is used in the API endpoint. In our example, that’d be /apis/magalix.com/v1/configmonitors.
  • The singular name is the one used on the CLI, for example with kubectl subcommands. It’s also used for displaying the results.

Lines 14 to 26: the remaining lines in the file are used to validate the resources that will be created from this CRD. Validation uses version 3 of the OpenAPI specification and ensures that the correct field types and values are used. For example, the containers field of a Pod definition expects an array of container objects; supplying a single string value there would break things, so validation ensures that if an incorrect value is entered, the API refuses the definition and the resource is not created. In our case, we are expecting:

  • A configmap parameter: a string that specifies the name of the ConfigMap resource.
  • A podSelector parameter: this is a placeholder for the label(s) that would be used to select the Pods. The label itself is a string so the object is permitted to have one or more string values by specifying the additionalProperties parameter.

The second part of the file contains a definition for the role we’re using with the CRD. The role grants the listed verbs on the configmonitors resources in the API group where the resource is defined, as well as on their finalizers. Finalizers are used when you need to execute one or more actions before the resource is deleted. In our example, we didn’t specify any finalizer logic for our resource.

Let’s apply the above definition using kubectl:

$ kubectl apply -f crd.yml
customresourcedefinition.apiextensions.k8s.io/configmonitors.magalix.com created
role.rbac.authorization.k8s.io/config-monitor-crd created
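A quick way to confirm that the new API is in place (assuming your kubectl context points at the same cluster):

$ kubectl get crd configmonitors.magalix.com
$ kubectl get --raw /apis/magalix.com/v1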

The ConfigMonitor Resource

Now that our cluster recognizes resources of type ConfigMonitor, let’s create one. The definition file for this resource looks as follows:

apiVersion: magalix.com/v1
kind: ConfigMonitor
metadata:
  name: flaskapp-config-monitor
spec:
  configmap: flaskapp-config
  podSelector:
    app: frontend

As you can see, the definition uses the fields that we specified in the CRD. We have the configmap name and the labels we’ll use to select the Pods in the podSelector part.
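Apply it like any other resource (the file name here is just illustrative):

kubectl apply -f configmonitor.yml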

The Demo Application

Our demo application is just a Python Flask API that responds to HTTP requests with a message. The message comes from a configuration file that will later be stored in a ConfigMap. The application code looks as follows:

app/main.py:

from flask import Flask
app = Flask(__name__)
@app.route("/")
def hello():
    app.config.from_pyfile('/config/config.cfg')
    return app.config['MSG']

if __name__ == "__main__":
    # Only for debugging while developing
    app.run(host='0.0.0.0', debug=True, port=80)
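A minimal Dockerfile for this image might look like the following; the base image and file layout are assumptions rather than part of the original setup:

FROM python:3.7-slim
RUN pip install flask
WORKDIR /app
COPY app/main.py main.py
ENTRYPOINT [ "python", "main.py" ]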

Let’s build the application and push it so that we can use it later:

docker build -t magalixcorp/flask:operator .
docker push magalixcorp/flask:operator

The ConfigMap

Our application is a simple Python Flask API application that displays a message when its endpoint is hit. The message is read from a configuration file that we store in a ConfigMap. The ConfigMap definition looks as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: flaskapp-config
data:
  config.cfg: |
    MSG="Welcome to Kubernetes!"

Apply the above definition to the cluster using the kubectl apply command:

kubectl apply -f configmap.yml

So far we have all the components ready. Let’s create the Operator that will pull them together.

The Operator

The Operator is just a program (as mentioned before). It can be written in any programming language of your choice: Python, Ruby, Go, JavaScript, Java, or even bash. The reason Go is so widely used for writing Operators is its rich Kubernetes client library, not because it’s the only way of building Operators. In our lab, we’ll be using Python. The application opens a continuous connection with the API server and watches for ConfigMap changes. Once a change is detected, it searches resources of type ConfigMonitor for one that references the modified ConfigMap. When found, the ConfigMonitor’s podSelector provides the labels the Operator needs to find the affected Pods. The code file of our Operator looks as follows:

import requests
import os
import json
import logging
import sys

log = logging.getLogger(__name__)
out_hdlr = logging.StreamHandler(sys.stdout)
out_hdlr.setFormatter(logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(logging.INFO)
log.addHandler(out_hdlr)
log.setLevel(logging.INFO)


base_url = "http://127.0.0.1:8001"


namespace = os.getenv("res_namespace", "default")

# This is the function that searches for and kills Pods by searching for them by label


def kill_pods(labels):
    # We receive labels in the form of a list
    for label in labels:
        url = "{}/api/v1/namespaces/{}/pods?labelSelector={}".format(
            base_url, namespace, label)
        r = requests.get(url)
        # Make the request to the endpoint to retrieve the Pods
        response = r.json()
        # Extract the Pod name from the list
        pods = [p['metadata']['name'] for p in response['items']]
        # For each Pod, issue an HTTP DELETE request
        for p in pods:
            url = "{}/api/v1/namespaces/{}/pods/{}".format(
                base_url, namespace, p)
            r = requests.delete(url)
            if r.status_code == 200:
                log.info("{} was deleted successfully".format(p))
            else:
                log.error("Could not delete {}".format(p))


# This function is used to extract the Pod labels from the configmonitor resource.
# It takes the configmap name as the argument and uses it to search for configmonitors
# that have the configmap name in its spec
def getPodLabels(configmap):
    url = "{}/apis/magalix.com/v1/namespaces/{}/configmonitors".format(
        base_url, namespace)
    r = requests.get(url)
    # Issue the HTTP request to the appropriate endpoint
    response = r.json()
    # Extract the podSelector part from each object in the response
    pod_labels_json = [i['spec']['podSelector']
              for i in response['items'] if i['spec']['configmap'] == configmap]
    result = [list(l.keys())[0] + "=" + l[list(l.keys())[0]]
              for l in pod_labels_json]
    # The result is a list of labels
    return result

# This is the main function that watches the API for changes


def event_loop():
    log.info("Starting the service")
    url = '{}/api/v1/namespaces/{}/configmaps?watch=true'.format(
        base_url, namespace)
    r = requests.get(url, stream=True)
    # We issue the request to the API endpoint and keep the connection open
    for line in r.iter_lines():
        obj = json.loads(line)
        # We examine the type part of the object to see if it is MODIFIED
        event_type = obj['type']
        # and we extract the configmap name because we'll need it later
        configmap_name = obj["object"]["metadata"]["name"]
        if event_type == "MODIFIED":
            log.info("Modification detected")
            # If the type is MODIFIED then we extract the pod labels by
            # using the getPodLabels function
            # passing the configmap name as a parameter
            labels = getPodLabels(configmap_name)
            # Once we have the labels, we can use them to find and kill the Pods by calling the
            # kill_pods function
            kill_pods(labels)


event_loop()
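If you’d like to try the Operator from your workstation before containerizing it, a rough approach (assuming kubectl is already configured against your cluster) is to run the proxy locally and start the script against it:

kubectl proxy --port=8001 &
python main.py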

The Operator Deployment

As you saw from the code file, our Operator is just a Python program. We are going to deploy it to the cluster by packaging it in a Docker image and pushing it to the registry. We also need another container that acts as the Ambassador container.

The Operator Application Container

The Dockerfile for the application container looks as follows:

FROM python:latest
RUN pip install requests
COPY main.py /main.py
ENTRYPOINT [ "python", "/main.py" ]

We need to build and push this Dockerfile:

docker build -t magalixcorp/operator .
docker push magalixcorp/operator

The Deployment File

The sidecar or Ambassador container is just an image that contains kubectl. We’ll use it to run the kubectl proxy command. Combining both containers in a Deployment looks as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: operator
  labels:
    app: operator
spec:
  selector:
    matchLabels:
      app: operator
  template:
    metadata:
      labels:
        app: operator
    spec:
      containers:
      - name: proxycontainer
        image: lachlanevenson/k8s-kubectl
        command: ["kubectl","proxy","--port=8001"]
      - name: app
        image: magalixcorp/operator
        env:
          - name: res_namespace
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace

Notice that we are injecting the namespace as an environment variable into the Operator application. The namespace itself comes from the Downward API, a Kubernetes feature that lets you read information about the current object’s settings. For more information about this pattern and how to use it, please refer to our article ‘The Reflection Pattern’.
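With both containers defined, the Operator is deployed like any other workload (the file name here is illustrative):

kubectl apply -f operator-deployment.yml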

Recap Before We Launch

So far, we have a lot of components that we have built and deployed. Let’s have a quick recap of what we did before we go ahead and test our lab:

  • Application image: this is used by the Application Deployment Pod to start the application container.
  • ConfigMap: it contains the configuration file that is used by the application container.
  • Custom Resource Definition: this is the resource that extends the API, enabling Kubernetes to accept and work with configmonitors. A ConfigMonitor is a custom resource that has the name of the configmap and the labels of the pods that use this configmap.
  • ConfigMonitor: based on the CRD, we create a configmonitor object that has the configmap name and the pod labels.
  • The Operator application image: this is a Python application that contains the Operator logic.
  • The Operator Deployment: this is the deployment resource that runs the operator image. The Pod run by this deployment has two containers: the Operator container and the Ambassador (sidecar) container. The latter runs the kubectl proxy command, relaying API requests from the Operator container to the cluster’s API server.

Launching The Lab And Testing The Results

Let’s create a deployment for our demo application. The deployment ensures that the application Pod will get restarted when our operator deletes it as a result of a configmap change. The deployment definition for our demo application looks as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  labels:
    app: frontend
spec:
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: app
        image: magalixcorp/flask:operator
        volumeMounts:
        - name: config-vol
          mountPath: /config
      volumes:
      - name: config-vol
        configMap:
          name: flaskapp-config

Let’s apply this deployment:

kubectl apply -f deployment.yml
deployment.apps/frontend created

We need to ensure that the application is functioning properly before going ahead with the lab. In a real-world scenario, you would create a Service and possibly an Ingress to allow your application to receive traffic. In our case, and since we already have a lot of components, let’s just log in to the application container and issue a curl command against localhost:

$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
frontend-7f8d89fb68-77qv4   1/1     Running   0          25s
operator-6d8464b567-v7sh4   2/2     Running   0          34m
$ kubectl exec -it frontend-7f8d89fb68-77qv4 -- bash
root@frontend-7f8d89fb68-77qv4:/app# curl localhost && echo
Welcome to Kubernetes!

OK, so our application responds with the message defined in the ConfigMap. Let’s change the message and see whether our Operator picks up the change and responds accordingly. Change the ConfigMap so that it looks as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: flaskapp-config
data:
  config.cfg: |
    MSG="Welcome to Operators!"

Let’s apply the configmap and see what happens to our application pod:

$ kubectl apply -f configmap.yml
configmap/flaskapp-config configured
$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
frontend-7f8d89fb68-jgzbz   1/1     Running   0          4s
operator-6d8464b567-v7sh4   2/2     Running   0          38m

It’s clear that the frontend pod (the demo application) has been restarted; it is only 4 seconds old. But let’s double-check by hitting the API and looking at the message:

$ kubectl exec -it frontend-7f8d89fb68-jgzbz -- bash
root@frontend-7f8d89fb68-jgzbz:/app# curl localhost && echo
Welcome to Operators!

So, the correct message is displayed. Finally, let’s have a look at the log messages that our Operator generated:

$ kubectl logs operator-6d8464b567-v7sh4 -c app
2019-09-27 19:44:08,958 Starting the service
2019-09-27 20:22:02,610 Modification detected
2019-09-27 20:22:02,648 frontend-7f8d89fb68-77qv4 was deleted successfully

TL;DR

  • Despite the power Kubernetes brings out of the box, it was designed in a way that makes it highly modular and extensible.
  • If you need to interact with the native Kubernetes resources like Pods, Services, ConfigMaps, etc. then you’d be just fine using a Custom Controller.
  • A Custom Controller is nothing but a program that subscribes and listens to the API server. When it detects an event of interest, it takes an action in response, possibly using the information found in the event data.
  • If your requirements are more complex, you may need to manage a combination of multiple native Kubernetes resources as one unit. Such a combination is represented by a custom resource, whose schema is described by a Custom Resource Definition (CRD).
  • Using a CRD you can create a fully functional MySQL cluster that accepts configuration parameters like the number of nodes, the listening port, and so on. You deal with this cluster the same way you deal with other native resources.
  • A CRD needs a Custom Controller of its own to function properly. Only this time, it’s called an Operator. An Operator is a Custom Controller that works with CRDs.
  • Operators can be deployed as a part of a Helm Chart.
  • Operators can be written in any modern programming language. However, you’ll find that most Operators are written in Go due to its rich client library.

