What is a Kubernetes Cluster?
A primer for anyone considering cloud native application infrastructure, in which we explain the basics of Kubernetes and the clusters it is designed to manage.
Kubernetes is an open source system for managing container-based applications – and if you’re reading this article, you probably already know that containers represent the most popular approach to cloud application architecture. But let’s assume that’s all you know, and take a moment to walk through what containers are, what Kubernetes is and, from there, what a Kubernetes cluster is.
First of all, what is a container?
Cloud native applications are built to run in the cloud, as opposed to running on a single server or collection of servers. In the early days of cloud architecture, running in the cloud meant running on virtual machines. But running a lot of virtual machines can be costly in terms of computing resources, because every VM includes all the components necessary for the machine to function - a file system, a CPU (or at least a share of the underlying hardware’s CPU) and memory, right down to the operating system. Replicating all these components many times incurs a high cost.
Containers were subsequently introduced as a replacement for virtual machines that could reduce that cost. Instead of replicating all those components, containers share a single copy of the operating system, which results in a much lighter footprint. And because in the cloud you typically pay only for the resources you use, this means they cost less to run. Containers are also decoupled from the underlying infrastructure, which makes them more portable. They can easily be moved from one environment to another – even between different clouds.
Containers are of limited use on their own, however. Their power comes from the capabilities of the orchestration system. Its role is to monitor the node machines on which the containers run, shut down any containers that have ceased to run correctly and when that happens, replace them instantly with new, healthy containers – all without compromising the stability of the running application. By dealing with the complexities of scaling and failover automatically, Kubernetes has taken its place as the world’s most popular container orchestration system.
Okay, so… what is a Kubernetes cluster?
A Kubernetes cluster is a collection of linked node machines. These are the machines on which the containers run. They are virtual machines if the cluster is running in the cloud, though they can be physical machines if the cluster is run on-premises.
Every cluster includes at least one master node and, for production, at least one worker node (but ideally a minimum of three). As the primary control unit for the cluster, the master node handles the Kubernetes control plane – the environment via which the worker nodes interact with the master node. The control plane exposes the Kubernetes API so the nodes and the containers that host your application can be managed by Kubernetes.
Each worker node hosts one or more pods – a collection of containers under Kubernetes’ control. The various workloads and services that make up your cloud application run in these containers. Crucially however, the containers are not tied to their node machines. Kubernetes can move them around the cluster if necessary to maximise stability and efficiency.
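To make the pod concept concrete, here is a minimal, hypothetical pod manifest (the names and images are illustrative, not from the original article). It defines a pod with two containers that share the pod’s network namespace and IP address:

```yaml
# A sketch of a two-container pod; names and images are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
  - name: app
    image: nginx:1.25          # main application container
  - name: log-sidecar
    image: busybox:1.36        # helper container in the same pod
    command: ["sh", "-c", "tail -f /dev/null"]
```

Because both containers belong to the same pod, they can reach each other over localhost, and Kubernetes schedules, moves or replaces them as a single unit.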
As well as Kubernetes’ auto-scaling and self-healing properties, a key benefit of this architecture is portability. The pods and containers that hold the components of your application will run on any Kubernetes cluster that has been configured in the same way. This means they can be easily moved, duplicated or deleted and rebuilt. With the right operations software and processes in place, such as Weave GitOps, whole clusters can be rolled back or destroyed and recreated in minutes, drastically reducing mean time to recovery (MTTR).
The key components of a Kubernetes cluster
The components of a cluster can be grouped according to whether they run on the master node or the worker nodes. The master node is sometimes referred to simply as the control plane. It maintains the desired state of the cluster, covering which applications should be running and which container images their various components should use.
Source: Kubernetes documentation at kubernetes.io
Master node components
- Etcd is a consistent key-value data store that can be accessed by all nodes in the cluster. It stores configuration data describing the cluster’s state.
- Kube-apiserver responds to requests from the worker nodes. It receives REST requests for modifications and serves as the front end of the cluster’s control plane.
- Kube-scheduler monitors resource utilization across the cluster and schedules pods of containers accordingly. It also decides where services will be deployed.
- Kube-controller-manager runs a number of distinct controller processes in the background, regulating the shared state of the cluster and performing routine tasks. When there is a change to a service, the controller recognizes the change and initiates an update to bring the cluster up to the desired state.
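The scheduler’s placement decisions are driven largely by the resource requests declared in pod specifications. This hypothetical manifest (names and values are illustrative) shows the fields the kube-scheduler considers when matching a pod to a node with enough spare capacity:

```yaml
# A sketch of a pod with resource requests; the scheduler will only
# place it on a node with at least this much unreserved CPU and memory.
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    resources:
      requests:
        cpu: "250m"        # a quarter of one CPU core
        memory: "64Mi"
```

If no node can satisfy the requests, the pod remains in the Pending state until capacity becomes available.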
Worker node components
- The Kubelet monitors all containers in the node, continually checking that they are running and remain in a healthy state. Integrated into the kubelet binary is a piece of software called cAdvisor, which auto-discovers all containers; collects CPU, memory, file system and network usage statistics; and provides machine-level usage stats by analyzing the ‘root’ container.
- Kube-proxy acts as a network proxy and load balancer, forwarding requests to the correct pods.
- As a collection of related containers, a pod is the basic Kubernetes building block. A pod’s containers share network and storage resources, along with a specification for how to run them. Each pod typically has an internal cluster IP address.
- Containers represent the smallest unit. They live inside pods and, because pod IP addresses are internal to the cluster, they must be exposed via an external IP address before processes outside the cluster can reach them.
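Exposing pods to the outside world is usually done with a Service. The following is an illustrative sketch (the names and ports are hypothetical): a LoadBalancer Service that asks the cloud provider for an external IP and routes traffic to all pods carrying a matching label:

```yaml
# A sketch of a Service exposing pods externally; names/ports are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  type: LoadBalancer     # requests an external IP from the cloud provider
  selector:
    app: demo            # route to pods labelled app=demo
  ports:
  - port: 80             # port exposed by the Service
    targetPort: 8080     # port the containers listen on
```

The Service also gives the pods a stable address inside the cluster, so other workloads are unaffected when Kubernetes replaces or moves individual pods.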
What is Kubernetes cluster management?
Cluster management is the process of managing a Kubernetes cluster from the moment of deployment onwards. Key to the success of Kubernetes has been its ability to automate much of this work – and central to that ability is the principle of a ‘desired state’.
Kubernetes is a declarative system, which means that rather than issue specific instructions, you provide it with information that describes the desired state of the cluster, usually in the form of one or more YAML files. Kubernetes will then manage the cluster automatically. If it detects a change – if a container fails, for example – it will perform the necessary actions to return the cluster to that desired state.
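As an example of this declarative style, consider the following sketch of a Deployment manifest (names and images are illustrative). Rather than instructing Kubernetes to start three containers, it simply declares that three replicas should exist; Kubernetes continually reconciles reality against that statement:

```yaml
# A sketch of declarative desired state; names and images are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deployment
spec:
  replicas: 3            # desired state: three identical pods
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: app
        image: nginx:1.25
```

If one of the three pods fails, Kubernetes notices the divergence from the declared state and starts a replacement automatically, with no operator intervention.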
For operations teams, managing Kubernetes clusters in this way still requires a reasonable amount of input, however. To take the next step in automation of a cluster, many organizations are now turning to GitOps.
Endorsed by the Cloud Native Computing Foundation (CNCF), GitOps is an operations model based on a collection of software and processes. It builds on the declarative nature of Kubernetes to automate much more. To do this, it enlists the help of a version control system – usually Git, hence the name – in which to store the YAML files. With the desired state of the cluster described in a Git repository, a software agent can monitor the entire cluster – nodes, pods, containers and workloads – and alert operations personnel the moment any deviation from the desired state is detected.
GitOps has become popular because it allows operations teams to give developers more autonomy, safe in the knowledge that breaking things becomes much more difficult. It also frees operations personnel from repetitive, mundane tasks (such as deploying new versions of services) giving them time to focus on more valuable DevOps work.
The easiest way to get started with GitOps
GitOps tooling is largely open source; however, commercially supported enterprise packages are available from vendors such as Weaveworks.
The quickest and easiest way to evaluate GitOps is to use the free Weave GitOps Core product. Designed to make GitOps as accessible as possible, it can be deployed by typing just two simple lines into a terminal.
To learn more about how GitOps can help you manage Kubernetes more effectively, or to download the software today, visit the Weave GitOps Core product page.
To learn more about putting GitOps to work in the enterprise, consider Weave GitOps Enterprise.