What is Kubernetes?
With Kubernetes you can deploy cloud-native applications anywhere and manage them all in exactly the same way.
Kubernetes is open source software for deploying, scaling and managing containerized applications. As an orchestrator, it handles the work of scheduling containers on a cluster and also manages the workloads to ensure they run as you intended.
Because Kubernetes was designed from the beginning with the idea of software development and operations working together, operations tasks and how they are performed are an integral part of its architecture and design. Almost everything in Kubernetes uses declarative constructs that describe how applications are composed, how they interact and how they are managed. This significantly increases the operability and portability of modern software systems. It also allows developers to adopt GitOps workflows in their development pipelines, which increases the velocity and reliability of feature deployments.
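To make the declarative idea concrete, here is a minimal sketch of a manifest built in Python and serialized to JSON. It describes what should exist (three copies of one container) rather than the steps to get there. The application name `web` and the image `nginx:1.25` are illustrative assumptions, not values from this article.

```python
import json
from typing import Any, Dict


def deployment_manifest(name: str, image: str, replicas: int) -> Dict[str, Any]:
    """Build a Deployment manifest that declares the desired state of an app."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,  # how many pod copies should exist
            "selector": {"matchLabels": labels},  # which pods belong to this Deployment
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # "web" and "nginx:1.25" are placeholder values for illustration only.
    print(json.dumps(deployment_manifest("web", "nginx:1.25", 3), indent=2))
```

Saved to a file, output like this could be handed to `kubectl apply -f`, which accepts JSON as well as YAML. Keeping such files in Git is the starting point for a GitOps workflow.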
Kubernetes Leads the Pack
The graph below shows the skyrocketing interest in the project. This data was compiled from questions asked on StackExchange as well as GitHub stars and forks across three leading orchestrators.
Benefits of Kubernetes
As the first Cloud Native Computing Foundation (CNCF) project, Kubernetes is one of the fastest growing projects in the history of open source software. It became popular for the following key reasons:
Kubernetes offers portability, and faster, simpler deployment times. This means that companies can take advantage of multiple cloud providers if needed and can grow rapidly without having to re-architect their infrastructure.
Kubernetes’ ability to run containers on one or more public cloud environments, in virtual machines, or on bare metal means that it can be deployed almost anywhere. And because Kubernetes has fundamentally changed the way development and deployments are done, teams can also scale much faster than they could in the past.
Kubernetes addresses high availability at both the application and the infrastructure level. Adding a reliable storage layer to Kubernetes helps keep stateful workloads highly available. In addition, the master components of a cluster can be replicated across multiple nodes (multi-master), which further improves availability.
Since Kubernetes is open source, you can take advantage of the vast ecosystem of other open source tools designed specifically to work with Kubernetes without the lock-in of a closed/proprietary system.
Proven, and Battle Tested
A huge ecosystem of developers and tools with 5,608 GitHub repositories and counting means that you won’t be forging ahead into new territory without help.
Kubernetes was originally developed at Google, which still uses it and helps maintain it. This not only gives the project instant credibility, but also means you can expect bug fixes and new features on a regular basis.
Kubernetes 101 Architecture
The way Kubernetes is architected is what makes it powerful. Kubernetes has a basic client and server architecture, but it goes well beyond that. It can perform rolling updates, adapt to additional workloads by autoscaling nodes when needed, and self-heal in the case of a pod meltdown. These innate abilities give developers and operations teams a huge advantage: your applications will have little to no downtime. In this section we provide a brief, high-level overview of the master, its worker nodes, and how Kubernetes manages workloads.
(from The X team blog)
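The rolling updates mentioned above can be sketched as a small simulation. This is a simplified model under stated assumptions: pods are represented only by their version string, and the hypothetical `max_unavailable` parameter limits how many are replaced per step. Real Deployments also wait for readiness checks and can surge extra pods, which this sketch leaves out.

```python
from typing import Iterator, List


def rolling_update(
    pods: List[str], new_version: str, max_unavailable: int = 1
) -> Iterator[List[str]]:
    """Replace pods batch by batch, yielding the pod list after each step."""
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    current = list(pods)
    for start in range(0, len(current), max_unavailable):
        end = min(start + max_unavailable, len(current))
        for i in range(start, end):
            current[i] = new_version  # old pod stopped, new pod started
        yield list(current)  # every pod outside this batch kept serving traffic


if __name__ == "__main__":
    for step in rolling_update(["v1", "v1", "v1"], "v2"):
        print(step)
```

Because only one batch changes at a time, some pods are always available to serve requests, which is where the "little to no downtime" comes from.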
The Kubernetes master is the primary control unit for the cluster. The master is responsible for managing and scheduling the workloads in addition to the networking and communications across the entire cluster.
These are the components that run on the master:
- Etcd Storage – An open-source, distributed key-value data store that holds the cluster’s configuration data and state. The other components read and write this data through the API server.
- Kube-API-Server – The API server handles requests from the worker nodes, receives REST requests for modifications, and serves as the front end for controlling the cluster.
- Kube-scheduler – Schedules the pods on nodes based on resource utilization and also decides where services are deployed.
- Kube-controller-manager – It runs a number of distinct controller processes in the background to regulate the shared state of the cluster and perform routine tasks. When there is a change to a service, the controller recognizes the change and initiates an update to bring the cluster up to the desired state.
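The controller behaviour described in the last item is often called a reconciliation loop: compare the desired state with what is actually running, then correct the difference. Here is a minimal sketch, assuming pods are just names in a list. The `pod-N` naming scheme is invented for this example and is not how Kubernetes names pods.

```python
from typing import List


def reconcile(desired: int, running: List[str]) -> List[str]:
    """One pass of a control loop: return the pod list after corrections."""
    if desired < 0:
        raise ValueError("desired replica count cannot be negative")
    pods = list(running)
    counter = 0
    while len(pods) < desired:  # too few pods: start replacements
        name = f"pod-{counter}"  # hypothetical naming scheme
        counter += 1
        if name not in pods:
            pods.append(name)
    return pods[:desired]  # too many pods: drop the surplus


if __name__ == "__main__":
    pods = reconcile(3, [])
    print("initial:", pods)
    pods.remove("pod-1")  # simulate a pod crashing
    print("after crash:", pods)
    print("healed:", reconcile(3, pods))
```

Running the same function again after a failure restores the missing pod, which is the essence of self-healing: nobody issues a "restart" command, the loop simply notices that reality has drifted from the declared state.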
Worker nodes run the workloads according to the schedule provided by the master. The master components that direct the worker nodes are collectively known as the control plane.
- Kubelet – Kubelet ensures that all containers in the node are running and are in a healthy state. If a node fails, a replication controller observes this change and launches pods on another healthy node. Integrated into the kubelet binary is ‘cAdvisor’, which auto-discovers all containers and collects CPU, memory, file system, and network usage statistics. It also provides machine usage stats by analyzing the ‘root’ container.
- Kube Proxy – It acts as a network proxy and a load balancer. Additionally, it forwards the request to the correct pods across isolated networks in a cluster.
- Pods - A pod is the basic building block of Kubernetes. It represents the workloads that get deployed. Pods are generally collections of related containers, but a pod may also have only one container. The containers in a pod share network and storage resources, along with a specification for how to run them.
- Containers – Containers are the lowest level of a microservice. They are placed inside pods and need external IP addresses to reach any outside processes.
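The load-balancing role of Kube Proxy in the list above can be illustrated with a toy model. This is only a sketch of the idea: the real kube-proxy programs iptables or IPVS rules on each node rather than routing requests in a Python process, and the `ServiceProxy` class, service name and addresses below are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class ServiceProxy:
    """Toy model: map a service name to pod endpoints and rotate through them."""

    def __init__(self, endpoints: Dict[str, List[str]]) -> None:
        # Services without any endpoints are skipped, so routing to them fails.
        self._backends: Dict[str, Iterator[str]] = {
            service: cycle(pods) for service, pods in endpoints.items() if pods
        }

    def route(self, service: str) -> str:
        """Return the pod endpoint that should receive the next request."""
        if service not in self._backends:
            raise LookupError(f"no endpoints for service {service!r}")
        return next(self._backends[service])


if __name__ == "__main__":
    proxy = ServiceProxy({"web": ["10.0.0.1:8080", "10.0.0.2:8080"]})
    for _ in range(4):
        print(proxy.route("web"))
```

The caller only ever names the service. Which pod answers is decided by the proxy, so pods can come and go without clients needing to know.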
Where to Run Kubernetes
What's the first step after you’ve decided to run your applications on Kubernetes? Where will you run the cluster and how are you going to run it?
There are a lot of choices to make when deciding where to run your Kubernetes cluster, many of which depend entirely on your specific requirements.
You may need to look at budget, not only in terms of money, but also in terms of time. How much time can you invest in setting up the cluster and more importantly in maintaining it?
You could have particular security requirements that prevent you from running on a public cloud. This would obviously severely limit the number of choices you have for running your cluster.
Do you have existing infrastructure? Does your company already have servers that need to run some of your infrastructure?
What about your data?
Do you have strict regulations that dictate where your data needs to stay, for example in a particular country?
When deciding where to run your cluster, you need to take all of these factors into account. If you can use a cloud-hosted provider, this is the most convenient route to take. In essence you are letting somebody else run your clusters, and you can also take advantage of the other services the provider offers. This frees you to focus on your actual product and build business value instead of managing infrastructure.
Kubernetes Public Cloud Providers, On-Premise vs. PaaS or Private Cloud
The major cloud providers all have either managed Kubernetes solutions or the option to build your own cluster from scratch. Here’s a short overview with links to more in depth information on each cloud provider.
AWS vs. EKS
Running Kubernetes on Amazon gives you a large number of choices in terms of run-time environments and services that you can take advantage of. You may opt to install and run Kubernetes yourself on AWS. A manual self-install gives you the most flexibility in terms of access to AWS’ services.
But if you don’t want the overhead of having to install and configure Kubernetes yourself, you can use their managed Kubernetes service. With the managed service, the control plane is managed and maintained by Amazon; you are guaranteed a highly available cluster and also have access to the most popular services like CloudWatch and RDS.
- Kubernetes on AWS: Tutorial and Best Practices for Deployment
- Build Your Own Kubernetes Cluster: DIY or Installers (kops, kubeadm, kubicorn)
- Announcing eksctl – create EKS clusters with a single command
GCP, GKE vs. on-prem Kubernetes
An obvious reason to run Kubernetes on GCP is that Google is the creator of Kubernetes. Running your cluster on their platform might give you an edge, since you can take advantage of any new features more quickly. But while you may have features sooner on GCP, it is more of a closed system. However, this can be an advantage if what you want is an automated Kubernetes cluster without having to worry about manually provisioning servers.
Integrated Google Services with GKE.
GKE also integrates with all of Google’s other tooling and it comes with built-in logging, log management, and monitoring at both the host and container levels. It can also give you auto-scaling, automatic hardware management, and automatic version updates. GKE in general gives you a production-ready cluster with a more “batteries-included” approach than if you were building everything from the ground up.
If you need to keep your data and your servers in-house, then you will need to install and update Kubernetes on bare metal servers yourself. There are of course pros and cons to this. The advantage is that you have control over your data and servers. The disadvantages are that it can be complex to set up and may need many people to maintain your infrastructure.
Weaveworks also offers Production Grade Kubernetes Support for enterprises. For the past 3 years, Kubernetes has been powering Weave Cloud, our operations as a service offering, so we couldn’t be more excited to share our knowledge and help teams embrace the benefits of cloud native tooling.
We focus on creating GitOps workflows, building from our own experiences of running Kubernetes in production. Our approach uses developer-centric tooling (e.g. git) and a tested approach to help you install, set-up, operate and upgrade Kubernetes.
PaaS, and Private Cloud Solutions
There are also a huge number of PaaS solutions as well as private cloud options which offer something in between a fully locked-in solution and complete freedom to choose the tools you need. Depending on your budget, and if you don’t want to take advantage of OSS tools (in most cases), one of these options may be what you are looking for.
Some of these options include:
Another essential step in your Kubernetes journey is building out your continuous integration and continuous delivery (CICD) pipelines. In order to increase the velocity of your team, as well as reap the other benefits of automation, you’ll need to think carefully about how you will make the transition as well as what tools you want to use.
Why automate your pipeline?
- Reduce your time to market from weeks and months to days or hours. With an automated pipeline, development teams improve both the velocity of releases and the quality of the code. New features and fixes added continuously in small increments can result in a product with fewer defects.
- A stronger development team. Because all stages of the delivery pipeline are available for anyone on the team to examine, improve upon and verify, a sense of ownership over the build is created that encourages strong teamwork and collaboration across your entire organization.
- Reduced risks and costs of software development. Automation encourages developers to verify code changes in stages before moving forward, reducing the chances of defects ending up in production.
- Less work in progress. A CD pipeline provides a rapid feedback loop starting from development through to your customers. This iteration cycle not only helps you build the right product, but it also allows developers to make product improvements more quickly, leaving less work in progress to linger.
One of the difficulties in creating a pipeline is successfully gluing together the tools that you would like to use (both open source and closed source). When you are designing your deployment pipelines these are some of the main concerns:
- End-to-end security across the entire pipeline
- Ability to rollback with a fully reproducible audit trail
- Built-in observability and alerting
- A fast Mean Time to Deployment as well as a fast Mean Time to Recovery
- Simple developer experience and workflows
- CICD for Kubernetes
- What CICD tool should I use?
- Top CICD Tools for Kubernetes
- Let’s do GitOps and Not CIOps
Best Practices for Developing Apps on Kubernetes
Because Kubernetes is so flexible, there are many different ways to accomplish the same task. When developing applications on Kubernetes these are the areas that you will need your team to manage:
Use Helm charts to package your applications; remember that downstream dependencies are mostly unreliable; don’t make your microservices too micro; use namespaces to split up your cluster; and enable Role-Based Access Control on your cluster for your development team.
Don’t use type: LoadBalancer when a Kubernetes NodePort service can be “good enough”, use static IPs, and map external services to internal ones.
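As a sketch of the networking advice above, the two functions below build a NodePort service and an ExternalName service (one way to map an external service to an internal name) as JSON-ready dictionaries. The service names, ports and the host `db.example.com` are illustrative assumptions. The 30000 to 32767 range is the Kubernetes default for node ports and may be configured differently on your cluster.

```python
import json
from typing import Any, Dict

# Default NodePort range in Kubernetes; clusters can be configured differently.
NODE_PORT_RANGE = range(30000, 32768)


def node_port_service(name: str, port: int, node_port: int) -> Dict[str, Any]:
    """Expose pods labelled app=<name> on a fixed port of every node."""
    if node_port not in NODE_PORT_RANGE:
        raise ValueError(f"node_port {node_port} is outside the default range")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port, "nodePort": node_port}],
        },
    }


def external_name_service(name: str, external_host: str) -> Dict[str, Any]:
    """Give an external host a stable in-cluster service name."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"type": "ExternalName", "externalName": external_host},
    }


if __name__ == "__main__":
    # All names and hosts below are placeholders for illustration.
    print(json.dumps(node_port_service("web", 8080, 30080), indent=2))
    print(json.dumps(external_name_service("database", "db.example.com"), indent=2))
```

With the ExternalName service in place, application code can refer to `database` everywhere, and only the manifest changes if the external host moves.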
There’s some common sense advice on building your containers that you should consider when developing apps. Things like: not trusting arbitrary base images, keeping base images small and using a builder pattern for static languages.
For things that you put inside your containers, there are also some mostly obvious best practices. These include running as a non-root user, making the filesystem read-only, running one process per container, crashing cleanly rather than restarting on failure, and, of course, logging everything to stdout and stderr.
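Two of these practices, logging to stdout and stderr and exiting cleanly instead of restarting in-process, can be sketched with the Python standard library. This is a minimal example under stated assumptions: the logger name `app` and the `max_ticks` parameter are made up for illustration, and the work loop is bounded only so the sketch terminates. Kubernetes sends SIGTERM to a container before stopping it, so that is the signal handled here.

```python
import logging
import signal
import sys
import threading


def configure_logging() -> logging.Logger:
    """Send normal logs to stdout and errors to stderr, never to files."""
    logger = logging.getLogger("app")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        out = logging.StreamHandler(sys.stdout)
        out.addFilter(lambda record: record.levelno < logging.ERROR)
        err = logging.StreamHandler(sys.stderr)
        err.setLevel(logging.ERROR)
        logger.addHandler(out)
        logger.addHandler(err)
    return logger


def install_shutdown_handler(stop: threading.Event) -> None:
    """Set `stop` when SIGTERM arrives so the main loop can finish cleanly."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())


def main(max_ticks: int = 3) -> int:
    log = configure_logging()
    stop = threading.Event()
    install_shutdown_handler(stop)
    log.info("started")
    try:
        # Bounded for the sketch; a real service would loop until stopped.
        for _ in range(max_ticks):
            if stop.wait(timeout=0.1):
                break
            # One unit of real work would go here.
    except Exception:
        log.exception("fatal error, exiting")
        return 1  # exit non-zero and let Kubernetes restart the pod
    log.info("stopped cleanly")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

The process never tries to restart itself. It reports what happened, exits with a meaningful status code, and leaves the restart decision to the orchestrator.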
- Top 5 Kubernetes Best Practices From Sandeep Dinesh (Google)
- Your Guide to a Production Ready Kubernetes
Monitoring and Observability
Monitoring often gets left to the end, but it is too important for that. To be effective, monitoring is something that your development team should be thinking about during the application design phase.
Monitoring is at the core of observability. Cloud native systems are distributed systems, with dozens of pieces across many servers across multiple regions. With this much complexity, you need to think about how to monitor it up front.
If some container goes down on one of your servers, is this something you need to care about? You may not know necessarily without context. You will need a monitoring tool that will provide you with this context.
Prometheus is often the first choice for open source, cloud native monitoring: it has a proven track record and integrates well with Kubernetes. The Prometheus server has an internal database that stores the metrics it has pulled from your services. Once your code is instrumented, Prometheus uses Kubernetes service discovery to find out everything that’s running. This saves you from having to push data to Prometheus yourself. Prometheus instead goes out to find your services.
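To show what "instrumented" means in the pull model, here is a minimal sketch of a service exposing a `/metrics` endpoint in the Prometheus text exposition format, using only the Python standard library. The metric name `app_requests_total`, port 8000 and the `--serve` flag are assumptions made for this example. In practice you would more likely use an official Prometheus client library than render the format by hand.

```python
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict

# Hypothetical application counter that Prometheus would scrape.
COUNTERS: Dict[str, int] = {"app_requests_total": 0}


def render_metrics(counters: Dict[str, int]) -> str:
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            COUNTERS["app_requests_total"] += 1  # count ordinary requests
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__" and "--serve" in sys.argv:
    # Only starts the server when asked to, e.g. `python metrics.py --serve`.
    HTTPServer(("", 8000), MetricsHandler).serve_forever()
```

The service never contacts Prometheus. It only answers when scraped, which is why service discovery on the Prometheus side is enough to start collecting metrics from newly deployed pods.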
The Future of Kubernetes
According to the CNCF, Kubernetes is now the second largest open source project in the world, just behind Linux.
Since the introduction of Kubernetes, you can safely say that almost all of the other orchestrators are either irrelevant or have taken a back seat to Kubernetes. And now just over four years later, every major public cloud provider has a managed Kubernetes service or is in the process of developing one.
Because of this huge uptake and adoption, you can be assured that Kubernetes is here to stay.
Need Help To Become Production Ready?
For the past 3 years, Kubernetes has been powering Weave Cloud, our operations as a service offering. We’re happy to share our knowledge and help teams embrace the benefits of on-premise or public cloud installations of Kubernetes.
Download our latest white paper and find out what production ready Kubernetes means, the cultural changes you need to make on your team, as well as the most important requirements to consider when using and taking advantage of Kubernetes in production.
Contact us for more details on our Kubernetes support packages or join our regular weekly online user group where we discuss and take questions on tools and methods for developing applications on Kubernetes.