Cluster Management at Weave
To manage our SaaS, we settled on the following requirements, both to guide us through the myriad of available tools and to minimize our team’s cognitive load:
- Keep the “state” for a given layer in version control (a Git repo). This lets us compare the running system against the version-controlled state to determine whether anything is missing or wrong and, if something breaks, roll back to a previous known-good state.
- Share as much configuration between environments as possible. Ideally the tools would support some concept of parameterizable “modules” that can be “instantiated” for each cluster, so that the differences between clusters are expressed minimally yet remain easy to audit.
- Alert us if the state of a cluster diverges from the configuration.
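The second requirement, one shared definition instantiated per cluster with differences that are easy to audit, can be illustrated with a minimal Python sketch. The `ClusterConfig` class, its fields and their default values are hypothetical and chosen only for illustration; they are not taken from the Weave Cloud configuration.

```python
from dataclasses import dataclass, asdict, replace


@dataclass(frozen=True)
class ClusterConfig:
    """Parameters of a hypothetical shared cluster module."""
    name: str
    instance_type: str = "m4.large"  # illustrative default, not from the source
    node_count: int = 3
    region: str = "us-east-1"


def differences(a: ClusterConfig, b: ClusterConfig) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# One shared definition, instantiated once per cluster.
dev = ClusterConfig(name="dev")
prod = replace(dev, name="prod", node_count=12)

print(differences(dev, prod))
# {'name': ('dev', 'prod'), 'node_count': (3, 12)}
```

Because every cluster starts from the same defaults, the audit reduces to reading the short list of overridden fields.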
How Weaveworks Organizes its Cluster
To help us manage and understand our SaaS, we divided our infrastructure into three basic layers: the infrastructure layer, the Kubernetes layer and the application layer.
[Diagram: Infrastructure Layers in Weave Cloud]
Below, we will describe the responsibilities of each layer to the layers above it, and the tools we use to manage each of them.
At the bottom of this diagram is the infrastructure layer. This is where we group items that are responsible for provisioning and maintaining the VMs, as well as networking, security groups, ELBs, RDS instances, DynamoDB tables, S3 buckets, IAM roles etc.
Terraform provisions our infrastructure resources
We chose Terraform, from HashiCorp, to provision our infrastructure resources. Terraform allows us to declaratively list the resources we want in a high-level configuration language, describe the interdependencies between them, and intelligently apply changes to this config in an order that honors the dependencies.
Terraform supports parameterizable, instantiable modules, so we can share the resource declarations for VMs etc. between clusters while parameterizing each cluster for the number of minions, instance type and so on. Terraform also has a “plan” mode, which shows us whether our configuration matches reality. We combine this with prom-run, which runs a command and exports its exit code to Prometheus in Weave Cloud, to build a terradiff job that we monitor and alert on.
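The idea behind a plan-based drift check can be sketched in a few lines of Python. This is not the terradiff source, and the text does not say which flags terradiff passes; the sketch assumes Terraform's `-detailed-exitcode` option, which makes `terraform plan` exit with 0 when nothing would change, 1 on error and 2 when the configuration and the real infrastructure differ. The function names are hypothetical.

```python
import subprocess

# Exit codes of `terraform plan -detailed-exitcode`.
PLAN_STATUS = {
    0: "in sync",         # plan succeeded, no changes pending
    1: "error",           # plan itself failed
    2: "drift detected",  # plan succeeded, changes pending
}


def classify_plan(exit_code: int) -> str:
    """Translate a plan exit code into a human-readable status."""
    return PLAN_STATUS.get(exit_code, "unknown")


def should_alert(exit_code: int) -> bool:
    """Anything other than a clean, empty plan deserves attention."""
    return exit_code != 0


def run_plan(workdir: str) -> int:
    """Run `terraform plan` in workdir and return its exit code.

    Requires the terraform binary and an initialized working directory.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return result.returncode
```

A scheduler would call `run_plan` periodically and publish the returned code as a metric, so that an alert rule can fire whenever `should_alert` would be true.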
The Infrastructure layer exports a limited interface to the next layer up, including the VMs’ public and private hostnames and IP addresses, the SSH key and the Pod CIDRs for each machine. This separation is not only good engineering practice; it also allows us to make isolated changes to individual layers without putting the entire stack at risk.
The Kubernetes layer is responsible for provisioning and maintaining the various components that make up Kubernetes – the Docker engine, the Kubelet, the Kube-proxy, the API Server, the Scheduler, the Controller Manager, and etcd.
We evaluated tools such as Chef, Puppet and Salt to manage this layer, but in the end we chose Ansible:
- it is masterless, which means we don’t have to maintain a separate system to run an Ansible master
- it is agentless, which reduces the amount of bootstrapping needed
By combining the -C flag (check mode) with the -D flag (diff mode), Ansible shows us where the live system differs from the checked-in config. We use this together with `prom-run` to build an `ansiblediff` job, this layer's equivalent of the `terradiff` job. When reality diverges from the expected configuration, alerts are sent to our Slack channel so that we can respond right away.
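As an illustration of how a check-mode run can be turned into a drift signal, the Python sketch below parses the `PLAY RECAP` section that `ansible-playbook` prints at the end of a run and lists the hosts that report `changed` greater than zero. In check mode a non-zero `changed` count means the playbook would have modified the host. The helper names and the sample output are hypothetical, and this is not how ansiblediff itself is implemented.

```python
import re

# Matches recap lines such as: "node-1 : ok=12 changed=0 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")

SAMPLE = """\
PLAY RECAP *********************************************************************
node-1                     : ok=12   changed=0    unreachable=0    failed=0
node-2                     : ok=11   changed=2    unreachable=0    failed=0
"""


def parse_recap(output: str) -> dict:
    """Map each host listed after PLAY RECAP to its integer counters."""
    hosts = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (item.split("=") for item in match.group("counters").split())
            hosts[match.group("host")] = {key: int(value) for key, value in pairs}
    return hosts


def drifted_hosts(output: str) -> list:
    """Hosts where a check-mode run reports pending changes."""
    recap = parse_recap(output)
    return sorted(host for host, c in recap.items() if c.get("changed", 0) > 0)


print(drifted_hosts(SAMPLE))  # ['node-2']
```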
This layer is probably the least well-defined in our stack; it also contains a number of other unrelated tasks such as:
- formatting the ephemeral disks that come with our VMs and then mounting them in a place where Docker and Kubernetes can use them, something that was surprisingly easy with Ansible’s LVM modules.
- managing our own certificate authority and issuing individual, role-based certificates for each of the Kubernetes components and clients.
The rest of the Kubernetes components (API server, scheduler, controller manager, etcd and kube-proxy) run as Static Pods: Pods whose config lives on disk on the node where they run, and where the kubelet is responsible for ensuring that the running Pod matches the on-disk config.
The configuration for Static Pods is also version controlled, which allows us to easily modify the command-line flags for these components and then have ansiblediff alert us in Slack if we forget to deploy a change to every node.
The application layer describes all of the Kubernetes Deployments, Services and DaemonSets in YAML files that are, of course, version controlled in Git. We’ve also written a tool called kubediff, which checks that the checked-in configuration doesn’t differ from what’s currently running.
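One way to compare a checked-in object with the running one is to verify that every field in the checked-in configuration is present, with the same value, in the running object, while ignoring fields that the cluster adds on its own (status, defaulted values). The Python sketch below shows that comparison on plain dictionaries. It is an assumption about the general approach, not the kubediff implementation, and the example objects are heavily simplified.

```python
def diff_config(desired, running, path=""):
    """Return a list of paths where `running` does not satisfy `desired`.

    Only fields present in `desired` are checked, so extra fields in
    `running` never count as a difference.
    """
    problems = []
    if isinstance(desired, dict):
        if not isinstance(running, dict):
            return [f"{path or '.'}: expected a mapping, found {running!r}"]
        for key, value in desired.items():
            child = f"{path}.{key}"
            if key not in running:
                problems.append(f"{child}: missing")
            else:
                problems.extend(diff_config(value, running[key], child))
    elif isinstance(desired, list):
        if not isinstance(running, list) or len(desired) != len(running):
            return [f"{path or '.'}: list differs"]
        for index, (want, have) in enumerate(zip(desired, running)):
            problems.extend(diff_config(want, have, f"{path}[{index}]"))
    elif desired != running:
        problems.append(f"{path or '.'}: {running!r} != {desired!r}")
    return problems


# Simplified, hypothetical objects: the repo asks for 3 replicas, 2 are running.
checked_in = {"spec": {"replicas": 3, "template": {"image": "example/app:v2"}}}
live = {
    "spec": {"replicas": 2, "template": {"image": "example/app:v2"}},
    "status": {"readyReplicas": 2},  # added by the cluster, ignored by the diff
}

print(diff_config(checked_in, live))  # ['.spec.replicas: 2 != 3']
```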
We treat the configurations for services like Kube-DNS or the Kubernetes Dashboard the same way we treat our application microservices and use our own tools for Continuous Deployment and monitoring to manage them.
For more details on how Weave sets up and manages Kubernetes on AWS, you can read about it at “Provisioning and Lifecycle of a Production Ready Kubernetes Cluster”.
Managing the Weave Cloud Cluster
We use GitOps practices at Weaveworks to manage our cluster. GitOps means that the entire state of our system is stored in version control. In our case, the state is checked and enforced by three jobs that correspond to the three layers:
- Terradiff: monitors differences between version control and production for components in the Infrastructure layer
- Ansiblediff: monitors differences between version control and production for components in the Kubernetes layer
- Kubediff: monitors differences between version control and production for the services in the Application layer
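To make the monitoring side concrete, the sketch below renders the latest exit code of each diff job in the Prometheus text exposition format and lists the jobs that would trigger an alert. The metric name `diff_exit_code` and the label `check` are hypothetical; the text does not state which metric names prom-run actually exports.

```python
def exposition(exit_codes: dict, metric: str = "diff_exit_code") -> str:
    """Render exit codes as gauge samples in Prometheus text format."""
    lines = [f"# TYPE {metric} gauge"]
    for check in sorted(exit_codes):
        lines.append(f'{metric}{{check="{check}"}} {exit_codes[check]}')
    return "\n".join(lines) + "\n"


def firing(exit_codes: dict) -> list:
    """Names of the checks whose last run exited with a non-zero code."""
    return sorted(check for check, code in exit_codes.items() if code != 0)


# Hypothetical snapshot: only the infrastructure layer has drifted.
latest = {"terradiff": 2, "ansiblediff": 0, "kubediff": 0}

print(exposition(latest), end="")
print(firing(latest))  # ['terradiff']
```

Treating all three layers the same way means a single alert rule on a non-zero value can cover the whole stack.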
All the configuration files for the cluster are version controlled, and modifications to any component may be rolled out with zero downtime and rolled back easily if they break anything.
Take a look at our on-demand webinar "Kubernetes and AWS - a perfect match for Weave Cloud", in which Chris Hein (Solutions Architect for AWS) and Stuart (Director of Product for Weaveworks) discuss how we have been running Weave Cloud on AWS in production and at scale for the past 18 months.