How many Kubernetes ReplicaSets are in your cluster?
Are your Kubernetes ReplicaSets slowing you down? With a quick little cleanup, our CPU load went down by around 10%! Here's a quick overview of how you can check how many you have, set a revision limit and even request a rollback.
Did you know that your Kubernetes cluster is most likely keeping dozens or hundreds of objects as a history of what has been rolled out?
To start, run this little command:
kubectl get rs --all-namespaces --no-headers | wc -l
At Weaveworks, we had over 2,500 in our development cluster. They don’t cost much just sitting there in Kubernetes’ etcd store, but if you run monitoring or continuous validation tools, they may be slowing you down. When we cleaned out all the old ReplicaSets, our CPU load went down by around 10%!
How does this come about? If you use Deployments to manage your Kubernetes workloads, and most people do, then they will leave behind one ReplicaSet for each change you make. This is how rolling updates work - the Deployment creates a new ReplicaSet, then gradually scales up the new one and scales down the old one until all the pods are on the new version. The old ReplicaSet is left in place, scaled to zero. And, by default, it keeps all of the old ones around as history!
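To get a rough idea of how many of those ReplicaSets are leftover history, you can count the ones scaled to zero in each namespace. This is only a sketch: `idle_rs_by_namespace` is a made-up helper name, and it assumes the usual column order of `kubectl get rs --all-namespaces` (NAMESPACE, NAME, DESIRED, CURRENT, READY, AGE). A Deployment you have deliberately scaled to zero will also show up in the count.

```shell
# Hypothetical helper: reads the output of
#   kubectl get rs --all-namespaces --no-headers
# on stdin and prints "<namespace> <count>" for ReplicaSets whose
# DESIRED column (third field) is 0.
idle_rs_by_namespace() {
  awk '$3 == 0 { idle[$1]++ } END { for (ns in idle) print ns, idle[ns] }' | sort
}

# Usage (needs access to a cluster):
#   kubectl get rs --all-namespaces --no-headers | idle_rs_by_namespace
```

Reading from stdin keeps the counting separate from the cluster call, so you can also run it against saved output.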
Run this to see the history - it lists each revision along with its recorded change cause (the command used for the change, if that was recorded), and you can use this history to request a rollback or undo.
kubectl rollout history deployment <some-deployment>
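Once you have picked a revision number from that history, a rollback might look like the sketch below. The `rollback` function and the `KUBECTL` override are illustrative names of my own, not part of kubectl; the underlying command is `kubectl rollout undo` with its `--to-revision` flag.

```shell
# Hypothetical helper: roll a Deployment back to a given revision.
# Set KUBECTL=echo to print the command instead of running it.
rollback() {
  deployment=$1
  revision=$2
  ${KUBECTL:-kubectl} rollout undo "deployment/$deployment" --to-revision="$revision"
}

# Usage (needs access to a cluster):
#   rollback some-deployment 3
```

Keep in mind that a revision can only be rolled back to while its ReplicaSet still exists, so whatever history limit you choose also limits how far back you can go this way.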
At Weaveworks we do GitOps - the master store for all modifications to a cluster is a Git repository, so we can naturally see all the history there and roll back to any point in time. And even if you prefer to have Kubernetes holding the history, you should set some limit so you don’t get thousands of them building up over time.
Here’s how to set a revision history limit, in the YAML definition of a Deployment:
kind: Deployment
spec:
  replicas: 3
  revisionHistoryLimit: 5
  template:
    spec:
      ...
This will make Kubernetes keep the current version and five historical versions and clear out any older data, saving space in etcd and saving time and space in your monitoring infrastructure.
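If you want to try the limit on a Deployment that is already running before you change its YAML, one option is `kubectl patch`. The sketch below assumes a helper name (`set_history_limit`) and a `KUBECTL` override that are purely illustrative. If your Deployments are managed from Git, as ours are, make the change there as well, or the next sync may revert it.

```shell
# Hypothetical helper: set spec.revisionHistoryLimit on an existing Deployment.
# Set KUBECTL=echo to print the command instead of running it.
set_history_limit() {
  namespace=$1
  deployment=$2
  limit=$3
  ${KUBECTL:-kubectl} patch deployment "$deployment" --namespace "$namespace" \
    --patch "{\"spec\":{\"revisionHistoryLimit\":$limit}}"
}

# Usage (needs access to a cluster):
#   set_history_limit default some-deployment 5
```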
UPDATE: I was looking through the new features in Kubernetes 1.9, and I realised this issue is already fixed, though only if you move to the newer resource types. Specifically, apiVersion: extensions/v1beta1 does not set a default, while apps/v1beta1 will default to 2, and in Kubernetes 1.9 apps/v1 will default to 10. So this is only really a problem for those of us that started with Kubernetes a while back.
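To check which of your Deployments still have no limit, you could list the field for each one and filter for the empty ones. This is a sketch under a couple of assumptions: `unlimited_history` is a made-up helper name, and it relies on kubectl's custom-columns output printing `<none>` for an unset field. On clusters where the API server fills in a default, nothing will be reported.

```shell
# Hypothetical helper: reads "<namespace> <name> <limit>" rows on stdin and
# prints "<namespace>/<name>" for rows where the limit is shown as <none>.
unlimited_history() {
  awk '$3 == "<none>" { print $1 "/" $2 }'
}

# Usage (needs access to a cluster):
#   kubectl get deployments --all-namespaces --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,LIMIT:.spec.revisionHistoryLimit \
#     | unlimited_history
```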