Application Checklist for Kubernetes
Find out what the best practices are for running applications in Kubernetes with this convenient application checklist.
Last week we ran through what you need for a production-ready cluster. In this second part, we’ll outline a checklist of best practices for applications running in Kubernetes.
If any of these topics interest you, and you are in London, we’re also running a workshop called “Production Ready Kubernetes”. Sign up here.
Application Checklist for Kubernetes
These are the areas that need attention before running your applications in production.
| Item | What is it | Why you need it | Options |
|---|---|---|---|
| Readiness check | Endpoints for Kubernetes to monitor your application lifecycle. | Allows Kubernetes to restart or stop traffic to a pod. Readiness failure is transient and tells Kubernetes to route traffic elsewhere. Readiness failure is useful for startup and load management. | Read: Resilient apps with Liveness and Readiness probes. |
| Liveness check | Endpoints for Kubernetes to monitor your application lifecycle. | A liveness failure tells Kubernetes to restart the pod. | |
| Metric instrumentation | Code and libraries used in your code to expose metrics. | Allows measuring the operation of the application and enables many more advanced use cases. | Prometheus, New Relic, Datadog and others. Read: Monitoring Kubernetes with Prometheus. |
| Dashboards | View of metrics. | You need to make sense of the data. | Grafana |
| Playbooks/Runbooks | Rich guides for your engineers on how to operate the system and find faults when things go wrong. | Nobody is at their sharpest at 03:00 AM. Knowledge deteriorates over time. | Weave Cloud Notebooks |
| Limits and requests | Explicit resource allocation for pods. | Allows Kubernetes to make good scheduling decisions. | Read: Kubernetes Pod Resource Limitations and Quality of Service. |
| Labels and annotations | Metadata held by Kubernetes. | Makes workload management easier and allows other tools to work with standard Kubernetes definitions. | Read: Labels and Selectors in Kubernetes. |
| Alerts | Automated notifications on a defined trigger. | You need to know when your service degrades. | Prometheus & Alertmanager. Read: Labels in Prometheus alerts: think twice before using them. |
| Structured logging output | Output logs in a machine-readable format to facilitate searching and indexing. | Trace what went wrong when something does. | ELK stack (Elasticsearch, Logstash and Kibana). Many commercial offerings. |
| Tracing instrumentation | Instrumentation to send request processing details to a collection service. | Sometimes the only way of figuring out where latency is coming from. | Zipkin, Lightstep, Appdash, Tracer, Jaeger |
| Graceful shutdowns | Applications respond to SIGTERM correctly. | This is how Kubernetes will tell your application to end. | Read: 10 tips for Building and Managing Containers |
| Graceful dependency (w. Readiness check) | Applications don’t assume dependencies are available. Wait for other services before reporting ready. | Avoid headaches that come with a service order requirement. | Read: 10 tips for Building and Managing Containers |
| Configmaps | Define a configuration file for your application in Kubernetes using configmaps. | Easy to reconfigure an app without rebuilding, allows config to be versioned. | Read: Best Practices for Designing and Building Containers for Kubernetes |
| Label the Docker images with the code commit SHA. | | Makes tracing image to code trivial. | Read: Introduction to Kubernetes Security |
| Locked down runtime context | Use deliberately secure configuration for application runtime context. | Reduces attack surface, makes privileges explicit. | Read: Continuous Security for GitOps |
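To make the readiness and liveness rows concrete, here is a minimal sketch of the two endpoints using only the Python standard library. The paths `/healthz` and `/readyz`, the `ready` flag, the `serve` helper and port 8080 are assumptions chosen for illustration, not something the checklist prescribes; a real service would usually add these routes to its existing web framework.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: set it once startup work (warm-up, dependency checks) is done.
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            # Liveness: the process is up and able to answer requests.
            self._reply(200, b"ok")
        elif self.path == "/readyz":
            # Readiness: a 503 is transient and asks for traffic to go elsewhere.
            if ready.is_set():
                self._reply(200, b"ready")
            else:
                self._reply(503, b"not ready")
        else:
            self._reply(404, b"not found")

    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the logs in this sketch


def serve(port=8080):
    """Start the health server in a background thread and return it."""
    server = ThreadingHTTPServer(("", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The pod spec would then point its liveness probe at `/healthz` and its readiness probe at `/readyz`. Keeping the two separate matters: clearing `ready` under load only stops traffic, whereas a failing liveness endpoint gets the pod restarted.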
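The graceful dependency row can be sketched as a retry loop that runs before the application reports ready. This is only an illustration under stated assumptions: `dependency_reachable` and `wait_for` are hypothetical helper names, a plain TCP connect stands in for whatever check suits the dependency, and the backoff values are arbitrary.

```python
import socket
import time


def dependency_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for(check, attempts=10, first_delay=0.5, max_delay=10.0, sleep=time.sleep):
    """Call check() until it returns True, doubling the delay between tries.

    Returns True as soon as the check passes, or False once all attempts
    are used up. The sleep function is injectable so the loop can be tested.
    """
    delay = first_delay
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

A service might call `wait_for(lambda: dependency_reachable("db", 5432))` at startup (host and port are placeholders) and only start answering its readiness check with success once that returns True, so no start order between services has to be enforced.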
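For the graceful shutdowns row, a rough sketch of responding to SIGTERM in a Python worker follows. The names `stop_requested`, `request_stop`, `install_sigterm_handler` and `run_until_stopped` are made up for this example; the idea is simply that the signal handler sets a flag and the main loop finishes its current unit of work, cleans up, and exits.

```python
import signal
import threading

# Set when the process has been asked to stop.
stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: record the request and let the main loop wind down."""
    stop_requested.set()


def install_sigterm_handler():
    # Must be called from the main thread, as Python requires for signal.signal.
    signal.signal(signal.SIGTERM, request_stop)


def run_until_stopped(handle_one, cleanup, idle_seconds=0.5):
    """Process work until a stop is requested, then run cleanup exactly once."""
    while not stop_requested.is_set():
        handle_one()
        # Wait between iterations, but wake up immediately on a stop request.
        stop_requested.wait(idle_seconds)
    cleanup()  # e.g. flush buffers, close connections
```

In a real entry point you would call `install_sigterm_handler()` first and then `run_until_stopped(...)`. The cleanup needs to finish within the pod's termination grace period, after which Kubernetes sends SIGKILL.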
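As an illustration of the structured logging row, the sketch below emits one JSON object per log line using the standard `logging` module. `JsonFormatter`, `make_logger` and the `fields` key passed through `extra` are conventions invented for this example, and the chosen field names (`ts`, `level`, `logger`, `message`) are an assumption rather than a required schema.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge caller-supplied context passed as extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(name, stream=sys.stdout):
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

A call such as `make_logger("checkout").info("order placed", extra={"fields": {"order_id": 42}})` writes a single line to stdout that a log pipeline can index by `level` or `order_id` without regex parsing.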