Last week we did a rundown of what you need for a production-ready cluster. In this second part we’ll outline a checklist of best practices for applications running in Kubernetes.

If any of these topics interest you, and you are in London, we’re also running a workshop called “Production Ready Kubernetes”. Sign up here.

Application Checklist for Kubernetes

These are the areas that need attention before running your applications on a production cluster.

For each item, the checklist covers what it is, why you need it, and the options or further reading available.

Readiness check
What it is: Endpoints for Kubernetes to monitor your application lifecycle.
Why you need it: Allows Kubernetes to restart a pod or stop sending traffic to it. Readiness failure is transient and tells Kubernetes to route traffic elsewhere, which is useful for startup and load management.
Options: Read: Resilient apps with Liveness and Readiness probes.
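
As a rough illustration of a readiness endpoint, here is a minimal sketch using only the Python standard library. The `/readyz` path, the port and the `ready` flag are assumptions made for this example; the path and port would have to match whatever the pod's readiness probe is configured to call.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once startup work (cache warm-up, migrations, ...) has finished.
# Clear it again to shed load temporarily without being restarted.
ready = threading.Event()


def readiness_status() -> int:
    """Return 200 when the app can take traffic, 503 otherwise."""
    return 200 if ready.is_set() else 503


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/readyz" is a hypothetical path chosen for this sketch.
        status = readiness_status() if self.path == "/readyz" else 404
        self.send_response(status)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep probe traffic out of the application log


def serve(port: int = 8080) -> None:
    """Blocking call; typically run in a background thread."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

The application would call `ready.set()` when it has finished starting up, and could call `ready.clear()` while it is overloaded so that traffic is routed elsewhere in the meantime.
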
Liveness check
What it is: Endpoints for Kubernetes to monitor your application lifecycle.
Why you need it: Liveness failure tells Kubernetes to restart the pod.
Options: Read: Resilient apps with Liveness and Readiness probes.
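
One possible way to back a liveness endpoint is a heartbeat that the main work loop updates; if the heartbeat goes stale, the check fails and Kubernetes restarts the pod. The sketch below assumes this heartbeat approach, and the `Heartbeat` class and its threshold are hypothetical names for illustration.

```python
import time
from typing import Optional


class Heartbeat:
    """Tracks when the main work loop last made progress."""

    def __init__(self, max_age_seconds: float):
        self.max_age_seconds = max_age_seconds
        self.last_beat = time.monotonic()

    def beat(self) -> None:
        """Call this from the work loop each time it completes a cycle."""
        self.last_beat = time.monotonic()

    def liveness_status(self, now: Optional[float] = None) -> int:
        """Return 200 while the loop is progressing, 500 once it looks stuck."""
        now = time.monotonic() if now is None else now
        return 200 if now - self.last_beat <= self.max_age_seconds else 500
```

The status code would be served from an HTTP handler that the liveness probe calls. Unlike readiness, a failure here should mean the process cannot recover on its own.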

Metric instrumentation
What it is: Code and libraries used in your application to expose metrics.
Why you need it: Lets you measure how the application is operating and enables many more advanced use cases.
Options: Prometheus, New Relic, Datadog and others. Read: Monitoring Kubernetes with Prometheus.
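
To give a feel for what instrumentation produces, the following sketch hand-rolls a request counter and renders it in the Prometheus text exposition format. In practice you would more likely use a client library; the `RequestMetrics` class and the metric name are assumptions for this example, and the counter is not thread-safe.

```python
from collections import Counter


class RequestMetrics:
    """Counts handled requests by method and status code."""

    def __init__(self):
        self._requests = Counter()

    def observe(self, method: str, status: int) -> None:
        self._requests[(method, str(status))] += 1

    def render(self) -> str:
        """Render the counters as Prometheus text exposition lines."""
        lines = [
            "# HELP http_requests_total Total HTTP requests handled.",
            "# TYPE http_requests_total counter",
        ]
        for (method, status), count in sorted(self._requests.items()):
            lines.append(
                f'http_requests_total{{method="{method}",status="{status}"}} {count}'
            )
        return "\n".join(lines) + "\n"
```

The output of `render()` would be served on a metrics endpoint for the monitoring system to scrape.
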
Dashboards
What it is: View of metrics.
Why you need it: You need to make sense of the data.
Options: Grafana, Weave Cloud.

Playbooks/Runbooks
What it is: Rich guides for your engineers on how to operate the system and find faults when things go wrong.
Why you need it: Nobody is at their sharpest at 03:00 AM, and knowledge deteriorates over time.
Options: Markdown files, Weave Cloud Notebooks.

Limits and requests
What it is: Explicit resource allocation for pods.
Why you need it: Allows Kubernetes to make good scheduling decisions.
Options: Read: Kubernetes Pod Resource Limitations and Quality of Service.
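
Requests and limits are declared per container in the pod spec. The sketch below builds such a spec as a Python dictionary that can be dumped to JSON, which Kubernetes accepts as a manifest format alongside YAML. The helper names, image and resource values are illustrative assumptions, not recommendations.

```python
import json


def container_with_resources(name, image, cpu_request, memory_request,
                             cpu_limit, memory_limit):
    """Build a container spec with explicit resource requests and limits."""
    return {
        "name": name,
        "image": image,
        "resources": {
            # Requests are what the scheduler reserves for the container.
            "requests": {"cpu": cpu_request, "memory": memory_request},
            # Limits are the ceiling enforced at runtime.
            "limits": {"cpu": cpu_limit, "memory": memory_limit},
        },
    }


def pod_manifest(pod_name, container):
    """Wrap a single container in a minimal Pod manifest."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"containers": [container]},
    }


if __name__ == "__main__":
    api = container_with_resources(
        "api", "example/api:1.0", "250m", "128Mi", "500m", "256Mi"
    )
    print(json.dumps(pod_manifest("api", api), indent=2))
```
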
Labels and annotations
What it is: Metadata held by Kubernetes.
Why you need it: Makes workload management easier and allows other tools to work with standard Kubernetes definitions.
Options: Read: Labels and Selectors in Kubernetes.
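
Labels become useful once something selects on them. As a simplified sketch of equality-based selection (set-based selectors are left out), the function below checks whether an object's labels satisfy a selector. The function names and the example workloads are hypothetical.

```python
def matches(labels: dict, selector: dict) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects: list, selector: dict) -> list:
    """Return the names of objects whose metadata labels match the selector."""
    return [
        obj["metadata"]["name"]
        for obj in objects
        if matches(obj["metadata"].get("labels", {}), selector)
    ]


# Hypothetical workloads, labelled by application and environment.
workloads = [
    {"metadata": {"name": "api-prod", "labels": {"app": "api", "env": "prod"}}},
    {"metadata": {"name": "api-dev", "labels": {"app": "api", "env": "dev"}}},
    {"metadata": {"name": "web-prod", "labels": {"app": "web", "env": "prod"}}},
]
```

For example, `select(workloads, {"env": "prod"})` picks out both production workloads, while adding `"app": "api"` to the selector narrows it to one.
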
Alerts
What it is: Automated notifications on a defined trigger.
Why you need it: You need to know when your service degrades.
Options: Prometheus & Alertmanager. Read: Labels in Prometheus alerts: think twice before using them.
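
A "defined trigger" is often a threshold that must be breached for some time before anyone is notified, so that brief spikes do not page people. The sketch below models that idea in plain Python; it is not how any particular alerting tool is implemented, and the class name and state names are assumptions for illustration.

```python
from typing import Optional


class ThresholdAlert:
    """Fires only after a value has stayed above a threshold long enough."""

    def __init__(self, threshold: float, hold_seconds: float):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._breach_started: Optional[float] = None

    def evaluate(self, value: float, now: float) -> str:
        """Return "ok", "pending" or "firing" for a sample taken at `now`."""
        if value <= self.threshold:
            self._breach_started = None  # condition cleared, reset the timer
            return "ok"
        if self._breach_started is None:
            self._breach_started = now
        if now - self._breach_started >= self.hold_seconds:
            return "firing"
        return "pending"
```
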
Structured logging output
What it is: Output logs in a machine-readable format to facilitate searching and indexing.
Why you need it: Lets you trace what went wrong when something does.
Options: ELK stack (Elasticsearch, Logstash and Kibana). Many commercial offerings.
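
A minimal sketch of structured logging with Python's standard `logging` module follows, emitting one JSON object per line. The field names and the convention of passing extra fields under a `context` key are assumptions made for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: extra={"context": {...}} adds fields.
        entry.update(getattr(record, "context", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(stream=sys.stdout) -> logging.Logger:
    """Return an "app" logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `log.info("order created", extra={"context": {"order_id": 42}})` then produces a line that a log pipeline can index by `order_id` without parsing free text.
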
Tracing instrumentation
What it is: Instrumentation to send request processing details to a collection service.
Why you need it: Sometimes the only way of figuring out where latency is coming from.
Options: Zipkin, Lightstep, Appdash, Tracer, Jaeger.
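
To show the basic idea behind tracing, the sketch below times named spans that share a trace ID and appends them to an in-memory list standing in for a collection service. It is a toy illustration, not the API of any of the tools listed; `span` and `collected_spans` are hypothetical names.

```python
import time
import uuid
from contextlib import contextmanager

# Stand-in for a collection service; a real tracer would export spans.
collected_spans = []


@contextmanager
def span(name, trace_id=None):
    """Time a block of work and record it under a trace ID."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()
    try:
        yield trace_id
    finally:
        collected_spans.append({
            "trace_id": trace_id,
            "name": name,
            "duration_seconds": time.perf_counter() - start,
        })
```

Nesting spans with the same trace ID, for instance a `query_db` span inside a `handle_request` span, is what lets you see which step of a request contributed the latency.
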
Graceful shutdowns
What it is: Applications respond to SIGTERM correctly.
Why you need it: This is how Kubernetes will tell your application to end.
Options: Read: 10 tips for Building and Managing Containers.
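
One common pattern for handling SIGTERM is to set a flag in the signal handler and let the main loop finish its current unit of work before exiting. The sketch below assumes a simple loop-based worker; `run` and `process_one_item` are hypothetical names.

```python
import signal
import threading

shutting_down = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: request shutdown instead of exiting immediately."""
    shutting_down.set()


def install_handlers() -> None:
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(process_one_item, idle_seconds: float = 0.1) -> None:
    """Process items until shutdown is requested, then return cleanly."""
    while not shutting_down.is_set():
        process_one_item()  # the in-flight item always completes
        shutting_down.wait(idle_seconds)
    # Flush buffers and close connections here before the process exits.
```

The shutdown work needs to complete within the pod's termination grace period, after which Kubernetes kills the process.
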
Graceful dependency (with readiness check)
What it is: Applications don’t assume dependencies are available. Wait for other services before reporting ready.
Why you need it: Avoid headaches that come with a service order requirement.
Options: Read: 10 tips for Building and Managing Containers.
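
A readiness check can fold in dependency checks so the pod only reports ready once the services it needs are reachable. The sketch below uses a plain TCP connection attempt as the check, which is an assumption; a real check might issue a query or call a health endpoint instead. Host names and function names are hypothetical.

```python
import socket


def dependency_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to the dependency can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def ready_with_dependencies(dependencies, check=dependency_reachable) -> int:
    """Return 200 only when every (host, port) dependency passes the check."""
    return 200 if all(check(host, port) for host, port in dependencies) else 503
```

With this in place the application can start in any order relative to its dependencies: it simply reports 503 until, say, `("db", 5432)` becomes reachable, and no traffic is routed to it in the meantime.
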
ConfigMaps
What it is: Define a configuration file for your application in Kubernetes using ConfigMaps.
Why you need it: Easy to reconfigure an app without rebuilding, and allows config to be versioned.
Options: Read: Best Practices for Designing and Building Containers for Kubernetes.
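
On the application side, a ConfigMap mounted as a volume simply appears as a file. The sketch below loads such a file over built-in defaults; the JSON format, the `APP_CONFIG_PATH` variable, the default path and the setting names are all assumptions for this example.

```python
import json
import os

# Hypothetical settings with defaults baked into the image.
DEFAULTS = {"log_level": "info", "worker_count": 2}


def load_config(path=None) -> dict:
    """Load settings from a mounted config file, layered over the defaults."""
    path = path or os.environ.get("APP_CONFIG_PATH", "/etc/app/config.json")
    config = dict(DEFAULTS)
    with open(path, encoding="utf-8") as f:
        config.update(json.load(f))
    return config
```

Changing a setting then means updating the ConfigMap rather than rebuilding the image.
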
Label the Docker images with the code commit SHA
Why you need it: Makes tracing image to code trivial.
Options: Read: Introduction to Kubernetes Security.
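
One way to do this in a build script is to use the commit SHA both as the image tag and as an image label. The sketch below only assembles the `docker build` command line; reading "label" as covering both the tag and the `org.opencontainers.image.revision` label is an assumption, and the repository name is hypothetical.

```python
import re


def build_command(repository: str, commit_sha: str, context: str = ".") -> list:
    """Assemble a docker build command that stamps the image with a commit."""
    # Accept abbreviated or full hex SHAs only, never tags such as "latest".
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError(f"not a commit SHA: {commit_sha!r}")
    return [
        "docker", "build",
        "-t", f"{repository}:{commit_sha}",
        "--label", f"org.opencontainers.image.revision={commit_sha}",
        context,
    ]
```

The SHA itself would typically come from `git rev-parse HEAD` in the build pipeline.
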
Locked down runtime context
What it is: Use deliberately secure configuration for application runtime context.
Why you need it: Reduces attack surface, makes privileges explicit.
Options: Read: Continuous Security for GitOps.
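
As one illustration of what a locked down runtime context can look like, the sketch below builds a container-level `securityContext` as a Python dictionary. The particular combination of settings and the user ID are assumptions for this example, not a complete hardening guide.

```python
import json


def locked_down_security_context(user_id: int = 10001) -> dict:
    """Build a restrictive container securityContext (illustrative values)."""
    return {
        "runAsNonRoot": True,               # refuse to start as root
        "runAsUser": user_id,               # hypothetical unprivileged UID
        "readOnlyRootFilesystem": True,     # writes only to mounted volumes
        "allowPrivilegeEscalation": False,  # block setuid-style escalation
        "capabilities": {"drop": ["ALL"]},  # start with no Linux capabilities
    }


if __name__ == "__main__":
    print(json.dumps({"securityContext": locked_down_security_context()}, indent=2))
```

Anything the application genuinely needs, such as a specific capability, can then be added back explicitly, which keeps its privileges visible in the manifest.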