Running Consul on Kubernetes and Monitoring it With Prometheus

By Tom Wilkie
March 01, 2017


We run a couple of replicated Consul services for Weave Cloud: one acts as a coordinator for the distributed WebSocket router backing Scope’s terminals feature, and the other stores the state of Cortex’s consistent hash ring. This blog post explores how we deploy and monitor them.

What should you monitor?

We run Consul as a Kubernetes Deployment; as such, Pod identities can change when Pods restart, and sometimes we have more than 3 replicas, sometimes fewer. What we really care about is whether the Consul cluster thinks it’s healthy – which is normally equivalent to whether it has elected a master.
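
For reference, here’s a minimal sketch of what such a Deployment can look like – the names, image version, and flags are illustrative, not our exact manifest:

<code>apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: consul
spec:
  replicas: 3
  template:
    metadata:
      labels:
        name: consul
    spec:
      containers:
      - name: consul
        image: consul:0.7.5        # illustrative version
        args: ["agent", "-server", "-bootstrap-expect=3"]
        ports:
        - containerPort: 8300      # server RPC
        - containerPort: 8301      # Serf LAN gossip
        - containerPort: 8500      # HTTP API
</code>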

Luckily, Consul added an API for the status of the Raft quorum in v0.7 last year, and it was relatively straightforward to add that to the Prometheus Consul exporter. We run this as a sidecar container in all our Consul Pods. Writing an alerting rule for this was equally easy:

<code>ALERT ConsulNoMaster
  IF          consul_raft_leader != 1
  FOR         1m
  LABELS      { severity="critical" }
  ANNOTATIONS {
    summary = "Consul has no master.",
    description = "Consul has no master.",
  }
</code>

One important (and often overlooked) aspect of Prometheus’ alerting format is that this single alert definition will notify us when either of our Consul clusters is without a leader: the expression matches every time series named consul_raft_leader, and an alert fires per distinct label set. And if we add more Consul clusters in the future, we don’t need to extend our alerting rules.
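
For instance, in a healthy state the expression’s input might look something like this (the cortex/consul job name here is illustrative):

<code>consul_raft_leader{job="scope/consul"}   1
consul_raft_leader{job="cortex/consul"}  1
</code>

Each series independently dropping to 0 raises its own alert, labelled with the cluster it came from.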

Multiple exporters per Pod

Those of us who are still awake at this point will notice the Prometheus Consul exporter doesn’t give you metrics for things like QPS and error rates, and these are crucial parts of our monitoring philosophy at Weaveworks – the much vaunted RED method. Consul exports these kinds of metrics in statsd format, so we also run the statsd exporter as a sidecar in all our Consul Pods – that’s right, two exporters per Pod. This wasn’t even supported by Prometheus until Fabian implemented the second generation Kubernetes service discovery (design doc), and was one of the motivating factors for us moving to scraping individual Pods, as opposed to service endpoints – the other motivation being scraping of “unready” Pods.
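
To give a flavour of the wiring – again a sketch with illustrative image versions, relying on each exporter’s default flags – the Pod spec lists three containers:

<code>containers:
- name: consul
  image: consul:0.7.5
- name: consul-exporter
  image: prom/consul-exporter:v0.3.0   # talks to localhost:8500 by default
- name: statsd-exporter
  image: prom/statsd-exporter:v0.4.0   # listens for statsd on UDP :9125 by default
</code>

And Consul is told to ship its telemetry to the statsd exporter over localhost:

<code>"telemetry": {
  "statsd_address": "127.0.0.1:9125"
}
</code>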

The following PromQL query gives us some insight into Consul QPS, but we’ve not quite figured out latency yet:

<code>sum(irate(consul_consul_rpc_query_counter{job="scope/consul"}[1m]))
</code>
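
If you want to attempt latency, the statsd exporter turns statsd timers into Prometheus summaries with a quantile label, so a query along these lines should work – though the metric name below is our guess at one of Consul’s timers, not something we’ve verified:

<code>consul_raft_commitTime_timer{job="scope/consul", quantile="0.99"}
</code>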

Getting this all to work with Atlas

A final topic of interest for running Consul on Kubernetes is how nodes in the cluster can find each other and bootstrap their initial quorum. We use a central service from Hashicorp, called Atlas, to do this for us. Unfortunately this service tends to remember old nodes that have gone away, and if this happens often enough – say three times – a bootstrapping Consul cluster will fail to reach quorum and elect a master: with three stale peers and three live ones, four votes are needed for a majority, and only three can ever be cast. To detect this situation ahead of time we introduced the following alert:

<code>ALERT ConsulStalePeers
  IF          consul_raft_peers != 3
  FOR         1m
  LABELS      { severity="critical" }
  ANNOTATIONS {
    summary = "Consul has stale peer info.",
    description = "Consul has stale peer info.",
  }
</code>

An old failure mode of Cortex required you to delete the Consul cluster relatively frequently, which would pretty deterministically trigger this alert. To work around it, we added the following config, telling Consul to remove itself from the Raft peers when the process exits:

<code>"leave_on_terminate": true
</code>

The alert hasn’t fired since.

Alas, this discovery mechanism is going away, so we’ll need to find a new mechanism for bootstrap.
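
One option we’re aware of – an assumption on our part, not a settled plan – is Consul’s -retry-join flag pointed at the DNS name of a Kubernetes headless service, which resolves directly to the individual Pod IPs:

<code># "consul" here is a hypothetical headless Service in the default namespace
consul agent -server -bootstrap-expect=3 \
  -retry-join=consul.default.svc.cluster.local
</code>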


Thank you for reading our blog. We build Weave Cloud, which is a hosted add-on to your clusters. It helps you iterate faster on microservices with continuous delivery, visualization & debugging, and Prometheus monitoring to improve observability.

Try it out, join our online user group for free talks & trainings, and come and hang out with us on Slack.

