Throwback Thursday: Configure notifications in Prometheus’ Alertmanager

January 10, 2019

Monitoring is crucial for any developer and on-call engineer. But sorting through a flood of notifications, or missing the critical information needed for a quick resolution, is no fun. This blog post recaps a KubeCon lightning talk that demonstrates how to use labels in Prometheus’ Alertmanager to receive only concise, easy-to-understand notifications.

It almost feels like decades have passed, but KubeCon and CloudNativeCon North America took place just four short weeks ago. So it’s time for a proper Throwback Thursday and a look at one of the talk highlights from the show.

Elena’s lightning talk on Monday evening drew a huge crowd and a lot of social media love. In only four short minutes, Elena demonstrated how to get concise, easy-to-understand alerts and notifications from Prometheus’ Alertmanager.

Let’s have a look behind the scenes at why we started spending time on Prometheus’ Alertmanager.

In our SaaS product Weave Cloud we run a horizontally scalable, hosted monitoring service, based on Prometheus, to monitor Kubernetes clusters and applications. Weave Cloud aggregates metrics across a cluster and from all layers of the stack in a dynamic environment, and lets users query them through an enhanced, easy-to-use interface. It allows developers and operators to understand the health and behavior of their applications at all times, but especially before and after deployments.

Within Weave Cloud’s interface you can also set up and configure rules against metric thresholds that, when met, route alerts to your preferred notification systems such as PagerDuty, OpsGenie, Stackdriver or Slack.
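Under the hood, this kind of routing is what Prometheus’ Alertmanager does with its route tree. As a rough illustration, and not Weave Cloud’s actual configuration, a minimal alertmanager.yml that pages PagerDuty for critical alerts and sends everything else to Slack could look like this (the receiver names, channel and keys are placeholders):

    route:
      # Default receiver for alerts that match no sub-route.
      receiver: slack-notifications
      group_by: ['alertname', 'cluster']
      routes:
        # Alerts carrying severity=critical page the on-call engineer.
        - match:
            severity: critical
          receiver: pagerduty-oncall

    receivers:
      - name: slack-notifications
        slack_configs:
          - api_url: 'https://hooks.slack.com/services/...'  # placeholder webhook URL
            channel: '#alerts'
      - name: pagerduty-oncall
        pagerduty_configs:
          - service_key: '<your-pagerduty-service-key>'  # placeholder integration key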

Since we here at Weaveworks need to monitor our own service, we spend time setting up and using alerting rules to define alert conditions. For our on-call support engineers it is critically important that a notification is easy to understand and can be acted on immediately. An alert that just says “Instance down” does not aid in finding a quick resolution to the problem...

[Screenshot: a bare “Instance down” alert notification]

That is why we suggest using labels in alerting rules. The additional information attached to each alert helps pinpoint and identify the problem.
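As a flavor of what such a rule can look like, here is a minimal sketch based on the classic “instance down” example from the Prometheus documentation, not one of our production rules. It attaches a severity label for routing and human-readable annotations built from the alert’s own labels:

    groups:
      - name: instance-availability
        rules:
          - alert: InstanceDown
            # Fires when a scrape target has been unreachable for 5 minutes.
            expr: up == 0
            for: 5m
            labels:
              severity: critical   # picked up by Alertmanager routing
            annotations:
              summary: "Instance {{ $labels.instance }} is down"
              description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 5 minutes."

Because the labels and annotations travel with the alert all the way through Alertmanager, the resulting notification can say exactly what broke, where, and how urgent it is.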

Watch Elena’s talk (slides are here) to see how she created and implemented an improved notification template. If you would like to follow her hands-on tutorial, have a look at the blog post “Labels in Prometheus alerts: think twice before using them”.
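To give a taste of the mechanism her talk covers: Alertmanager notification templates are Go templates that render an alert’s labels and annotations into the message body. A minimal sketch of a custom Slack template (our own illustration, not Elena’s exact template) might look like this:

    {{ define "slack.custom.text" }}
    {{ range .Alerts }}
    *{{ .Labels.alertname }}* on {{ .Labels.instance }}: {{ .Annotations.description }}
    {{ end }}
    {{ end }}

Such a template lives in a file referenced by the templates section of alertmanager.yml, and is wired into a receiver with, for example, text: '{{ template "slack.custom.text" . }}' in its slack_configs.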



Related posts

Labels in Prometheus alerts: think twice before using them

Aggregating Pod resource (CPU, memory) usage by arbitrary labels with Prometheus

Monitoring Your Kubernetes Infrastructure with Prometheus

Download our latest whitepaper, "Monitoring Cloud Native Applications"