October 06, 2020
Living on the Edge - How Screenly Monitors Edge IoT Devices with Prometheus
In this guest post, Viktor Petersson discusses how Screenly uses Prometheus to monitor its infrastructure. Over the years, the team found Prometheus to be extremely versatile and expanded its use to include business intelligence metrics.
September 12, 2019
How I Halved the Storage of Cortex
Find out how Bryan Boreham managed to cut the storage of Cortex’s time-series data in half by re-architecting how the data gets split into chunks.
March 21, 2019
Going Cloud Native: 6 essential things you need to know
Are you just starting on your digital transformation journey and still wondering what cloud native is, and why you need it? This new article discusses the key takeaways to know about the term cloud native.
February 21, 2019
How Does Prometheus Scale?
Cortex is an open source multi-tenant, horizontally scalable version of Prometheus. Bryan Boreham discusses how and why we built Cortex.
February 06, 2019
How Aspen Mesh Runs Cortex in Production
Neeraj Poddar, Lead Platform Architect at Aspen Mesh, shares insights and tips on why and how they implemented Cortex in production.
January 10, 2019
Throwback Thursday: Configure notifications in Prometheus’ Alertmanager
Monitoring is crucial for any developer and on-call engineer. But sorting through multiple notifications, or missing the critical information needed for a quick resolution, is no fun. This blog post recaps a KubeCon lightning talk that demonstrates using labels in Prometheus Alertmanager to receive only concise and easy-to-understand notifications.
October 04, 2018
Five Key Cloud Technologies for Kubernetes
Five key open source projects that help complete the Kubernetes feature set are discussed in this post: Prometheus, Istio, Helm, Weave Flux, and OpenFaaS.
September 19, 2018
Weaveworks Cortex - the newest member of the CNCF Sandbox
Cortex, our scalable Prometheus monitoring system, has been accepted as a Sandbox project by the Cloud Native Computing Foundation. Cortex is an open source project that we created to provide storage of Prometheus metrics for Weave Cloud. It's used by teams that are running large Prometheus environments, providing metrics for complex Kubernetes environments.
August 28, 2018
Labels in Prometheus alerts: think twice before using them
As developers, we hear a lot about the importance of monitoring and alerts. But without proper notifications, we might spend too much time trying to understand what is really going on. This blog post gives you an overview of common caveats of using labels in Prometheus alerts and demonstrates some techniques for getting concise and easy-to-understand notifications.
June 27, 2018
GitOps - What You Need To Know
Learn the principles and patterns of GitOps workflows and how to implement them to run Kubernetes in production and at scale. We added new content to our Kubernetes library, and summarized the key concepts of GitOps all in one place.
June 20, 2018
GitOps, Weave Cloud and EKS demonstrated at EKoSystem Day
Craig Wright demonstrated GitOps workflows and Weave Cloud on EKS at the EKoSystem Day event held at the AWS Loft in downtown San Francisco. Weaveworks was one of 10 technical partners invited to speak at this special event, which was broadcast live on Twitch.
March 01, 2018
Ensure High Availability and Uptime With Kubernetes HPA (Horizontal Pod Autoscaler) and Prometheus
Not all systems can meet their SLAs by relying on CPU/memory usage metrics alone; most web and mobile backends require autoscaling based on requests per second to handle traffic bursts. This step-by-step guide shows you how to set up the Kubernetes Horizontal Pod Autoscaler with custom metrics defined in Prometheus, to fine-tune your application monitoring and ensure high availability.
February 08, 2018
Monitoring Cloud-Native Applications
Understand the importance of monitoring your microservices and infrastructure, and how to turn those metrics into meaningful data when looking to improve performance or mitigate emerging problems. Discover the different methodologies, metrics and approaches for monitoring microservices effectively, along with the recommended tools to help you.
February 02, 2018
Architecture Overview: Cluster Monitoring at Scale on AWS
Watch this short architecture overview video to learn how Weaveworks monitors clusters at scale using a highly available, multi-tenant system built on AWS services.
October 30, 2017
A Practical Guide: From Instrumenting Code to Specifying Alerts with the RED Method
This practical guide will help you get started with monitoring your microservices with Prometheus. We walk through selecting key metrics, instrumentation, and setting up alerts and Grafana dashboards.
October 17, 2017
GitOps Part 3 - Observability
Observability can be seen as part of the Continuous Delivery cycle for Kubernetes. Observed state must be compared with the desired state in Git. The role of a GitOps dashboard is to enable observation and speed up understanding and validation of the system, and suggest mitigating actions. Monitoring alone does not answer all questions: metrics are symptoms but not the disease.
October 15, 2017
Swarmprom - Prometheus Monitoring for Docker Swarm
In this post, we discuss how to set up application and infrastructure monitoring for Docker Swarm with the help of Prometheus. Swarmprom is a starter kit for Docker Swarm monitoring with Prometheus, Grafana, cAdvisor, Node Exporter, Alertmanager, and Unsee.
August 31, 2017
Monitoring, alert rules and loads of beers errr dashboards - PromCon 2017 in Munich
The Weaveworks team visited PromCon 2017 in Munich. One of our personal highlights was a deep dive into the past 12 months of Cortex, the basis of our monitoring and analytics capabilities in Weave Cloud.
August 14, 2017
Observability beyond logging for Java Microservices
Monitoring distributed applications is best approached using a combination of tools. Luke Marsden describes how Prometheus, OpenTracing and Weave Cloud visualization cover the bases to establish the root cause of problems in distributed applications.
August 10, 2017
If something goes wrong in production, you want to immediately know the user impact. With that in mind, we created an automated alerting schema based on user-visible symptoms.