Why?

No matter how well engineered your software is, or how ready it is to scale both vertically and horizontally, once your application is out in the wild you will need the tools and the means to keep a close eye on it. You need to understand how it behaves under real load, find bottlenecks and N+1 query problems, and figure out the points of failure when your system goes down.

The problem is that your system has so many moving parts that it is difficult to decide where to start or, worse, what exactly to measure.

This problem becomes easier to tackle once you understand The RED Method.

Application Metrics

The TL;DR of this method is: for each of your services, gather the following information:

  • Rate: number of requests per second a.k.a. throughput
  • Errors: number of errors per second a.k.a. number of requests per second that returned a 5xx error
  • Duration: the amount of time that it takes for each request to be served

The RED Method encourages you to keep a standard overview of your systems that not only lets you analyze how the application responds to user interaction, but also helps you debug any issues that come up in the future. This approach will also help you analyze how your application’s behaviour has changed over time.
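As a rough illustration of the three RED numbers, here is a minimal Python sketch that computes them from a batch of request records collected over one observation window. The Request type, its field names and the choice of a 95th percentile for Duration are assumptions made for this example; they are not part of the RED Method itself or of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One served request. The field names are hypothetical."""
    timestamp: float  # seconds since the epoch
    status: int       # HTTP status code
    duration: float   # seconds spent serving the request


def red_summary(requests, window_seconds):
    """Compute Rate, Errors and Duration for one observation window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(requests)
    # Errors are counted as 5xx responses, as in the list above.
    errors = sum(1 for r in requests if 500 <= r.status <= 599)
    durations = sorted(r.duration for r in requests)
    if durations:
        # Nearest-rank 95th percentile, using integer arithmetic.
        rank = (95 * total + 99) // 100
        p95 = durations[rank - 1]
    else:
        p95 = None  # no traffic in this window
    return {
        "rate": total / window_seconds,      # requests per second
        "errors": errors / window_seconds,   # failed requests per second
        "duration_p95": p95,                 # seconds
    }
```

A percentile is used here instead of an average because a handful of very slow requests can hide behind a healthy-looking mean; any other summary of the duration distribution would fit the method just as well.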

But even though these metrics are meant to be a good starting point, they are just that: a starting point. They will bring some insight into your stack but they will not tell you about the infrastructure where you run your software.

Cluster Metrics

A good way to complement these metrics is by also gathering the following metrics from your cluster:

  • Total amount of CPU available & being used
  • Total amount of RAM available & being used
  • Network traffic/throughput
  • Storage (disk I/O, disk usage, etc.)

In this way you can correlate, for example, the amount of RAM that your application is consuming against the current number of visitors. This can also help you detect memory leaks, determine whether your disks are filling up, or see whether your infrastructure is under-utilized.
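One simple way to put a number on that kind of correlation is the Pearson coefficient. The sketch below is only an illustration: the function name and the sample figures are made up, and a real setup would read both series from the metrics store.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical readings taken at the same instants.
visitors = [100, 200, 300, 400]
ram_mb = [512, 640, 768, 896]
ram_follows_traffic = pearson(visitors, ram_mb)
```

A coefficient close to 1 suggests that memory use follows traffic. Memory that keeps climbing while traffic stays roughly flat gives a much weaker correlation, which is one hint that a leak may be involved.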

Database Metrics

For SQL and NoSQL databases you will want to measure:

  • Current number of connections
  • Query duration
  • Number of queries per second

These metrics help you detect slow-running queries and possibly relate them to slow-rendering pages on your site.
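To make the query count and query duration concrete, here is a small sketch that wraps query execution with a timer. It uses Python's built-in sqlite3 module only so that the example is self-contained; the QueryStats class and the socks table are hypothetical, and the number of open connections would normally be read from the database server itself.

```python
import sqlite3
import time


class QueryStats:
    """Counts queries and records how long each one took."""

    def __init__(self):
        self.count = 0
        self.durations = []  # seconds, one entry per query

    def timed_execute(self, connection, sql, params=()):
        """Run a query and record its duration, even if it fails."""
        start = time.perf_counter()
        try:
            return connection.execute(sql, params).fetchall()
        finally:
            self.count += 1
            self.durations.append(time.perf_counter() - start)

    def slowest(self):
        """Duration of the slowest query seen so far, in seconds."""
        return max(self.durations, default=0.0)


stats = QueryStats()
conn = sqlite3.connect(":memory:")
stats.timed_execute(conn, "CREATE TABLE socks (id INTEGER PRIMARY KEY, colour TEXT)")
stats.timed_execute(conn, "INSERT INTO socks (colour) VALUES (?)", ("red",))
rows = stats.timed_execute(conn, "SELECT colour FROM socks")
```

Dividing the count by the length of the observation window gives queries per second, and the recorded durations can be summarised in the same way as request durations.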

Extra Metrics

Some extra things that you want to consider adding to your metrics are:

  • Third-party API rate, errors and duration. This can help you detect whether the bottleneck in an endpoint at a certain point in time is on your side or on the external API’s side
  • Job queues. If you have a pool of background workers (for processing video, audio or images, sending bulk email and tasks of that sort), measure the number of jobs that you currently have in the queue. Bonus points if you can also measure the number of jobs processed since the last reading
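The two job queue readings can be sketched as a gauge (the current depth) plus a counter whose difference between readings gives the jobs processed since the last one. The QueueGauge class below is a hypothetical in-memory stand-in, not the API of any real queueing system.

```python
class QueueGauge:
    """Tracks queue depth and the jobs completed between readings."""

    def __init__(self):
        self.depth = 0              # jobs currently waiting
        self.processed_total = 0    # jobs completed since start-up
        self._last_read_total = 0   # value of processed_total at the last reading

    def enqueue(self, n=1):
        self.depth += n

    def complete(self, n=1):
        if n > self.depth:
            raise ValueError("cannot complete more jobs than are queued")
        self.depth -= n
        self.processed_total += n

    def read(self):
        """Return the current depth and the jobs processed since the previous read."""
        processed = self.processed_total - self._last_read_total
        self._last_read_total = self.processed_total
        return {"depth": self.depth, "processed_since_last_reading": processed}
```

A depth that keeps growing while the processed figure stays flat is usually the signal to look at the workers.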

How to monitor your infrastructure and applications?

There are several ways to get the metrics out of your system and many more data stores where you can keep them, so those are two separate things that we need to figure out.

Getting the metrics out of your system

There are a couple of ways to get the metrics out of your system:

The Push Method

  • You can push the metrics directly from your application into your data store
  • You can have an agent sitting right next to your application pulling the metrics out of it and then pushing them into the data store

The Pull Method

  • You can have an agent that remotely pulls the data out of your application and pushes the metrics into the data store

Special note on short-lived services: some services in your infrastructure are ephemeral. For these kinds of applications the pull method might not be suitable, because the service may exit before anything pulls from it, and you will need to push the metrics from the service itself.
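The difference between the two methods can be sketched in a few lines of Python. Everything here is hypothetical: InMemoryStore stands in for a real data store, App for a long-running service, and run_batch_job for a short-lived one.

```python
import time


class InMemoryStore:
    """Stand-in for a real metrics data store (hypothetical)."""

    def __init__(self):
        self.samples = []

    def write(self, name, value, timestamp):
        self.samples.append((name, value, timestamp))


class App:
    """A long-running service that exposes its current metrics."""

    def __init__(self):
        self.requests_served = 0

    def handle_request(self):
        self.requests_served += 1

    def metrics(self):
        return {"requests_served": self.requests_served}


def pull(app, store, now=None):
    """Pull method: an external agent asks the app for its metrics."""
    timestamp = time.time() if now is None else now
    for name, value in app.metrics().items():
        store.write(name, value, timestamp)


def run_batch_job(items, store, now=None):
    """Push method: a short-lived job reports its own metrics before exiting."""
    processed = 0
    try:
        for _ in items:
            processed += 1  # real work would happen here
    finally:
        # Push even if the job fails part-way, since nothing will pull later.
        timestamp = time.time() if now is None else now
        store.write("batch_items_processed", processed, timestamp)
    return processed
```

The pull function only works while the application is still alive to answer, which is why the batch job writes its own result on the way out.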

Storing Your Metrics

For us to make a decision on what kind of storage to use for our metrics, it’s important to establish the criteria and nature of our data:

  1. The data that we’re going to be collecting is immutable. Take as an example the duration of each request on the endpoints of your API: you will most likely never update any of these records
  2. Since we are going to be collecting this data at a constant interval of seconds, we can safely assume that most of the dataset will already be sorted (by time!)
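These two properties are what make an append-only, time-ordered layout attractive. The following sketch is a toy in-memory series, not how any particular time series database is implemented; it shows how sorted, immutable samples allow range queries with a binary search instead of a full scan.

```python
from bisect import bisect_left, bisect_right


class TimeSeries:
    """Append-only series of (timestamp, value) samples kept in time order."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp, value):
        # Samples are never updated, only appended, and must arrive in order.
        if self._timestamps and timestamp < self._timestamps[-1]:
            raise ValueError("samples must arrive in time order")
        self._timestamps.append(timestamp)
        self._values.append(value)

    def range(self, start, end):
        """Return the samples with start <= timestamp <= end."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))
```

For example, with samples every 15 seconds, asking for the window between seconds 10 and 30 only touches the two samples inside it.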

Kubernetes built-in

Kubernetes comes with a built-in endpoint for metrics which is supported by Prometheus.

When you deploy the Weave Cloud probes on your cluster, these will automatically gather all the metrics from Kubernetes and later on you will be able to query and graph them using Weave Monitor.

Sample Application

Kubernetes will be our base infrastructure and as we mentioned earlier, this is a good starting point for collecting metrics.

However, once you reach that point, we want to give you a reference for monitoring your actual application. For this purpose we are going to use The Sock Shop, the microservices reference application.

Getting Started

What You Will Use

Requirements

Sign Up for Weave Cloud

Before you can use Weave Monitor to monitor apps, sign up for a Weave Cloud account.

1. Go to Weave Cloud

2. Sign up using either a GitHub or Google account, or use an email address.

3. Make a note of the cloud service token from the User settings screen.

Sign up for a Google Cloud Account

Creating a Google Cloud account is pretty straightforward. Once your account is created, make sure that you enable billing.

Create a Kubernetes cluster

In this tutorial we are going to use the hosted version of Kubernetes that Google Cloud offers, since it is the easiest way to get your own cluster up and running.

  1. Log in to your Google Cloud account and find the Container Engine section.

    Click on the Create Container button and follow the instructions.

  2. When asked about the details of the cluster:

    • Pick a good name for your cluster
    • Select a Zone that is close to your physical location
    • We recommend a cluster with at least 7.5 GB of memory and at least 2 vCPUs for deploying The Sock Shop and the rest of the Weave probes.
    • Since this cluster is only for testing purposes, we recommend disabling
      • Automatic upgrades
      • Automatic repair

    When you are done filling in the form, click on the Create button.

  3. Wait for your cluster to become available. This operation might take between 5 and 10 minutes.

  4. Once your cluster is ready, click on its Connect button. This will bring up a dialogue with the Google Cloud CLI configuration command.

  5. You now need to authenticate with Google Cloud from the terminal. Do so with the following command:

    gcloud auth login
    

    The output should be something like this:

    Go to the following link in your browser:
    
        https://accounts.google.com/o/oauth2/auth?redirect_uri=...
    
    
        Enter verification code:
    

    In your browser, open the link that was provided by the previous command and follow the instructions.

    Then we can get the credentials required by kubectl in order to authenticate with your Kubernetes cluster:

    gcloud container clusters get-credentials <cluster-name> \
                 --zone <zone> --project <project-id>
    

    Make sure that you replace the cluster-name, zone and project-id placeholders with the values that Google Cloud gives you.

    The output should be as follows:

    Fetching cluster endpoint and auth data.
    kubeconfig entry generated for cluster-1.
    

    Verify that you can actually talk to the cluster:

    kubectl cluster-info
    

    The output should yield something like this:

    Kubernetes master is running at https://1.2.3.4
    GLBCDefaultBackend is running at https://1.2.3.4/api/v1/proxy/namespaces/kube-system/services/default-http-backend
    Heapster is running at https://1.2.3.4/api/v1/proxy/namespaces/kube-system/services/heapster
    KubeDNS is running at https://1.2.3.4/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at https://1.2.3.4/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    

Deploy the Weave probes to Kubernetes

To deploy the Weave probes onto your GKE cluster, follow the instructions at https://www.weave.works/docs/tutorials/kubernetes/cloud-on-gke/

Weave Monitor

At this point, the Weave Monitor probe has been deployed to your Kubernetes cluster. Monitor will automatically collect several metrics from your cluster, including:

  • Kubernetes built-in metrics
  • Any metrics generated by your services as long as your services expose metrics on the standard /metrics endpoint
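For a service to be picked up this way, it needs to answer on /metrics with text in the Prometheus exposition format. The sketch below shows roughly what that looks like using only the Python standard library. The metric name http_requests_total, the REQUEST_COUNTS dictionary and the port are assumptions for the example; in practice a Prometheus client library would normally generate this output for you.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters, keyed by (method, status code).
REQUEST_COUNTS = {("GET", "200"): 0, ("GET", "500"): 0}


def render_metrics(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests served.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current counters on /metrics and nothing else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrape requests on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() from the service's start-up code would expose the endpoint; it is left uncalled here because it blocks.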

To visualise these metrics, go to your Weave Cloud account. Under the Monitor tab you will be presented with a list of preconfigured System Notebooks.

You can check the Node Resources notebook, the Kubernetes notebook and the Weave Net notebook.

Note that the Weave Net notebook shows no metrics, since GKE clusters use their own container networking plugin.

Deploy ‘The Sock Shop’ to Kubernetes

Let’s now deploy the microservices reference application, The Sock Shop, along with a load test service to generate some metrics that later on we will be able to visualise in Monitor.

To install The Sock Shop, run the following from within the container that you spun up earlier:

git clone https://github.com/microservices-demo/microservices-demo microservices-demo
cd microservices-demo
kubectl create namespace sock-shop
kubectl apply -f deploy/kubernetes/manifests

It may take a few minutes before the application is completely ready and generating metrics that are collected by Monitor. You can then go to Weave Cloud, find the Monitor tab and create a new notebook for monitoring the Sock Shop. See the following image for a sample query that will show the request rate for each of the services in the Shop, both for HTTP 200 status codes and for HTTP 500 errors.

What next?

Up to here you have a source of truth to gauge how your system is behaving in production, but this is only part of your job. You probably also need to attend meetings, write some code, do some server maintenance, eat, sleep, etc. For the times when you are not watching, you will need an alerting system that can look into the data from your metrics, analyse it and react to certain criteria. For example, if in the past 5 minutes your front end application has been returning more HTTP 500 status codes than HTTP 200, then you probably want this alerting system to notify you about it.
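The example rule above can be written down as a small function. This is only a sketch of the logic, with a made-up function name and input shape; a real alerting system would evaluate the rule against the stored metrics instead of an in-memory list.

```python
def should_alert(samples, now, window_seconds=300):
    """Return True if the window saw more HTTP 500 than HTTP 200 responses.

    samples is an iterable of (timestamp, status_code) pairs; now and the
    timestamps are in seconds. The default window is 5 minutes.
    """
    recent = [
        status
        for timestamp, status in samples
        if now - window_seconds <= timestamp <= now
    ]
    errors = sum(1 for status in recent if status == 500)
    successes = sum(1 for status in recent if status == 200)
    return errors > successes
```

Responses older than the window are ignored, so an outage from an hour ago does not keep the alert firing.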

For this we have written the Weave RED Alerts tutorial.

Join the Weave Community

If you have any questions or comments you can reach out to us on our Slack channel. To invite yourself to the Community Slack channel, visit Weave Community Slack invite, or contact us through one of the other channels listed at Help and Support Services.