Kubernetes Observability: Log Aggregation Using ELK Stack

By Weaveworks
January 08, 2020

Logging, when considered from the earliest design stages, helps diagnose bugs, gain insight into system behavior, and spot potential issues before they occur.

Logging In The Cloud-Native Age

In the old days, all components of your infrastructure were well-defined and well-documented. For example, a typical web application could be hosted on a web server and a database server. Each component saved its own logs in a well-known location: /var/log/apache2/access.log, /var/log/apache2/error.log and mysql.log.

Back then, it was easy to identify which logs belonged to which servers. Even a more complex environment might consist of, say, four web servers and two database engines running as a cluster.

Let’s fast forward to the present day, where cloud providers, microservices architectures, containers, and ephemeral environments are part of our everyday life. In an infrastructure hosted on a container orchestration system like Kubernetes, how can you collect logs? The highly complex environment we mentioned earlier could have dozens of pods for the frontend, several for the middleware, and a number of StatefulSets. We need a central location where logs are saved, analyzed, and correlated. Since we’ll have different types of logs from different sources, we need this system to store them in a unified format that makes them easily searchable.

Logging Patterns In Kubernetes

Now that we have discussed how logging should be done in cloud-native environments, let’s have a look at the different patterns Kubernetes uses to generate logs.

The Quick Way To Obtain Logs

By default, any text that a pod writes to standard output (STDOUT) or standard error (STDERR) can be viewed with the kubectl logs command. Consider the following pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c,
      'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']

This pod uses the busybox image to print the current date and time every second indefinitely. Let’s apply this definition using kubectl apply -f pod.yml. Once the pod is running, we can grab its logs as follows:

$ kubectl logs counter
0: Sat Nov  2 08:46:40 UTC 2019
1: Sat Nov  2 08:46:41 UTC 2019
2: Sat Nov  2 08:46:42 UTC 2019
3: Sat Nov  2 08:46:43 UTC 2019
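
A few additional kubectl logs flags are commonly useful; for example:

# Stream new log lines as they are written
kubectl logs -f counter

# Show logs from the previous instance of a container that crashed and restarted
kubectl logs --previous counter

# Show only the last 20 lines
kubectl logs --tail=20 counter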

Aggregating Logs

The kubectl logs command is useful when you want to quickly see why a pod has failed, why it is behaving differently, or whether it is doing what it is supposed to do. However, when you have several nodes with dozens or even hundreds of pods running on them, there should be a more efficient way to handle logs. There are a few log-aggregation systems available, including the ELK stack, that can store large amounts of log data in a standardized format. A log-aggregation system uses a push mechanism to collect the data. This means that an agent must be installed on the source entities to collect the log data and send it to the central server. For the ELK stack, several agents can do this job, including Filebeat, Logstash, and Fluentd. If you provision Kubernetes through a cloud provider like GCP, a Fluentd agent is already deployed as part of the installation; on GCP it is configured to send logs to Stackdriver. However, you can easily change the configuration to send the logs to a different target.

There are three common ways to implement this pattern, two of which rely on Kubernetes features:

Using a DaemonSet: a DaemonSet ensures that a specific pod is always running on every cluster node. This pod runs the agent image (for example, Fluentd) and is responsible for sending the logs from the node to the central server. By default, Kubernetes redirects all the container logs to a unified location on the node, and the DaemonSet pod collects logs from this location.

[Figure: node-level logging agent deployed as a DaemonSet]
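
For example, on a Docker-based node (paths can differ with other container runtimes), you can see this location for yourself:

# Run these on the node itself, not inside a pod
ls /var/log/containers/          # one log file per container, symlinked to the Docker log files
ls /var/lib/docker/containers/   # the directories holding the actual JSON log files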

Using a sidecar: a sidecar is a container that runs in the same pod as the application container. Because of the way pods work, the sidecar container has access to the same volumes and shares the same network interface as the application container. A sidecar container can obtain the logs either by pulling them from the application (for example, through an API endpoint designed for that purpose) or by scanning and parsing the log files that the application writes (remember, they share the same storage).

[Figure: sidecar container shipping the application container's logs]
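
As a sketch of the sidecar pattern (the names below are made up for illustration), the following pod runs an application container that writes to a log file on a shared emptyDir volume, and a sidecar that tails that file so the output becomes available to kubectl logs and to any node-level agent:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
  - name: app
    image: busybox
    # The application writes its log to a file instead of STDOUT
    args: [/bin/sh, -c,
      'while true; do echo "$(date) app log line" >> /var/log/app/app.log; sleep 1; done']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-sidecar
    image: busybox
    # The sidecar streams the shared log file to its own STDOUT
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/app/app.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs
    emptyDir: {}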

Using the application logic: this needs no Kubernetes support at all. You simply design the application so that it periodically sends its logs to the central log server. However, this approach is not recommended because it tightly couples the application to its log server: you have to generate logs in the specific format the server accepts, and if you decide to switch to another server, you have to modify the application code. If you instead hand log collection, parsing, and pushing to a sidecar container, you only need to change the sidecar image when choosing a different log server; the application container remains intact.

[Figure: application shipping its own logs directly to the log server]

What Is The ELK Stack?

The ELK stack is a popular log aggregation and visualization solution maintained by Elastic. “ELK” is an acronym for the following components:

Elasticsearch: this is where the data gets stored.

Logstash: the program responsible for transforming logs into a format suitable for storage in the Elasticsearch database.

Kibana: where you can communicate with the Elasticsearch API, run complex queries, and visualize the results to gain more insight into the data. You can also use Kibana to set up alerts that fire when a threshold is crossed. For example, you can get notified when the number of 5xx errors in the Apache logs exceeds a certain limit.

LAB: Collecting And Aggregating Logs In A Cloud-Native Environment Using Kubernetes And The ELK Stack

In this lab, we will demonstrate how we can use a combination of Kubernetes for container orchestration and the ELK stack for log collection and analysis with a sample web application. For this lab, you will need admin access to a running Kubernetes cluster and the kubectl tool installed and configured for that cluster.

Installing Elasticsearch

We start by installing the Elasticsearch component. We are going to create a service account to be used by this component. We don’t want to give it admin access; it only needs read access to services, namespaces, and endpoints. Let’s start by creating the necessary resources for this account: the service account, the cluster role, and the cluster role binding:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""

Save this definition to a file and apply it. For example:

$ kubectl apply -f rbac.yml
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created

Next, we need to deploy the actual Elasticsearch cluster. We use a StatefulSet for this purpose because Elasticsearch needs stable hostnames, network identity, and storage. Our StatefulSet definition may look as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: elasticsearch:6.8.4
        name: elasticsearch-logging
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: elasticsearch-logging
        emptyDir: {}
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true

I’m going to discuss the important parts of this definition only. 

The env section: we inject the namespace that Elasticsearch is running in through an environment variable, NAMESPACE, using the downward API to grab the name of the current namespace.

The volumes section: we are using the emptyDir volume type. In a real scenario, you may want to use persistent volumes.

The initContainers section: Elasticsearch requires the vm.max_map_count Linux kernel parameter to be at least 262144, so we use an init container that sets this parameter before the application starts. Setting kernel parameters requires root privileges and access to the kernel, so we set the privileged parameter to true.
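
Assuming you saved the StatefulSet to a file (the name below is arbitrary), apply it and watch both replicas come up:

kubectl apply -f elasticsearch-statefulset.yml
kubectl -n kube-system get pods -l k8s-app=elasticsearch-logging -w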

The last part we need here is the Service through which we can access the Elasticsearch databases. Add the following to a YAML file and apply it:

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging

Notice that we didn’t specify any means for external access through this Service. The Service type defaults to ClusterIP, which means it is accessible only from within the cluster.

Let’s apply that last definition to create the service. You should now be able to view the default welcome message of Elasticsearch by using port forwarding as follows:

kubectl port-forward -n kube-system svc/elasticsearch-logging 9200:9200

Now you can open your browser and navigate to localhost:9200, or simply run curl localhost:9200.
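
The root endpoint returns a JSON document with the node name, cluster name, and version number. You can also query the cluster health endpoint, which should report a green or yellow status once the pods are up:

curl localhost:9200/_cluster/health?pretty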

Installing Logstash

Logstash acts as an adapter that receives the raw logs and formats them in a way that Elasticsearch understands. The tricky part about Logstash lies in its configuration. The rest is just a Deployment that mounts the configuration from a ConfigMap, plus a Service that exposes Logstash to the other pods in the cluster. So, let’s spend a few minutes on the ConfigMap. Create a new file called logstash-config.yml and add the following lines to it:

apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-configmap
  namespace: kube-system
data:
  logstash.yml: |
    http.host: "0.0.0.0"
    path.config: /usr/share/logstash/pipeline
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }
    filter {
      grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
      }
      date {
        match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
      geoip {
        source => "clientip"
      }
    }
    output {
      elasticsearch {
        hosts => ["elasticsearch-logging:9200"]
      }
    }

The ConfigMap contains two files: logstash.yml and logstash.conf. The first has just two lines: it defines the network address on which Logstash will listen (0.0.0.0, meaning all available interfaces) and specifies where Logstash should find its pipeline configuration, /usr/share/logstash/pipeline. That path is where the second file (logstash.conf) is mounted, and it instructs Logstash on how to parse the incoming log data. Let’s have a look at the interesting parts of this file:

  • The input stanza tells Logstash where to get its data. The daemon listens on port 5044, and an agent (Filebeat in our case) pushes logs to this port.
  • The filter stanza specifies how logs should be interpreted. Logstash uses filters to parse and transform log data into a format Elasticsearch understands. In our example, we use grok. Explaining how the grok filter works is beyond the scope of this article, but you can read more about it in the Logstash documentation. We use one of the patterns grok provides out of the box, COMBINEDAPACHELOG, which parses Apache logs in the combined format. Since it is a popular log format, grok can automatically extract the key information from each line and convert it to JSON.
  • The date stanza parses the timestamp embedded in each log line (using the pattern we supply) and uses it as the event’s timestamp.
  • The geoip part enriches each event with geographical information derived from the client’s IP address, so we know where requests are coming from.
  • The output part defines the target to which Logstash forwards the parsed log data; in our lab, that is the Elasticsearch cluster. We can specify just the service name, without the namespace and the rest of the URL (as in elasticsearch-logging.kube-system.svc.cluster.local), because both resources live in the same namespace.

Let’s apply this configMap and create the necessary deployment.

Create a new file called logstash-deployment.yml and add the following lines to it:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash-deployment
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      containers:
      - name: logstash
        image: docker.elastic.co/logstash/logstash:6.8.4
        ports:
        - containerPort: 5044
        volumeMounts:
          - name: config-volume
            mountPath: /usr/share/logstash/config
          - name: logstash-pipeline-volume
            mountPath: /usr/share/logstash/pipeline
      volumes:
      - name: config-volume
        configMap:
          name: logstash-configmap
          items:
            - key: logstash.yml
              path: logstash.yml
      - name: logstash-pipeline-volume
        configMap:
          name: logstash-configmap
          items:
            - key: logstash.conf
              path: logstash.conf

The deployment uses the configMap we created earlier, the official Logstash image, and declares that it should be reached on port 5044. The last resource we need here is the Service that will make this pod reachable. Create a new file called logstash-service.yml and add the following lines to it:

kind: Service
apiVersion: v1
metadata:
  name: logstash-service
  namespace: kube-system
spec:
  selector:
    app: logstash
  ports:
  - protocol: TCP
    port: 5044
    targetPort: 5044
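
Apply the ConfigMap, the Deployment, and the Service, then confirm that the Logstash pod starts:

kubectl apply -f logstash-config.yml -f logstash-deployment.yml -f logstash-service.yml
kubectl -n kube-system get pods -l app=logstash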

Installing Filebeat Agent

Filebeat is the agent that we are going to use to ship logs to Logstash. We are using a DaemonSet for this deployment; a DaemonSet ensures that an instance of the pod is running on each node in the cluster. To deploy Filebeat, we need to create a service account, a cluster role, and a cluster role binding the same way we did with Elasticsearch. We also need a ConfigMap to hold the configuration that Filebeat uses to ship logs. I’ve combined all the required resources in one definition file that we’ll discuss:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      prospectors:
        # Mounted `filebeat-prospectors` configmap:
        path: ${path.config}/prospectors.d/*.yml
        # Reload prospectors configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false
    output.logstash:
      hosts: ['logstash-service:5044']
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-prospectors
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:6.8.4
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        securityContext:
          runAsUser: 0
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: prospectors
          mountPath: /usr/share/filebeat/prospectors.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: prospectors
        configMap:
          defaultMode: 0600
          name: filebeat-prospectors
      - name: data
        emptyDir: {}

Quite a long file but it’s easier than it looks. Again, we’ll discuss the important parts:

The first three resources: we create the necessary service account, cluster role, and cluster role binding with read-only access to the resources of interest (pods and namespaces).

The output.logstash setting in the filebeat-config ConfigMap: this tells Filebeat to ship the log data to Logstash. We specify the Service’s short URL since both resources live in the same namespace.

The varlibdockercontainers mount: among the filesystems that Filebeat has access to, we mount /var/lib/docker/containers. Notice that this volume’s type is hostPath, which means Filebeat reads this path on the node rather than inside its own container. Kubernetes uses this path on the node to write data about the containers; additionally, anything the containers write to STDOUT or STDERR is stored there in JSON format (the output remains viewable through the kubectl logs command, but a copy is kept at that path).
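
With Docker’s default json-file logging driver, each line a container writes is stored there as a small JSON object; for the counter pod from earlier, a stored line would look roughly like this (exact timestamps and file names will vary):

{"log":"0: Sat Nov  2 08:46:40 UTC 2019\n","stream":"stdout","time":"2019-11-02T08:46:40.000000000Z"}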

Apply the above definition file to the cluster.
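
Since Filebeat runs as a DaemonSet, one pod should be scheduled on every node; the -o wide flag shows which node each pod landed on:

kubectl -n kube-system get pods -l k8s-app=filebeat -o wide

Now the last part remaining in the stack is the visualization window, Kibana. Let’s deploy it.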

Installing Kibana

Kibana is just the UI through which you can execute simple and complex queries against the Elasticsearch database. Kibana needs to know the URL through which it can reach Elasticsearch; we’ll supply this through an environment variable. No further configuration is needed for this lab, so we are not using a ConfigMap. The definition file for Kibana may look as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: docker.elastic.co/kibana/kibana-oss:6.8.4
        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch-logging:9200
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/name: "Kibana"
spec:
  type: NodePort
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
    nodePort: 32010
  selector:
    k8s-app: kibana-logging

Let’s have a look at the interesting parts of this definition:

The ELASTICSEARCH_URL environment variable: this is how we point Kibana at Elasticsearch. As usual, we use the short form of the service URL.

The Service type and nodePort: the Service we are creating needs external exposure so that we can log in and view the logs, so we use the NodePort Service type and specify 32010 as the port number. This port is reachable on every node in your cluster. Note that, depending on your underlying infrastructure or cloud provider, you may need to open this port on the firewall.
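
To find an address to use, list your nodes and their IPs (on a cloud provider, use the external IP if the port is open):

kubectl get nodes -o wide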

Apply the above definition to the cluster, wait a few moments for the pod to be deployed, and navigate to http://<node_ip>:32010. You should see Kibana’s dashboard; click “Skip” to avoid loading the sample data. Now, click Discover in the left panel. You should see something similar to the following:

[Screenshot: Kibana prompting for an index pattern]

Type logstash* as the index pattern. This instructs Kibana to query Elasticsearch’s indices that match this pattern. Click “Next step”.

[Screenshot: defining the logstash* index pattern]

Select @timestamp as the Time Filter field name and click “Create index pattern”.

The index will get created in a few seconds. Now click on “Discover”. You should see something like the following:

[Screenshot: the Discover view showing all collected log data]

That’s a lot of data! The reason is that Filebeat ships all the log data that the node generates about the containers running on it. Let’s make things more interesting by deploying a sample web server and demonstrating how we can collect its logs from multiple pods at once.

Deploying A Sample Application: Apache Webserver

Applications should be designed so that they log their output and error messages to STDOUT and STDERR. As mentioned earlier, Docker (and Kubernetes in clustered environments) automatically keeps a copy of those logs on the node so that agents like Filebeat can ship them together with the node logs. The Apache image (httpd) follows this logging pattern, so we’ll deploy it as our sample application. The following definition contains the Deployment and Service resources necessary to bring the webserver up on multiple pods:
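
A minimal version of such a definition might look like the following; the replica count of three is an arbitrary choice, while the app: web label and the webserver Service name are the ones we rely on in the rest of the lab:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
  namespace: default
  labels:
    app: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: httpd
        image: httpd:2.4
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: webserver
  namespace: default
spec:
  selector:
    app: web
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80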

Apply this definition to your cluster. Now, let’s test and see if the webserver is running, and make a few requests to generate some log data. First, we need to use port-forwarding as this webserver is not publicly exposed:

kubectl -n default port-forward svc/webserver 8080:80

If you open the browser and navigate to localhost:8080, you should find the famous “It works!” message. Refresh the page a few times to increase the probability of having different pods responding to your requests.

Testing The Workflow

By now you should have five components running on your cluster: Apache, Filebeat, Logstash, Elasticsearch, and Kibana. In the previous step you made a few requests to the web server; let’s see how we can track them in Kibana.

Open the dashboard and make sure the time range covers at least the past hour, as shown:

[Screenshot: setting the time range in Kibana]

If you click on “Add a filter” on the left, you will see many fields that you can use to select the log messages we are interested in:

[Screenshot: the “Add a filter” dialog listing the available fields]

Since our web server pods carry the label app=web, we can select that in our filter as shown:

[Screenshot: filtering on the app=web label]


Of course, your output may differ but it should be close to the following:

[Screenshot: log entries matching the app=web filter]


Note that the graph displays how many log messages match our filter (those coming from resources labeled app=web) and when they occurred.

The message field displays the exact log line that was output by Apache. On its own this is not very useful, as we can always get the same output using the kubectl logs command. The real power of the ELK stack comes from its ability to aggregate logs from different sources. So, for example, we can count all the 404 errors that occurred in the last hour across all pods that serve our application, or on one specific pod. Let’s test that.

In your browser, generate several requests to http://localhost:8080/notfound. We do not have a file in our web directory called notfound, so Apache will respond with the appropriate 404 messages indicating that the requested resource was not found on the server.
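
You can also generate these requests from the command line while the port-forward is still running:

for i in $(seq 1 20); do curl -s -o /dev/null http://localhost:8080/notfound; done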

In Kibana, while you still have the previous filter applied, add the following filter:

[Screenshot: adding a filter for 404 responses]

By clicking Save, you apply this filter to the data that you have. You should see something similar to the following:

[Screenshot: 404 responses plotted over time]

The graph displays the number of 404 responses and when they occurred. The log line itself shows which file was requested and not found. You also have additional fields for narrowing down the selection even further, such as the node name, the container name, and the pod name.

About ELK Stack Components Compatibility

You may have noticed that we used the same major and minor version numbers when deploying the ELK stack components, so that all of them are versioned 6.8.4. This is intentional: the ELK stack components work with each other as long as you follow the compatibility matrix. You are strongly encouraged to review the Elastic Support Matrix document before attempting to deploy the ELK stack in your environment.

Production Environment Security Considerations

We wanted this lab to be as simple as possible, so we left out additional configuration that would have distracted from the core concepts we wanted to deliver. However, in production environments, you should consider the following topics that we did not cover:

  • Elasticsearch: as of the time of this writing, the deployment we used has no authentication mechanism. You may want to add a reverse proxy that implements basic authentication to protect the cluster (even if it is not publicly exposed).
  • Kibana has its own methods of authentication. So, you can use that or you can add another reverse proxy server in front of it with basic authentication.
  • Once authentication is enabled, different services may need credentials to contact each other. Those credentials should be stored in Secrets.
  • In our lab, we used the NodePort service type to expose our Kibana service publicly. Using NodePort has its own shortcomings because node failure detection needs to be implemented on the client-side. A better solution is to use a Load Balancer or an Ingress controller.
  • You should implement SSL on any publicly accessible endpoint. If you are using a cloud provider’s Load Balancer, this may already be handled by the service. You can also use your own certificates if you prefer.

TL;DR

  • Logging has always been a top priority that should be addressed in the earliest design stages. With logging, you can not only diagnose bugs and gain insight into how the system is behaving, but also spot potential issues before they occur.
  • In non-cloud-native environments, logging was not much of an issue because each component had a well-defined location. For example, the webserver is hosted on machine A, the application is on machine B and the database is on machine C. You could easily collect and identify which logs are coming from which source.
  • In microservices-dominated environments, logging and log collection must be done differently. You’re no longer aware of which specific node or pod responded to which web request; maybe some requests fail on one pod but are handled normally on another. For that reason, we use log-aggregation systems like the ELK stack.
  • ELK is an open-source project maintained by Elastic. It consists of three components: the Elasticsearch database, the Logstash adapter, and the Kibana UI.
  • We can easily deploy the ELK stack on Kubernetes using a StatefulSet for Elasticsearch, ConfigMaps for the necessary configuration, and the required service account, cluster role, and cluster role binding.
  • The ELK stack works by receiving log data from different sources. For a source to send its logs, it needs an agent. Many agents can fill this role, such as Logstash, Fluentd, and Filebeat.
  • Once the data is stored in Elasticsearch, you can use Kibana to run queries against the database. Through visualizations, you can gain observability into different aspects of the running application. For example, in the lab, we were able to determine the frequency of 404 error responses regardless of the node, pod, or container they originated from.
