In our ongoing series on the most frequently asked questions from the Kubernetes community meetings, we are going to look at how to configure storage for bare metal installations. Much like the problems with defining ingress and routing traffic for bare metal, you obviously can’t rely on the convenient services that are available from the major cloud providers to provide persistent storage volumes for your stateful applications.

On the other hand, you don’t want to fall into the trap of having to look after your persistent volumes like pets. But let’s back up a little bit and explain why state can be a problem with Kubernetes and why you even need to consider storage management for your application.

Why is state so tricky?

Everyone working with Kubernetes knows that your containers should be stateless and immutable. But in reality we all know that there is really no such thing as a stateless architecture. If you want to do something useful with your applications, data needs to be stored somewhere, and be accessible by some services.

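Containers, however, are ephemeral: when a Pod crashes or is rescheduled, anything written to the container’s own filesystem is lost. This means you need a solution that keeps that data available after the Pod recovers. The basic idea behind storage management is to move the data outside of the Pod so that it can exist independently.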

In Kubernetes, data is kept in a volume, which allows the state of a service to persist across container restarts and, depending on the volume type, across Pods. The Kubernetes documentation on Volumes explains that on-disk files within a container are ephemeral unless they are stored on a volume.

emptyDir

Kubernetes exposes multiple kinds of volumes, the most basic of which is the empty volume, `emptyDir`. An `emptyDir` volume is created empty when a Pod is assigned to a node, and it is backed either by the node’s own storage (such as an SSD) or, optionally, by RAM. Because this storage lives on the node, the volume only exists for as long as the Pod runs on that node. A container crash does not remove the data, but if the Pod is removed from the node or the node goes down, the contents of the `emptyDir` are erased.

The YAML for this type of definition (and any other volume definition for that matter) looks as follows:

apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir: {}

(from Kubernetes Docs: Volumes)
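The RAM-backed variant mentioned above can be sketched as a small change to the same definition. This is an illustrative sketch, not taken from the docs example: the Pod name `test-pd-memory` and the 64Mi limit are assumptions chosen for the example, while `medium` and `sizeLimit` are standard `emptyDir` fields.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd-memory            # hypothetical name for this sketch
spec:
  containers:
  - image: k8s.gcr.io/test-webserver   # same image as the example above
    name: test-container
    volumeMounts:
    - mountPath: /cache
      name: cache-volume
  volumes:
  - name: cache-volume
    emptyDir:
      medium: Memory              # back the volume with tmpfs (RAM) instead of node disk
      sizeLimit: 64Mi             # assumed limit; data in RAM counts against container memory
```

As with the disk-backed version, the contents are gone once the Pod is removed from the node.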

hostPath

If you don’t want your directory to start out empty, you can use a hostPath instead. A hostPath volume is defined in the Pod’s YAML in essentially the same way, but instead of creating a new empty directory it mounts an existing file or directory from the node’s filesystem into the Pod. This means that if the Pod goes down, its data is still preserved on the node. The catch is that the data stays tied to that node: a Pod that is rescheduled onto a different node will not see it.
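A minimal sketch of a hostPath definition might look like the following. The Pod name, volume name, and the `/data` path on the node are assumptions made for illustration; `type: Directory` asks Kubernetes to require that the directory already exists on the node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-hostpath             # hypothetical name for this sketch
spec:
  containers:
  - image: k8s.gcr.io/test-webserver   # same image as the emptyDir example
    name: test-container
    volumeMounts:
    - mountPath: /test-data       # where the node directory appears inside the container
      name: host-volume
  volumes:
  - name: host-volume
    hostPath:
      path: /data                 # assumed directory on the node's filesystem
      type: Directory             # the directory must already exist on the node
```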

Public Cloud Volumes

If you are using one of the public clouds, you can take advantage of services such as awsElasticBlockStore or gcePersistentDisk (or something similar) as your storage volumes, and most teams running Kubernetes in the public cloud do it this way. With most of these cloud volume services, all that is necessary is a YAML definition that tells the Pod which provider and volume to connect to.
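As an illustration of what such a definition involves, here is a sketch of a Pod that mounts an AWS EBS volume directly. The Pod and volume names are hypothetical, and the volume ID is left as a placeholder because it must refer to a volume that already exists in your account.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-ebs                  # hypothetical name for this sketch
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-ebs
      name: ebs-volume
  volumes:
  - name: ebs-volume
    awsElasticBlockStore:
      volumeID: "<volume-id>"     # placeholder: ID of an existing EBS volume
      fsType: ext4                # filesystem type on that volume
```

Recent Kubernetes releases have moved these in-tree cloud volume types to CSI drivers, so treat this as an illustration of the shape of the definition rather than a current recommendation.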

The problem with connecting directly to the volume in this way is that developers must know the specific volume ID and the filesystem type before they can connect to it. This is a lot of low-level detail for developers to keep track of, and with a large development team it can create a management mess, not to mention a possible security risk. This is where a Persistent Volume Claim (or PVC) comes in, which provides an abstraction layer on top of those details. But before we get to that, let’s first have a look at how you can use an NFS mount for your data.

NFS

NFS, the Network File System, is a protocol with its roots in UNIX that allows you to mount a file system exported by a remote server over the network. The server and the exported path can be defined in a YAML file and then connected to and mounted as your volume.
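A direct NFS mount can be sketched as follows. The server address `nfs.example.internal` and the export path `/exports/data` are hypothetical and stand in for whatever your NFS server actually exports.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-nfs                  # hypothetical name for this sketch
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /mnt/nfs
      name: nfs-volume
  volumes:
  - name: nfs-volume
    nfs:
      server: nfs.example.internal   # hypothetical NFS server hostname or IP
      path: /exports/data            # hypothetical exported directory
      readOnly: false
```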

If a Pod goes down or is removed, an NFS volume is simply unmounted; the data is still available and, unlike an emptyDir, it is not erased. However, if you take a look at the NFS example in the documentation, it recommends creating a Persistent Volume Claim first rather than mounting the NFS volume directly.

Persistent Volume Claims

With a Persistent Volume Claim, the Pod can connect to volumes wherever they are through a series of abstractions. The abstractions can provide access to underlying cloud-provided back-end storage volumes or, in the case of bare metal, on-prem storage volumes.

An advantage of doing it this way is that an Administrator can define the abstraction layer. Developers can then request storage through the Kubernetes API without needing to know the volume ID or the filesystem type. This additional abstraction layer on top of the physical storage is a convenient way to separate Ops from Dev: developers simply use a PVC to access the storage that they need while developing their services.

These are the parts to a persistent volume claim:

  • Persistent Volume Claim - a request for storage that can be mounted to a Pod without having to know the backend provider of the volume.
  • Persistent Volume - the specific volume that satisfies the claim, either provisioned in advance by an Administrator or created dynamically. These are not tied to a particular Pod and are managed by Kubernetes.
  • Storage class - allows dynamic storage allocation, which is the preferred ‘self serve’ method for developers. Classes are defined by administrators.
  • Physical storage - the actual volume that is being connected to and mounted.
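To make these parts concrete, here is a sketch of a statically provisioned volume and a claim against it. All names, sizes, and the NFS server details are assumptions for illustration. The class name `manual` is just a label that the claim and the volume share so that they bind to each other; no StorageClass object with a provisioner is needed for that.

```yaml
# Created by an administrator: a Persistent Volume backed by physical storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv                # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain   # keep the data when the claim is deleted
  storageClassName: manual
  nfs:
    server: nfs.example.internal  # hypothetical NFS server
    path: /exports/data           # hypothetical export
---
# Created by a developer: a request for storage, with no backend details.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim             # hypothetical name
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: manual        # must match the volume's class to bind
  resources:
    requests:
      storage: 5Gi
---
# The Pod refers only to the claim, never to the underlying volume.
apiVersion: v1
kind: Pod
metadata:
  name: test-pvc                  # hypothetical name
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /data
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: example-claim
```

The developer-facing objects (the claim and the Pod) contain no server address or volume ID, which is the separation between Ops and Dev described above.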

So how do I configure storage for bare metal Kubernetes?

Local persistent volumes, which let you use local storage PVCs for StatefulSets, reached beta in Kubernetes 1.10 (and went GA in 1.14). Stefan Prodan (https://twitter.com/stefanprodan), Weaveworks community engineer, has written this useful step-by-step tutorial on how to use kubeadm to spin up your cluster and then configure it to use local SSD persistent volumes for StatefulSets. The diagram below shows the basic architecture of this type of configuration on bare metal servers.


According to Stefan, using local SSD storage works well for HA-capable databases like MongoDB and Elasticsearch, and he further clarifies that “if your statefulsets are not HA you could also use Rook for your volumes.”
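For orientation, a local SSD volume of this kind is typically described with a storage class that has no dynamic provisioner plus one Persistent Volume per disk. The sketch below is not taken from Stefan’s tutorial; the class name `local-ssd`, the node name `node1`, the mount path, and the capacity are all assumptions.

```yaml
# Storage class for pre-created local volumes (no dynamic provisioning).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd                 # hypothetical class name
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer   # delay binding until a Pod is scheduled
---
# One Persistent Volume per local disk, pinned to the node that owns the disk.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1            # hypothetical name
spec:
  capacity:
    storage: 100Gi                # assumed size of the SSD
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-ssd
  volumeMode: Filesystem
  local:
    path: /mnt/disks/ssd0         # assumed mount point of the SSD on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node1                 # assumed node name
```

A StatefulSet would then reference `storageClassName: local-ssd` in its `volumeClaimTemplates`, and each replica needs its own local Persistent Volume. Because a local volume pins the Pod to one node, this setup relies on the application itself for replication, which is why it suits HA-capable databases.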

There are also many other third-party plugins that you can explore in the Kubernetes docs, or you can take a look at this list of storage resources from mhausenblas.

For additional information on stateful applications, see the “StatefulSet Basics” tutorial.

Final Thoughts

In this post, I went over some of the problems with stateful applications running on Kubernetes. I then provided a brief sampling of the different types of volumes that are available and how they operate, and described how Persistent Volume Claims make it easier to manage persistent data with a large development team. Lastly, I linked to a tutorial written by Stefan Prodan that describes how to configure persistent volumes for bare metal servers running Kubernetes.

A note about ingress controllers for on-premise Kubernetes:
An astute reader pointed out to us that in our last Kubernetes FAQ on bare metal ingress, we completely forgot to include the mature project Træfik and talked about MetalLB instead! Thanks for pointing that out and here is a link to their docs on how to use it. 

As always, we love to hear from you. If you have any comments or suggestions, don’t hesitate to contact us on Slack or on Twitter.

Need help?

For the past 3 years, Kubernetes has been powering Weave Cloud, our operations as a service offering. We’re happy to share our knowledge and help teams embrace the benefits of on-premise installations of Kubernetes. Contact us for more details on our Kubernetes support packages.