Environment Variables Configuration Pattern
Want to learn about Kubernetes patterns? This post covers the environment variables configuration pattern.
Every application needs external configuration at some point: for example, the error verbosity level of a PHP script, the output format of a Python Flask API (XML or JSON), or the cookie name of a generic web application.
It’s best practice not to hardcode configuration settings in the application code, as this requires redeploying (or even rebuilding) the solution whenever a value needs to change. With the introduction of containerization and cloud-native apps, externalizing configuration has become even more important. “There’s always more than one way to skin a cat” as the proverb goes. In this article, we will discuss the best practices for injecting configuration data into a containerized application. The container orchestration system we are going to use in this demonstration is Kubernetes.
Using Environment Variables
One way of injecting configuration data into a container is through environment variables. Every modern operating system supports storing key-value pairs (variables) and making them available to any running application. Virtually every programming language has a way to retrieve environment variables from the OS.
You’d normally set some sane defaults in your application that can be overridden “if needed” through an environment variable. Consider the following Python script:
import os
slack_channel = os.getenv("SLACK_CHANNEL","#general")
print(slack_channel)
Here we use the os module to read the environment variable SLACK_CHANNEL. If this variable is not defined, we default to #general. The Dockerfile for this application may look like this:
FROM jfloff/alpine-python
ENV SLACK_CHANNEL="#mychannel"
COPY ./app.py /app.py
ENTRYPOINT [ "python","/app.py" ]
Notice that the ENV instruction in the Dockerfile overrides the default value specified in the Python script. So, if we build and run this container, we receive “#mychannel” as follows:
$ docker build -t envar .
Sending build context to Docker daemon 3.072kB
Step 1/4 : FROM jfloff/alpine-python
---> 26f2afad7d1b
Step 2/4 : ENV SLACK_CHANNEL="#mychannel"
---> Using cache
---> 7fb49becc956
Step 3/4 : COPY ./app.py /app.py
---> Using cache
---> ddcb0a013b67
Step 4/4 : ENTRYPOINT [ "python","/app.py" ]
---> Using cache
---> eaa1915dcdf5
Successfully built eaa1915dcdf5
Successfully tagged envar:latest
$ docker run envar
#mychannel
Now, if you don’t have access to the Dockerfile of this image, or you simply don’t want to rebuild the whole image just to change the Slack channel name, you can override the variable with the command-line option -e as follows:
$ docker run -e "SLACK_CHANNEL=#newchannel" envar
#newchannel
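The same default-with-override idea extends naturally to several settings at once. The following is a minimal sketch, not a definitive implementation: the Settings class, the load_settings function, and the TIMEOUT_SECONDS variable are hypothetical names introduced for illustration, while SLACK_CHANNEL and LOG_LEVEL are the variables used elsewhere in this article.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the application's configuration."""
    slack_channel: str
    log_level: str
    timeout_seconds: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, falling back to defaults baked into the code."""
    return Settings(
        slack_channel=env.get("SLACK_CHANNEL", "#general"),
        log_level=env.get("LOG_LEVEL", "info"),
        # Environment values are always strings, so convert explicitly.
        # TIMEOUT_SECONDS is an assumed variable, not part of the original app.
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "30")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment in as a mapping keeps the function easy to test: you can call load_settings({"SLACK_CHANNEL": "#newchannel"}) without touching the real process environment.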
Kubernetes Env, ConfigMaps, and Secrets
Kubernetes follows the same approach as Docker in enabling you to inject environment variables at runtime. However, it takes the mechanism one step further to allow for more flexibility. So, if you want to specify environment variables when running a Pod, you can use:
- The env stanza: the most basic form. Just add a key and a value.
- configMaps: allow you to inject multiple values at once from a single object. You can either expose the values as environment variables or mount the whole configuration file through a volume.
- Secrets: work in a similar manner to configMaps, but are better suited to sensitive data like passwords, API keys, etc.
Consider the following configMap definition:
kind: ConfigMap
apiVersion: v1
metadata:
  name: appconfig
data:
  channel: mychannel
  slackConfig: |
    {"webhook":"https://hooks.slack.com/services/T00/B000/XXX"}
In the above definition, we created two keys. Notice that you can use the | sign to add more complex content like JSON objects (the | sign is YAML’s literal block indicator, which preserves the text that follows as-is). Now, we can create a Pod that makes use of this configMap as follows:
kind: Pod
apiVersion: v1
metadata:
  name: app
spec:
  containers:
  - name: app
    image: bash
    env:
    - name: LOG_LEVEL
      value: info
    command: ["bash","-c","sleep 1000000"]
    envFrom:
    - configMapRef:
        name: appconfig
Once the Pod is running, you can double-check that we have our variables available:
$ kubectl exec -it app -- bash
bash-5.0# env
--- output truncated ---
LOG_LEVEL=info
channel=mychannel
slackConfig={"webhook":"https://hooks.slack.com/services/T00/B000/XXX"}
--- output truncated ---
In this example, we made use of the env stanza and the configMaps to inject different variables into the container’s environment.
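Inside the application, a structured value such as slackConfig still arrives as a plain string and has to be parsed. A minimal sketch, assuming the variable holds the JSON object shown above; the slack_webhook function name is hypothetical:

```python
import json
import os
from typing import Mapping


def slack_webhook(env: Mapping[str, str] = os.environ) -> str:
    """Return the webhook URL stored as JSON in the slackConfig variable."""
    raw = env["slackConfig"]  # raises KeyError if the variable is missing
    # json.loads ignores surrounding whitespace, so the trailing newline
    # kept by the YAML | block does not matter here.
    config = json.loads(raw)
    return config["webhook"]
```

Inside the Pod, calling slack_webhook() with no arguments reads from the real environment.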
In general, you should use env if you have a few simple variables that can be tightly coupled with the Pod definition, such as the log verbosity level of your application.
On the other hand, configMaps are more suited to complex configurations. For example, you can load php.ini or package.json files into a configMap and inject them into the container. In this specific case, it’d be better to expose the files as volumes instead of environment variables. For more details on how to use configMaps, please refer to the official Kubernetes documentation.
Finally, you should use Secrets whenever you need to inject sensitive information into the container. For example, database passwords, private SSH keys, and certificates should all go into Secrets instead of configMaps. Like configMap values, Secret values can be exposed to the container through environment variables or mounted volumes. However, for security reasons, you should always use the mounted volumes option when using Secrets, since environment variables are easily exposed through logs, debugging output, and child processes. For more information about how to configure and use Kubernetes Secrets, please refer to our article Kubernetes Secrets 101.
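When a Secret (or a configMap) is mounted as a volume, each key shows up as a file whose content is the value. The sketch below shows one way an application might read such a directory. The read_mounted_secrets function and any mount path you pass to it (for example /etc/secrets) are assumptions made for illustration, not something defined earlier in this article.

```python
from pathlib import Path
from typing import Dict


def read_mounted_secrets(mount_dir: str) -> Dict[str, str]:
    """Map each file name under mount_dir to its text content."""
    secrets: Dict[str, str] = {}
    for entry in Path(mount_dir).iterdir():
        # Skip dot-prefixed bookkeeping entries and anything that is not a file.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        # Content is returned untouched; strip it yourself if needed.
        secrets[entry.name] = entry.read_text()
    return secrets
```

Reading the files when the value is needed, rather than copying everything into the environment at startup, keeps the sensitive values out of the process environment.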
Should I Supply “default” Configuration Values?
Providing sane defaults for your environment variables is usually a good idea. Think of an application with dozens of options that need to be set before it starts.
Forcing your users to fill in each and every variable is a heavy burden. Additionally, not all the variables may be known to users. Sometimes, you become aware of a variable’s existence only when you need to change it, and until then you may have no idea what value the setting should hold.
However, setting default values for all your variables may not be your best option. As a rule of thumb, you should not set default values for variables that are likely to change. For example, the default username for an external API is likely to change. Modifying a default setting can be a costly process, because it requires redeploying or even rebuilding the application. It’s better to require the user to supply the credentials when using the application, and to fail early when a value is missing.
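Failing early on a missing value can be as simple as a small helper that is called during startup. This is a sketch under stated assumptions: MissingConfigError, require_env, and the API_USERNAME variable name are hypothetical and serve only to illustrate the idea.

```python
import os
from typing import Mapping


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of a required variable, or raise if it is not set."""
    value = env.get(name)
    if not value:
        raise MissingConfigError(
            f"required environment variable {name} is not set"
        )
    return value
```

Calling require_env("API_USERNAME") at the top of the program, before any real work begins, makes a misconfigured container stop immediately with a clear message instead of failing later in an unrelated place.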
When Not to Use Environment Variables
Environment variables can be set in many places: at the programming language level, in the OS, in the container image, and in the Pod definition. As such, it becomes hard to find where an environment variable is coming from if it was defined in more than one place. Questions like which level overrides which, and where to set the variable so that it takes effect, add to the problem.
Another potential disadvantage of using environment variables is that they must be set before the application starts. Once the application is running, the only way to change an already-set environment variable from the outside is to restart the container. I call this drawback “potential” because, in some scenarios, this is the desired behavior. Some people prefer to tightly couple the container with its variables, making the deployment immutable: it cannot be altered except by dropping the existing containers and creating new ones with the new settings. A Deployment controller with a rolling update strategy is a perfect candidate here. This pattern ensures that you’re always aware of the “current” configuration in place.
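The restart requirement comes from how processes work: a process receives its environment when it starts. The sketch below imitates this locally by launching the one-line application from earlier as a child process. The run_app helper is a hypothetical name, and starting a new child stands in for recreating a container.

```python
import os
import subprocess
import sys
from typing import Mapping

# The "application": reads SLACK_CHANNEL once and prints it.
APP_CODE = 'import os; print(os.getenv("SLACK_CHANNEL", "#general"))'


def run_app(overrides: Mapping[str, str]) -> str:
    """Start a fresh process with overrides applied; return what it prints."""
    env = {**os.environ, **overrides}
    result = subprocess.run(
        [sys.executable, "-c", APP_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Each new value requires a new process, much like a new container.
    print(run_app({"SLACK_CHANNEL": "#mychannel"}))
    print(run_app({"SLACK_CHANNEL": "#newchannel"}))
```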
TL;DR
- It has always been a best practice to keep an application’s configuration outside the code, even before containerization gained traction.
- With Docker and Kubernetes, you can inject configuration data through environment variables.
- Kubernetes uses ConfigMaps to expose configuration to Pods and their containers as environment variables. Secrets work the same way but are meant for sensitive information; critical data should never be stored in environment variables.
- You should supply default values for environment variables as long as they’re not likely to change frequently.
- Environment variables are fine on a small scale. When you have more complex configuration scenarios, you should opt for other means of data injection like mounted volumes.
*The outline of this article is inspired by the book Kubernetes Patterns by Roland Huss and Bilgin Ibryam.