What is GitOps?
GitOps is a way to do Continuous Delivery. It works by using Git as a single source of truth for declarative infrastructure and applications.
With Git at the center of your delivery pipelines, every developer can make pull requests and use Git to accelerate and simplify both application deployments and operations tasks to Kubernetes. By using familiar tools like Git, developers are more productive and can focus their attention on creating new features rather than on operations tasks.
GitOps: versioned CI/CD on top of declarative infrastructure. Stop scripting and start shipping. https://t.co/SgUlHgNrnY
— Kelsey Hightower (@kelseyhightower) January 17, 2018
Key benefits of GitOps
With GitOps, automated delivery pipelines roll out changes to your infrastructure when changes are made to Git. But the idea of GitOps goes further than that: it uses tools to compare the actual production state of your whole application with what’s under source control, and then it tells you when your running cluster doesn’t match what is described in Git.
Applying GitOps best practices gives you a ‘source of truth’ for both your infrastructure and application code. This allows development teams to increase their velocity and improve system reliability.
The benefits of applying GitOps best practices are far reaching and provide:
- a model for secure, cloud native CI/CD pipelines
- faster Mean Time to Deployment and Mean Time to Recovery
- stable and reproducible rollbacks (for example, revert/rollback/fork as per Git)
- an overall coherent approach to understanding, observing and managing apps when combined with modern monitoring & observability tools
GitOps is Continuous Delivery meets Cloud Native
GitOps uses ideas drawn from DevOps and Site Reliability Engineering, which started with Martin Fowler’s comprehensive Continuous Integration overview in 2006. In some sense GitOps iterates on those original concepts.
As a methodology for CI/CD pipelines, GitOps has been described as the holy grail of development processes. Because no single tool can do everything required in your pipeline, GitOps gives you the freedom to choose the best tools for the different parts of the pipeline. You can select a set of tools from the open source ecosystem or from closed source or, depending on your use case, you may even combine them. The most difficult part of creating a pipeline is gluing all of the pieces together.
Whatever you choose for your delivery pipeline, applying GitOps best practices with Git (or any version control system) should be an integral component of your process. Doing so will make building and adopting continuous delivery in your organization easier, not only from a technical point of view but also from a cultural perspective.
#1. Everything that can be described must be stored in Git
By using Git as the source of truth, it is possible to observe your cluster and compare it with the desired state. The goal is to describe everything (policies, code, configuration, and even monitored events) and to version control it all. Keeping everything under version control enforces convergence: changes can be reapplied if at first they didn’t succeed.
#2. Kubectl should not be used directly
As a general rule, it’s not a good idea to deploy directly to the cluster using the command line utility kubectl. Many people let their CI tool drive deployment, but by doing that you’re potentially giving a system that is a well-known attack target access to production.
#3. Use a Kubernetes controller that follows an operator pattern
With a Kubernetes controller that follows the operator pattern, your cluster always stays in sync with ‘the source of truth’ via its configuration files that are checked into Git. And since the desired state of your cluster is kept in Git, it can also be observed for differences against the running cluster.
- GitOps: Operations by Pull Request
- The GitOps Pipeline - Part 2
- GitOps: ‘Git Push’ all the things (The New Stack)
- The Best CI/CD Tool for Kubernetes Doesn’t Exist
Git enables infrastructure as code (IAC) tools
Kubernetes is just one example of many modern cloud native tools that are “declarative” and that can be treated as code. Declarative means that configuration is guaranteed by a set of facts instead of by a set of instructions, for example, “there are ten redis servers”, rather than “start ten redis servers, and tell me if it worked or not”.
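The difference between stating a fact and issuing instructions can be illustrated with a minimal Python sketch. The `DesiredState` class and `plan` function are hypothetical names invented for this illustration; they are not part of Kubernetes or any other tool. The declared fact stays the same, and the work needed to reach it is derived from whatever is currently running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesiredState:
    """A declared fact, e.g. "there are ten redis servers" (hypothetical type)."""
    service: str
    replicas: int


def plan(desired: DesiredState, running: int) -> list[str]:
    """Return the actions needed to make the running count match the declared fact."""
    delta = desired.replicas - running
    if delta > 0:
        return [f"start {desired.service}"] * delta
    if delta < 0:
        return [f"stop {desired.service}"] * (-delta)
    return []  # already converged: nothing to do


desired = DesiredState(service="redis", replicas=10)
print(plan(desired, running=7))   # three "start redis" actions
print(plan(desired, running=10))  # empty list, the fact already holds
```

Because the same declaration yields different actions depending on the live state, it can be applied repeatedly without side effects, which is what makes it safe to keep in version control and re-apply.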
With declarative tools, your entire set of configuration files can be version controlled in Git. By using Git as the source of truth, your apps are more easily deployed to Kubernetes and rolled back. And even more importantly, when disaster strikes, your cluster’s infrastructure can be dependably and quickly reproduced, all from Git.
IAC tools vs. GitOps
Infrastructure as Code tools that can provision servers on demand have existed for quite some time. These tools originated the concept of keeping infrastructure config versioned, backed up and reproducible from source control.
But now with Kubernetes being almost completely declarative, combined with the immutable container, it is possible to extend some of these concepts to managing applications and their operating system as well.
The GitOps philosophy and its best practices come down to the ability to manage and compare the current state of both your infrastructure and your applications, so that you can test, deploy, roll back and roll forward with a complete audit trail, all from Git. This is possible because Kubernetes is managed almost entirely through declarative config and because containers are immutable.
At Weaveworks, we use Terraform and Ansible to provision servers. We also keep those configuration files backed up and versioned in Git. IAC tools and their associated configuration files form a central part of our GitOps workflows for near-instant cluster recovery if disaster strikes here at Weaveworks.
What if my system diverges from the source of truth?
Declarative provisioning tools let you describe your desired true state in Git. But they suffer from the problem that what is “really true right now” is in the live system, and that may differ from what is described in source control.
How do you know if the live system has converged to the desired state?
Can you get notified when this differs?
What is the “canary in the coal mine” that informs you when you’re in trouble?
How do you trigger convergence between the cluster and source control?
There is prior art here.
IAC tools like Chef, Puppet and Ansible support features like “diff alerts”. These help operators to understand when action may need to be taken to “converge” the live system to the intended state (as defined by the configuration scripts). More recently, best practice is to deploy immutable images (e.g. containers) so that divergence is less likely.
In the “GitOps” model, we use Git to solve for divergence and convergence, aided by a set of “diff” and “sync” tools (kubediff, as well as terradiff and ansiblediff) that compare the intended state with actual state.
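The core idea behind such “diff” tools can be sketched in a few lines of Python. This is an illustration only and makes no claim about how kubediff, terradiff or ansiblediff are implemented; the function name `diff_state` and the example manifests are assumptions made for this sketch.

```python
def diff_state(desired: dict, actual: dict, path: str = "") -> list[str]:
    """Recursively compare two nested dicts and describe each difference."""
    differences = []
    for key in sorted(set(desired) | set(actual)):
        location = f"{path}.{key}" if path else str(key)
        if key not in actual:
            differences.append(f"{location}: missing in live system")
        elif key not in desired:
            differences.append(f"{location}: not declared in Git")
        elif isinstance(desired[key], dict) and isinstance(actual[key], dict):
            # Both sides are nested: descend and keep track of the path.
            differences.extend(diff_state(desired[key], actual[key], location))
        elif desired[key] != actual[key]:
            differences.append(
                f"{location}: want {desired[key]!r}, have {actual[key]!r}"
            )
    return differences


# Hypothetical intended state (from Git) and observed state (from the cluster).
desired = {"deployment": {"image": "app:1.4.0", "replicas": 3}}
actual = {"deployment": {"image": "app:1.3.9", "replicas": 3, "debug": True}}

for line in diff_state(desired, actual):
    print(line)
```

An empty result means the live system has converged; a non-empty result is the kind of signal that could drive an alert or trigger a sync.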
GitOps builds on immutable infrastructure
GitOps takes full advantage of the move towards immutable infrastructure and declarative container orchestration. We manage multiple deployments a day at Weaveworks. In order to minimize the risk of change after a deployment, whether intended or by accident via “configuration drift”, it is essential that we maintain a reproducible and reliable deployment process.
Our whole system’s desired state (aka “the source of truth”) is described in Git. We use containers for immutability as well as different cloud native tools like Terraform and Ansible to automate and manage our configuration. These tools, together with containers and the declarative nature of Kubernetes, provide what we need for a complete recovery in the case of an entire meltdown.
- GitOps for Kubernetes: A DevOps Iteration Focused on Declarative Infrastructure
- Why we use Terraform and not Chef, Puppet, Ansible, SaltStack, or CloudFormation
Works in tandem with IAC tools
When you apply GitOps principles to “everything”, machine configuration, applications and services, as well as alerting rules and dashboards, are all kept under source control.
Access to the running system should not be required except via Git. Any group of changes may be applied atomically, and diffed accordingly. The Git record is then not just an audit log but also a transaction log that you can use to roll back and forth to any snapshot.
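To make the “transaction log” idea concrete, here is a small Python sketch. It models history as an append-only list of snapshots, which is a deliberate simplification of what Git stores; the `ConfigHistory` class and its methods are hypothetical and exist only for this illustration.

```python
import copy


class ConfigHistory:
    """An append-only list of configuration snapshots, loosely modelled on commits."""

    def __init__(self, initial: dict):
        self._snapshots = [copy.deepcopy(initial)]
        self._messages = ["initial state"]

    @property
    def head(self) -> dict:
        """Return a copy of the latest snapshot."""
        return copy.deepcopy(self._snapshots[-1])

    def commit(self, changes: dict, message: str) -> int:
        """Apply a group of changes as one unit and return the new revision number."""
        snapshot = self.head
        snapshot.update(changes)
        self._snapshots.append(snapshot)
        self._messages.append(message)
        return len(self._snapshots) - 1

    def roll_to(self, revision: int) -> int:
        """Restore any recorded snapshot by appending it as a new revision."""
        self._snapshots.append(copy.deepcopy(self._snapshots[revision]))
        self._messages.append(f"restore revision {revision}")
        return len(self._snapshots) - 1

    def log(self) -> list[str]:
        return [f"{i}: {m}" for i, m in enumerate(self._messages)]


history = ConfigHistory({"image": "app:1.0", "replicas": 2})
history.commit({"image": "app:1.1", "replicas": 4}, "release 1.1 and scale up")
history.roll_to(0)

print(history.head)   # {'image': 'app:1.0', 'replicas': 2}
print(history.log())  # three entries, including the rollback itself
```

Note that the rollback is recorded as a new entry rather than erasing history, so the log remains a complete audit trail of what was changed and when it was undone.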
Continuous Delivery and GitOps Workflows in Weave Cloud
Weave Cloud is designed specifically for version controlled systems and declarative application stacks. Every developer on your team is likely familiar with Git and can make pull requests. Now they can use Git to accelerate and simplify application deployments to Kubernetes as well.
Here is a typical developer workflow for creating or updating a new feature:
- A pull request for a new feature is pushed to GitHub for review.
- The code is reviewed and approved by a colleague. After the code is revised and re-approved, it is merged to Git.
- The Git merge triggers the CI and build pipeline, which runs a series of tests, eventually builds a new image and deposits the new image in a registry.
- The Weave Cloud ‘Deployment Automator’ watches the image registry, notices the image, pulls the new image from the registry and updates its YAML in the config repo.
- The Weave Cloud ‘Deployment Synchronizer’ (installed to the cluster) detects that the cluster is out of date. It pulls the changed manifests from the config repo and deploys the new feature to production.
GitOps deployment pipeline:
Kubernetes controller implemented with the operator pattern
Weave Cloud implements a custom controller to listen for and synchronize deployments to your Kubernetes cluster. The controller is implemented using the operator pattern, which is significant on two levels: first, it is more secure; second, it automates complex, error-prone tasks such as manually updating YAML manifests.
By using the operator pattern, an agent acts on behalf of the cluster to listen to events relating to custom resource changes, so that they can be applied. The agent is responsible for synchronizing what’s in Git with what’s running in the cluster and provides a simple way for your team to achieve continuous deployment.
- Comparing Kubernetes Operator Pattern with Alternatives
- Introducing Operators: Putting Operational Knowledge into Software
- CI/CD for Kubernetes: what you need to know
Pull vs Push Pipeline
Most CI/CD tools available today use a push-based model. In a push-based pipeline, code starts with the CI system and may continue its path through a series of encoded scripts, or ‘kubectl’ is used by hand, to push any changes to the Kubernetes cluster.
The reason you don’t want to use your CI system as the deployment impetus, or to deploy manually on the command line, is the potential to expose credentials outside of your cluster. While it is possible to secure both your CI/CD scripts and the command line, you are working outside the trust domain of your cluster. This is generally not good practice and is why CI systems are known as attack vectors for production.
Typical push pipeline with read/write permission outside of the cluster:
In Weave Cloud images are pulled and credentials are kept inside the cluster:
Weave Cloud Pull Pipeline
Weave Cloud uses a pull strategy that consists of two key components: a “Deploy Automator” that watches the image registry and a “Deploy Synchronizer” that sits in the cluster to maintain its state.
At the centre of our pull pipeline pattern is a single source of truth for manifests (or a config repo). Developers push their updated code to the code base repository, where the change is picked up by the CI tool, which ultimately builds a Docker image. The Weave Cloud ‘Deploy Automator’ notices the image, pulls the new image from the repository and then updates its YAML in the config repo. The Deploy Synchronizer then detects that the cluster is out of date, pulls the changed manifests from the config repo and deploys the new image to the cluster.
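The flow described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Weave Cloud’s implementation: the registry, config repo and cluster are plain dictionaries mapping a workload name to an image tag, and the functions `automate` and `synchronize` are hypothetical stand-ins for the two components.

```python
def automate(registry: dict[str, str], config_repo: dict[str, str]) -> bool:
    """Automator role: copy newly published image tags into the config repo."""
    changed = False
    for name, tag in registry.items():
        if name in config_repo and config_repo[name] != tag:
            config_repo[name] = tag  # in practice this would be a Git commit
            changed = True
    return changed


def synchronize(config_repo: dict[str, str], cluster: dict[str, str]) -> list[str]:
    """Synchronizer role: runs inside the cluster and pulls manifests from the repo."""
    updated = []
    for name, tag in config_repo.items():
        if cluster.get(name) != tag:
            cluster[name] = tag  # in practice this would apply the manifest
            updated.append(name)
    return updated


registry = {"frontend": "1.0"}
config_repo = {"frontend": "1.0"}
cluster: dict[str, str] = {}

synchronize(config_repo, cluster)         # initial deployment from the repo
registry["frontend"] = "1.1"              # CI builds and publishes a new image
automate(registry, config_repo)           # the new tag is written to the repo
print(synchronize(config_repo, cluster))  # ['frontend']
print(cluster)                            # {'frontend': '1.1'}
```

The point of the sketch is the direction of data flow: nothing outside the cluster writes to it. The synchronizer only reads from the config repo, so cluster credentials never need to be handed to the CI system.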
Weave Cloud deployment agent installed to your cluster
With the Deployment Synchronizer inside of the cluster, your cluster credentials are not exposed outside of your production environment. Once the Weave Cloud agents are installed to your cluster and your Git repo is connected, any changes in your production environment are made via Git pull requests, with full rollbacks and convenient audit logs all provided by Git.
Observability as a deployment catalyst
With Kubernetes, GitOps can manage infrastructure and app deployments through pull requests. But how do GitOps workflows and observability work together?
By combining GitOps workflows with real-time observability, your development team can make crucial decisions before they deploy any new features. Because services that are about to be released can be observed in real time within the running cluster, you can deploy with confidence and deliver better quality features more quickly.
Observability can be seen as one of the principal drivers of the Continuous Delivery cycle for Kubernetes, since it describes the actual running state of the system at any given time. The running system is observed in order to understand and control it. New features and fixes are pushed to Git and trigger the deployment pipeline; when they are ready to be released, they can be observed in real time against the running cluster. At this point, the developer may return to the beginning of the pipeline based on this feedback, or deploy and release the image to the production cluster.
GitOps is a release-oriented model of both operations and features. How quickly you deliver new features to your customers depends in part on how fast your team can go round the stages in this cycle.
Developers that use GitOps workflows and observability together need to answer these questions:
- If a change is released automatically how do we know it really worked?
- How can we be sure that our changes are actually driving improvement?
- In a complex distributed system how do we understand issues, diagnose them and handle incidents?
With Weave Cloud, observability workload dashboards are integrated into the deployment and release process. At a glance you can see whether your deployment will be successful before you commit to releasing it to staging or production. This not only helps you identify problems faster: because the dashboards are real-time and built right into the deployment process, you can deploy your service multiple times per day with confidence that each deployment is free from major defects.
Benefits of GitOps
By adopting GitOps best practices developers use familiar tools like Git to manage updates and features to Kubernetes more rapidly. By continuously pushing feature updates, businesses are more agile, can respond more quickly to customer demands, and are more competitive in the marketplace.
With GitOps you have a complete end-to-end pipeline. Not only are your continuous integration and continuous deployment pipelines all driven by pull requests, but your operations tasks are also fully reproducible through Git.
If you are using Weave Cloud, deployments to your running cluster are also made securely without leaking sensitive credentials outside of the cluster.
Stronger security guarantees
Git’s strong correctness and security guarantees, backed by the strong cryptography used to track and manage changes, together with the ability to sign changes to prove authorship and origin, are key to a correct and secure definition of the desired state of the cluster. If a security breach does occur, the immutable and auditable source of truth can be used to recreate a new system independently of the compromised one, reducing downtime and allowing much better incident response.
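Part of this guarantee comes from the fact that Git identifies each commit by a hash that covers both its content and its parent, so history cannot be altered without the identifiers changing. The Python sketch below illustrates that chaining idea only; it uses SHA-256 over a simplified string format and does not reproduce Git’s actual object format. The helper names `commit_id` and `build_chain` are hypothetical.

```python
import hashlib


def commit_id(parent_id: str, content: str) -> str:
    """Derive an identifier from the parent's identifier and this commit's content."""
    return hashlib.sha256(f"{parent_id}\n{content}".encode("utf-8")).hexdigest()


def build_chain(contents: list[str]) -> list[str]:
    """Return the chained identifiers for a sequence of commits."""
    ids, parent = [], ""
    for content in contents:
        parent = commit_id(parent, content)
        ids.append(parent)
    return ids


original = build_chain(["replicas: 2", "replicas: 3", "image: app:1.1"])
tampered = build_chain(["replicas: 2", "replicas: 30", "image: app:1.1"])

print(original[0] == tampered[0])    # True: history before the edit is unchanged
print(original[-1] == tampered[-1])  # False: the edit changes every later identifier
```

Anyone holding the latest identifier from a trusted source can therefore detect a rewritten history, and signing that identifier ties the whole chain to an author.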
Separation of responsibility between packaging software and releasing it to a production environment also embodies the security principle of least privilege, reducing the impact of compromise and providing a smaller attack surface.
Easier compliance and auditing
Since changes are tracked and logged in a secure manner, compliance and auditing are made trivial. Comparison tools like kubediff, terradiff and ansiblediff also allow you to compare a trusted definition of the state of the cluster with the actual running cluster, ensuring that the tracked and auditable changes match reality.
Weaveworks is the creator of Weave Cloud, a SaaS that simplifies deployment, monitoring and management for containers and microservices. It extends and complements popular orchestrators, and gives developers and DevOps teams faster deployments, insightful monitoring, visualization and networking.
We use our own product to deploy and release new features to Weave Cloud. In addition to this, we are AWS & GCP technical partners; major contributors to the Kubernetes Open Source project; originators of the Kubernetes on AWS SIG; and also key members of the SIG Cluster Lifecycle.
For the past 3 years, Kubernetes has been powering Weave Cloud, our operations as a service offering, so we couldn’t be more excited to share our knowledge and help teams embrace the benefits of cloud native tooling and git-based workflows.