DX at the Department of Defense: Platform One and GitOps
The Department of Defense’s software development effort spans government staff and private sector contractors and involves around 100,000 developers. That’s a massive operation, and one that has to be run in an extremely secure way. And yet the DoD strives to run it in a modern way too. This post explores how the DoD aims to give its teams the best developer experience possible without weakening any security measure, through the DoD DevSecOps Initiative and Platform One.
There’s an initiative and there’s a platform. Nothing strange in that sentence. The self-service platform wants to make 85% of the developer experience (DX) common and interchangeable across all teams. OK. The initiative aims to make the whole development process, both the part provided by the platform and the custom capabilities built on top, fully secure. Mhhm OK. That, in itself, is a challenging goal, but it is also a pretty bland description of a common scenario across the software industry: improving developer velocity. However, if I told you that the initiative tries to secure the biggest weapon arsenal in the world, and that the platform is meant to serve the biggest software development company in the world, with over 100,000 developers, then things acquire a different perspective. The text above now reads dramatically differently, because the stakes are high. Quite high. Sky high maybe.
The Department of Defense’s software development effort spans government staff and private sector contractors and involves around 100,000 developers. Yes, that’s 1 and five 0s. That’s a massive operation, and one that, as you can imagine, has to be run in an extremely secure way. And yet the DoD strives to run it in a modern way too, the same way the best cloud native players out there deliver value through software at scale. The DoD wants to give its teams the best developer experience possible without weakening any security measure, and has decided to do so through the DoD DevSecOps Initiative and Platform One.
DevSecOps is a security-first approach to DevOps, which encourages greater collaboration and pace of operations without compromising on security. This initiative sits naturally with a federal agency like the DoD.
According to the DoD, Platform One is "a collection of approved, hardened Cloud Native Computing Foundation (CNCF)-compliant Kubernetes distributions, infrastructure as code playbooks, and hardened containers." In simple words, it is a collection of ready-made templates and packages that make resources available from multiple vendors and platforms such as AWS, Azure, Red Hat OpenShift, VMware PKS, Rancher, and more. Rather than each team creating its own stack from scratch and doing it differently, Platform One gives any of the 150 product teams in the DoD access to readily available cloud resources. Platform One cuts the time it takes to stand up an application stack from days or weeks to hours or minutes. It does this with the guarantees of DevSecOps built in.
Bringing private sector velocity to the secure USAF pipeline
There are a number of skills and capabilities that both the DevSecOps initiative and Platform One aim to foster among their users when building mission-critical apps for their warfighters. These include meeting requirements fast, awarding contracts in days, and testing new features and fixing bugs in the remotest environments. But chiefly, they want developers to feel empowered. They want to do things from the bottom up, rather than top down. They know that, as in combat, speed is critical to successfully delivering high-quality software.
When Platform One was designed to provide that experience, GitOps was the operational model that would clearly deliver such speed with precise guardrails and compliance. According to Nicolas Chaillan, Chief Software Officer, US Air Force, and Co-Lead, DoD Enterprise DevSecOps Initiative:
GitOps was and is the key for our success in building and rolling out Platform One across the entire DoD.
GitOps is a Git-centric approach to building and managing software systems. It requires that every part of the system be defined declaratively in Git repositories. A GitOps agent like Flux then reads this configuration and applies it to the system in production. The agent is also capable of alerting whenever production diverges from the declared state in Git. This is a powerful way to build and deliver an internal developer platform like Platform One as it is easy to replicate and fosters consistency in operations.
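The idea of comparing declared state against production can be sketched in a few lines of Python. This is only an illustration of the reconciliation concept, not how Flux is implemented: the `detect_drift` and `reconcile` functions, the dictionary shape, and the service names and image tags are all hypothetical, and a real agent would read manifests from a Git repository and talk to the Kubernetes API instead of using in-memory dictionaries.

```python
from typing import Any, Dict, List

# Hypothetical, simplified model: a system state maps a resource name to its spec.
State = Dict[str, Dict[str, Any]]


def detect_drift(declared: State, live: State) -> List[str]:
    """List the differences between the state declared in Git and the live state."""
    drift = []
    for name in sorted(declared.keys() | live.keys()):
        if name not in live:
            drift.append(f"{name}: missing from live system")
        elif name not in declared:
            drift.append(f"{name}: not declared in Git")
        elif declared[name] != live[name]:
            drift.append(f"{name}: live spec differs from Git")
    return drift


def reconcile(declared: State, live: State) -> State:
    """Return a new live state that matches the declared state exactly."""
    return {name: dict(spec) for name, spec in declared.items()}


# Illustrative data only: what Git declares vs. what is actually running.
declared = {
    "api": {"image": "api:1.4.2", "replicas": 3},
    "worker": {"image": "worker:2.0.0", "replicas": 2},
}
live_before = {
    "api": {"image": "api:1.4.1", "replicas": 3},
    "debug-shell": {"image": "busybox:latest", "replicas": 1},
}

report = detect_drift(declared, live_before)   # what an agent could alert on
live_after = reconcile(declared, live_before)  # what an agent would converge to

for line in report:
    print(line)
```

In this sketch Git is the only source of truth: anything running that is not declared (the hypothetical `debug-shell`) is reported as drift and disappears after reconciliation, which is the property that makes such a platform easy to replicate consistently.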
With the correct operational model, Platform One can be packaged and delivered to any new software factory, established or in the field, using the public cloud, on-premises instances or even air-gapped environments, and have it running mission-critical applications fast.
So why would GitOps be the perfect blueprint to build Platform One from?
Chaillan was kind enough to visit our podcast, The Art of Modern Ops, earlier this year to explain how Platform One actually works. In essence, making DevSecOps seamless for developers is a fantastic idea. A platform built around GitOps principles and cloud-native technologies is one that has security pipelines baked in. It also has enough separation of concerns between CI and CD to make securing deployments feasible and manageable. If developer velocity, in the shape of features and software delivery, can be abstracted from the complexities of managing extremely compliant environments, then a platform like Platform One has done its job of enabling government innovation.
Safety and security are non-negotiables but we also want developer self service to boost productivity and velocity. - Nicolas Chaillan, DoD
As one can imagine, the military has all sorts of environments, ranging from on-premises to air-gapped and low-compute-power settings. Platform One not only has to run there to deliver the capabilities built on it, it also has to be updated there. This is where GitOps helps: Platform One maintainers can define the whole system in a Git server, even in a confined environment, push the latest declared version of the system, and have the GitOps runtime update all the embedded systems running Platform One.
Weaveworks and the CNCF delivered a transformative open source tool with Flux; managing Kubernetes deployments at scale is now easy and fast. - Nicolas Chaillan, DoD
This is Big Bang. With it, a team can have a complete CI pipeline with a Git server to push code to, a build process, static and dynamic code analysis (SAST & DAST), and a place to host the built artifact so that Flux can run it in an already spun-up Kubernetes cluster.
How Platform One is making progress
To demonstrate return on investment and effectiveness, Platform One collects DevOps Research and Assessment (DORA) metrics and other data points, such as Deployment Frequency (DF), Mean Lead Time for changes (MLT), Mean Time To Recover (MTTR), and Change Failure Rate (CFR). Creating a baseline and tracking metrics are required for any value stream improvement process, and USAF software delivery is no exception.
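As a rough illustration of what the four metrics above measure, here is a minimal Python sketch that derives them from a list of deployment records. The `Deployment` record, its field names, the `dora_metrics` function and the sample data are assumptions made for this example; they do not describe how Platform One actually collects or stores its data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: List[Deployment], period_days: int) -> Dict[str, Optional[float]]:
    """Compute DF, MLT, MTTR and CFR over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    count = len(deployments)
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]

    def mean_hours(deltas: List[timedelta]) -> float:
        return sum(deltas, timedelta()).total_seconds() / 3600 / len(deltas)

    return {
        "deployment_frequency_per_day": count / period_days,
        "mean_lead_time_hours": mean_hours(lead_times),
        "mean_time_to_recover_hours": mean_hours(recoveries) if recoveries else None,
        "change_failure_rate": len(failures) / count,
    }


# Illustrative data only: four deployments over a two-day period, one of which failed.
sample = [
    Deployment(datetime(2021, 1, 1, 8), datetime(2021, 1, 1, 10)),
    Deployment(datetime(2021, 1, 1, 9), datetime(2021, 1, 1, 15),
               failed=True, restored_at=datetime(2021, 1, 1, 16)),
    Deployment(datetime(2021, 1, 2, 8), datetime(2021, 1, 2, 12)),
    Deployment(datetime(2021, 1, 2, 10), datetime(2021, 1, 2, 14)),
]

metrics = dora_metrics(sample, period_days=2)
print(metrics)
```

With this sample, the sketch reports two deployments per day, a mean lead time of four hours, a mean time to recover of one hour, and a change failure rate of 25%. A first run like this is what establishes the baseline that later measurements are compared against.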
Why are we measuring? What are our goals? According to Platform One engineers, the answer is to give engineers enough focus on what can be improved in their context: shrinking batch size and story size, improving throughput, tracking defect rates, and aiming at continuous delivery. In addition, aggregating these metrics collectively, without contextual input, allows the DoD to improve on the broader goals of the DevSecOps initiative and Platform One.
Just as Weave GitOps enables self-service platforms for developers that make it easy to implement the GitOps model in a Kubernetes cluster, Platform One is enabling the biggest software company in the world to improve itself and deliver mission-critical capabilities at the speed of light.