Back in September 2016 I gave a talk at the Microservices SF meetup entitled “Microservices: Lessons Learned”. This post is a brief write-up of the talk, including the video and slides. It’s much more opinionated than previous posts – whilst all these opinions are my own, I can’t take credit for any originality…
The talk started with a discussion about what we were trying to achieve: significantly higher developer productivity. I want to do with 20 people what it used to take 200 people to do.
Optimise for MTTR
This one shouldn’t be very controversial. Optimising for Mean Time To Recovery (as opposed to Mean Time Between Failures) is really the admission that most problems we’ve had were operator error, where a highly available system potentially wouldn’t have helped. This is not to say you shouldn’t do HA, but rather that you prioritise MTTR first, to minimise downtime in the event of operator error.
So what does this mean in practice? It means building systems in such a way that you are more-or-less guaranteed to be able to rebuild them, from source, in a short amount of time, under pressure, when things go sideways.
More concretely, it means everything must be version controlled. Not just code, but all the config, all the scripts for bringing up infrastructure etc. It means always being able to deploy what’s on HEAD. And more than that, you need constantly running automated tests and alerts to ensure that the config in version control matches what is running (see Kubediff, Ansiblediff, Terradiff etc).
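To make the "config in version control matches what is running" check a little more concrete, here is a minimal sketch in Python. It is not how Kubediff, Ansiblediff or Terradiff are implemented; the function names (`flatten`, `config_drift`) and the example config are hypothetical, and it assumes both configs have already been loaded into plain dictionaries. Only fields present in the version-controlled config are compared, on the assumption that the platform may add defaults of its own to the running copy.

```python
from __future__ import annotations

from typing import Any

MISSING = "<missing>"  # placeholder reported when a field is absent from the running config


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted paths, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def config_drift(expected: dict[str, Any], running: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {path: (expected_value, running_value)} for each field that differs.

    Only fields present in `expected` (the version-controlled config) are checked,
    so extra fields in `running` are not reported as drift.
    """
    want = flatten(expected)
    have = flatten(running)
    return {
        path: (value, have.get(path, MISSING))
        for path, value in want.items()
        if have.get(path, MISSING) != value
    }


if __name__ == "__main__":
    # Hypothetical example: someone scaled the service by hand.
    in_git = {"spec": {"replicas": 3, "image": "app:v2"}}
    in_cluster = {"spec": {"replicas": 5, "image": "app:v2", "dnsPolicy": "ClusterFirst"}}
    for path, (want, have) in config_drift(in_git, in_cluster).items():
        print(f"DRIFT {path}: expected {want!r}, running {have!r}")
```

Run on a schedule, a non-empty result from a check like this is what you would alert on.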
Emergent Microservices Architecture
This opinion is potentially more controversial. The idea of emergent architecture isn’t new, but I found relatively little written about it. As I understand it, this is the idea that you shouldn’t set out to build a “master vision” for your software architecture on day zero, but rather solve the problems right in front of you and iterate until the architecture “emerges”. It’s the realisation that you’ll never know the problems or requirements ahead of time, and that instead you’ll learn a lot more by making mistakes and fixing them as you go. In a lot of ways it is just a restatement of some of the Agile principles.
In my opinion this idea fits with a microservices-oriented architecture nicely, as the concept of microservices encourages you to build small, isolated domains of responsibility (services), behind well-defined APIs, which can easily be modified or replaced. If you keep the emergent architecture philosophy in mind, it should be encouraging you to build composable software components. I also like the idea of building software that can be easily replaced, instead of extended.
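As a rough illustration of "replaceable behind a well-defined API", the Python sketch below defines a small interface and two interchangeable implementations. Everything here is hypothetical (`ReportStore`, `InMemoryStore`, `HistoryStore`, `handle_query`) and is not taken from the Weave Cloud codebase; the point is only that the caller depends on the interface, so an implementation can be swapped out rather than extended.

```python
from __future__ import annotations

from typing import Optional, Protocol


class ReportStore(Protocol):
    """The API boundary: callers only ever see these two methods."""

    def put(self, tenant: str, report: dict) -> None: ...

    def latest(self, tenant: str) -> Optional[dict]: ...


class InMemoryStore:
    """First attempt: keep only the most recent report per tenant."""

    def __init__(self) -> None:
        self._latest: dict[str, dict] = {}

    def put(self, tenant: str, report: dict) -> None:
        self._latest[tenant] = report

    def latest(self, tenant: str) -> Optional[dict]:
        return self._latest.get(tenant)


class HistoryStore:
    """Later replacement: keep every report, still satisfying the same API."""

    def __init__(self) -> None:
        self._history: dict[str, list[dict]] = {}

    def put(self, tenant: str, report: dict) -> None:
        self._history.setdefault(tenant, []).append(report)

    def latest(self, tenant: str) -> Optional[dict]:
        reports = self._history.get(tenant)
        return reports[-1] if reports else None


def handle_query(store: ReportStore, tenant: str) -> int:
    """Caller code: unchanged whichever store implementation is plugged in."""
    report = store.latest(tenant)
    return len(report["nodes"]) if report else 0
```

Replacing `InMemoryStore` with `HistoryStore` requires no change to `handle_query`, which is the property the paragraph above is arguing for.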
The talk walks through the evolution of the Weave Cloud (née Scope service) architecture, from a set of Scope pets, through the multitenant architecture to the multi-service architecture we now have – and the emergence of the common PaaS-like layer.
- Wikipedia page on “Emergent Design”
- This is heavily related to the “Worse-is-better” approach – this post by Adrian Colyer gives a great overview.
- This HBR piece isn’t too bad either
The Microservice Trap
One slide I am particularly fond of, though I guess it didn’t really fit in the narrative very well, tried to illustrate the downsides of microservices and in particular the problems with the multitenant Scope architecture. I used this as an example of “over-disaggregating” services:
With this diagram I’m trying to show that while OSS Scope can satisfy all queries using in-memory data, without crossing any process boundaries, the multitenant Scope has to cross multiple process boundaries to satisfy a query, and marshal/unmarshal these hilariously large reports at every stage. Still, it keeps your DC warm.
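The cost being described can be sketched in a few lines of Python. This is a toy model under stated assumptions: JSON stands in for whatever wire format the real services use, the number of hops is made up, and the function names (`query_in_process`, `query_across_services`) are hypothetical. It only shows that the same query does no serialisation work in one process, and serialises the whole report once per hop when split across services.

```python
from __future__ import annotations

import json


def query_in_process(report: dict) -> int:
    """Single-process case: the query reads the report straight from memory."""
    return len(report["nodes"])


def query_across_services(report: dict, hops: int) -> tuple[int, int]:
    """Multi-service case: the full report is marshalled and unmarshalled at every hop.

    Returns (answer, total bytes serialised along the way).
    """
    bytes_moved = 0
    for _ in range(hops):
        wire = json.dumps(report).encode("utf-8")  # marshal on the sending side
        bytes_moved += len(wire)
        report = json.loads(wire)  # unmarshal on the receiving side
    return len(report["nodes"]), bytes_moved


if __name__ == "__main__":
    # Hypothetical report with 10,000 nodes, and an assumed three process boundaries.
    report = {"nodes": {f"node-{i}": {"id": i, "adjacency": []} for i in range(10_000)}}
    answer, moved = query_across_services(report, hops=3)
    print(f"in-process answer: {query_in_process(report)}")
    print(f"cross-service answer: {answer}, bytes serialised: {moved}")
```

Both paths return the same answer; the difference is the serialisation work, which grows with both the report size and the number of process boundaries crossed.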
DevOps and Continuous Delivery
So much has been written about these two subjects that I don’t feel like I have anything extra to add. Watch the video if you want to know what I said…