This post is a continuation of the series started here —
First, credit where credit is due. Open Threat Exchange would not have been possible without the hard work of many…
and references concepts explained here —
Deployment & Production Environment Management
The ability to incrementally evolve your production environment by allowing your developers to simply check in code…
“Do you know what repo that code is in?”
“I am just going to add it to this service, the build guys are all backed up and it will take too long to get a new service build created”
“I am just going to copy the old one and tweak it”
After we had gotten to about 10 or so microservices supporting the back-end analytical systems of OTX, we reached a point where the benefit was starting to wear thin. The effort of creating new services, managing the dependencies, and the inevitable ‘copy & tweak’ approach to build creation was creating substantial overhead in our process. Developers were reaching for their favorite service and shoving more functionality into it just to avoid the effort of creating a new one. Worse, as developers started working on each other’s code, there were entire repositories that were unknown to them, making it nearly impossible to dig in and fix each other’s defects. Then there was the dreaded ‘build update’: when we decided there was a better way to do things, we had to go through all the build configurations and update each one, every time forgetting at least one :).
These problems are probably familiar to anyone who has worked on a project leveraging microservices at more than a nominal scale. Looking around, there are a number of approaches people have taken to address this, mostly at very large organizations and mostly with either ‘universal repositories’ or very heavy processes. At the core, we had a couple of major problems we needed to solve in a way that worked for our team.
This is what motivated the creation of the ‘service-buddy’ utility — an attempt to make it simple for developers to do the right thing, in a way that made it easy for their peers to keep up.
Service-buddy fundamentally does the following:
- Provides a centralized manifest & description of every microservice in your application
- Allows the team to define templates for each service to minimize effort to get started with a new service. This includes both a starter codebase as well as a corresponding build pipeline.
- Provides a mechanism for developers to discover services and ensure the full scope of the code base is available to them at development time
The intention of this tool is to make the ‘easy path’ the ‘right path.’ Our team and our culture were not going to be easily amenable to a heavy process or management-enforced requirements for new service creation. To drive adoption and consistent use, this tool needed to solve real problems for developers. The two problems at the core of our scaling pain were the creation and maintenance of build pipelines and the discovery of code.
Service-buddy employs a concept of a ‘master’ repo — a single place where the manifest of all microservices exists which can then be used as the reference for bulk actions on the microservices or the entry point for the creation of new services.
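To make the idea of a manifest concrete, here is a minimal sketch in Python. The JSON layout, the field names, and the service names are all assumptions made up for illustration; they are not service-buddy's actual schema.

```python
import json

# Hypothetical manifest layout for illustration only.
# This is NOT service-buddy's real file format, and the service names are invented.
MANIFEST_JSON = """
{
  "services": {
    "otx-ingest": {"type": "batch", "description": "Processes submitted jobs"},
    "otx-api": {"type": "api", "description": "Serves REST requests"},
    "otx-web": {"type": "spa", "description": "Client-side user interface"}
  }
}
"""


def load_services(manifest_text):
    """Parse the manifest and return a sorted list of (name, type) pairs."""
    manifest = json.loads(manifest_text)
    return sorted(
        (name, spec["type"]) for name, spec in manifest["services"].items()
    )


def services_of_type(manifest_text, service_type):
    """Return the names of all services declared with the given type."""
    return [
        name
        for name, kind in load_services(manifest_text)
        if kind == service_type
    ]


if __name__ == "__main__":
    # Print every service the manifest knows about, one per line.
    for name, kind in load_services(MANIFEST_JSON):
        print(f"{name}: {kind}")
```

The point of the sketch is that a single checked-in file answers the question "what services exist?", and everything else (repo creation, cloning, bulk actions) can be driven from that answer.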
The tool can then be invoked against this master repo to perform the following actions:
- Reconcile the services defined in the manifest against the VCS and build system and create new code repositories and build pipelines for any newly created services. This also allows for the recreation or update of build pipelines for some of the supported build systems.
- Create clones of the code repositories for each service on the local machine
- Perform bulk git actions against the local clones of every repo for developer productivity (think git pull to get the latest code for all 50 microservices…)
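The reconcile and bulk-git actions above can be sketched roughly as follows. This is an illustration of the idea rather than service-buddy's implementation: the function names, the workspace layout (one directory per service), and the inputs are all hypothetical. The only real interface used is git's `-C <path>` option, which runs a git command as if it had been started in that directory.

```python
import subprocess
from pathlib import Path


def plan_reconcile(manifest_services, existing_repos):
    """Return the services named in the manifest that have no repository yet."""
    return sorted(set(manifest_services) - set(existing_repos))


def bulk_git_commands(workspace, services, git_args):
    """Build one git command per service, targeting its local clone via `git -C`.

    Assumes (hypothetically) that each service is cloned into
    <workspace>/<service-name>.
    """
    return [
        ["git", "-C", str(Path(workspace) / name), *git_args]
        for name in sorted(services)
    ]


def run_all(commands):
    """Run each command in turn and return the ones that exited non-zero."""
    failed = []
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd)
    return failed


if __name__ == "__main__":
    # Services declared in the manifest vs. repositories that already exist.
    print(plan_reconcile(["api", "web", "batch"], ["api"]))
    # Commands that would pull the latest code for every local clone.
    for cmd in bulk_git_commands("workspace", ["web", "api"], ["pull", "--ff-only"]):
        print(" ".join(cmd))
```

Keeping the planning step separate from the step that actually runs commands makes it easy to show a developer what is about to happen before touching 50 repositories.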
With the introduction of service-buddy we were able to reduce the effort of making a new service from a multi-day headache coordinating with multiple teams to a simple update to a JSON file and a code check-in to the master repository. The build for this master repository is performed by service-buddy, which creates the git repo with the appropriate code template and then creates the appropriate build pipeline. For us in particular, the larger benefit came from the consistency in our build process and the ability to bulk update the pipelines for every service by modifying the build pipeline template. For example, when faced with changing the artifact repository we used for containers, we could update the build pipeline template and then kick off the build for the master repository, which updated the builds for each microservice that used containers as its artifact. This was a major time-saver compared to hand-modifying the pipeline for each microservice individually.
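A rough sketch of why a templated pipeline makes this kind of bulk update cheap is shown here. The template text, its keys, and the registry hostnames are invented for illustration and do not correspond to any particular build system or to service-buddy's own templates; the rendering uses Python's standard `string.Template`.

```python
from string import Template

# Hypothetical pipeline template; the keys are not from any real build system.
PIPELINE_TEMPLATE = Template(
    "name: ${service}-build\n"
    "artifact: container\n"
    "push_to: ${registry}/${service}\n"
)


def render_pipelines(services, registry):
    """Render one pipeline definition per container-based service.

    `services` maps a service name to its artifact type; only services
    whose artifact is "container" get a container pipeline.
    """
    return {
        name: PIPELINE_TEMPLATE.substitute(service=name, registry=registry)
        for name, artifact in services.items()
        if artifact == "container"
    }


if __name__ == "__main__":
    services = {"api": "container", "batch": "container", "web": "spa-zip"}
    # Switching the registry is one argument change; every pipeline follows.
    for name, text in render_pipelines(services, "new-registry.example.com").items():
        print(f"--- {name} ---\n{text}")
```

Because every container pipeline is generated from the same template, moving to a different artifact repository is a single change followed by a re-render, rather than an edit to each pipeline by hand.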
When considering such a tool the first few concerns are always about flexibility. This is definitely a learning curve for the team but ultimately a good logical test for the adoption of microservices. A major part of the benefit of a microservice is standardization. A core foundation that works well and a small amount of divergent code in each service is what provides agility and scale. As we worked through this we coalesced around a few code templates and a few build pipelines.
Code templates:
- Batch service: a microservice intended to accept a job and execute it statelessly
- API: a microservice intended to respond to a REST request
- Single Page App (SPA): a microservice intended to serve as the client-side user interface
Build pipelines:
- Container: a Docker container
- SPA: a zip of an Angular app
Providing the building blocks in templated form gave us a great ‘object oriented’ foundation for our services. Updates to the templates and build pipelines were applied automatically to the downstream users, making iterative improvements to monitoring, deployment, and testing an easy reality.
Ultimately, there is quite a bit you can build with these fundamental building blocks. The templates listed above were the ‘vessels’ for our custom business logic, but on their own they were limited in the context of our ultimate goal. Our intention was to treat every aspect of our production infrastructure as a microservice, from the load balancer to the database. To this end we also introduced additional services that represented the core infrastructure: the network layout, the jump servers, the infrastructure used to run our container-based services, and the datastores we used.
This provided the foundation for our goal of a ‘read-only’ AWS UI: every aspect of our infrastructure was managed through a dedicated repository, and every update to it was performed through a build pipeline. If you wanted to change a setting on the ECS cluster, it required a check-in and a build documented in our build system, not to mention regression testing through our CI environment before the build system pushed to production. While full control over the infrastructure settings of our environment was a big win, ultimately the larger benefit was the flexibility this opened up for our developers. We now had a fully managed, self-service mechanism for adding new infrastructure to the production environment. Whether you wanted a new database, a new web API, or a new batch processor, you simply had to update a JSON file and check it in.
This is continued in a discussion on how to automate the management of cloud infrastructure at scale here —