Acquia Engineering is excited to be open-sourcing Statsgod, a reimplementation of StatsD we created internally to help scale our metrics collection effort.
Acquia developers often create tooling to build, deploy, and monitor applications we run on Amazon Web Services, and Statsgod is one such tool that we want to make publicly available. Statsgod was designed to be highly scalable and easily deployed.
Features Added
StatsD was originally created by Etsy, and we have used it in production for years. However, as our metrics systems grew, StatsD started to become a bottleneck. We needed to capture and aggregate metrics locally for high-performance services, such as computing Varnish stats in real time for tens of thousands of websites serving billions of pageviews monthly. Our solution was to connect Varnish to Statsgod over a local socket, which avoids both network latency and the bottleneck of a single StatsD instance.
We chose to write Statsgod in Go for its speed and its strong standard library of networking and concurrency primitives. Since Go compiles to native binaries, deployments became easy and self-contained.
For example, we used channels to keep client responses as fast as possible. When a client writes to the socket, we read the stream, put the data onto a channel, and close the socket immediately. The client is therefore not penalized for the time required for complex parsing and aggregation. Aggregation is handled later by a goroutine that pulls all of the metrics from the channel.
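The hand-off described above can be sketched roughly as follows. This is a minimal illustration, not Statsgod's actual code: the `Metric` type, the functions `handleConn`, `parseLine`, `aggregate`, and `run`, and the counter-only handling of `key:value|type` lines are all assumptions made for the example, and `net.Pipe` stands in for a real client socket.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// Metric is a hypothetical parsed metric; Statsgod's real types may differ.
type Metric struct {
	Key   string
	Value float64
}

// handleConn reads raw lines from the client, queues them unparsed,
// and closes the connection as soon as the stream ends.
func handleConn(conn net.Conn, raw chan<- string) {
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		raw <- scanner.Text()
	}
}

// parseLine parses a StatsD-style "key:value|type" line.
// For brevity, this sketch ignores the type and treats every metric as a counter.
func parseLine(line string) (Metric, error) {
	parts := strings.SplitN(line, ":", 2)
	if len(parts) != 2 {
		return Metric{}, fmt.Errorf("malformed metric %q", line)
	}
	fields := strings.SplitN(parts[1], "|", 2)
	if len(fields) != 2 {
		return Metric{}, fmt.Errorf("missing type in %q", line)
	}
	value, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return Metric{}, err
	}
	return Metric{Key: parts[0], Value: value}, nil
}

// aggregate drains the channel and sums values per key, skipping bad lines.
// It returns once the channel has been closed and emptied.
func aggregate(raw <-chan string) map[string]float64 {
	totals := make(map[string]float64)
	for line := range raw {
		m, err := parseLine(line)
		if err != nil {
			continue
		}
		totals[m.Key] += m.Value
	}
	return totals
}

// run wires a simulated client to the reader and the aggregating goroutine.
func run(lines []string) map[string]float64 {
	raw := make(chan string, 1024)
	done := make(chan map[string]float64)
	go func() { done <- aggregate(raw) }()

	client, server := net.Pipe()
	go func() {
		for _, l := range lines {
			fmt.Fprintln(client, l)
		}
		client.Close()
	}()

	handleConn(server, raw) // returns without parsing anything
	close(raw)
	return <-done
}

func main() {
	totals := run([]string{"hits:1|c", "hits:2|c", "not-a-metric", "misses:4|c"})
	fmt.Println(totals["hits"], totals["misses"])
}
```

The buffered channel is what lets the reader return quickly: parsing cost is paid on the aggregating goroutine rather than on the client's connection.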
We added a few other features as we went. Statsgod can listen on TCP, UDP, or Unix domain sockets, and it supports token-based authentication. We also invested heavily in testing: Statsgod has 100 percent test coverage, achieved through clean abstractions and mocking. The repo includes an end-to-end soak test that sends millions of metrics and verifies that they were all received and correctly parsed.
Dependency Management
When we started work on Statsgod, we wanted to pin Go package versions and define dependency groups per environment. We decided to use the Gom package manager to handle this. The ‘Gomfile’ syntax is very similar to that of Ruby's Bundler, which gives us a familiar workflow. It also lets us use separate development and production groups, thanks to a pull request we submitted back to the project.
Deployment
Because Go builds native binaries, deployment is much easier than with the Node.js-powered StatsD. Statsgod includes Debian packaging to make things very straightforward; it can also be built inside a Docker scratch container, which helps reduce the complexity of building a release.
Go Profiling
While developing Statsgod, we put the application through rigorous testing to find bottlenecks. Using Go's runtime/pprof package and the end-to-end tests, we found several inefficiencies. One of the worst offenders was the strings package. By converting to byte buffers, we reduced the workload by about 75 percent. Below is an example of the profiling graph. See the profiling documentation for more information.
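As a rough illustration of that workflow, the sketch below captures a CPU profile with `runtime/pprof` around a byte-slice parser. It is not taken from the Statsgod codebase: `splitMetric`, `profileCPU`, the `cpu.prof` output path, and the loop count are all hypothetical, chosen only to show how a `[]byte`-based parse avoids allocating intermediate strings.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"os"
	"runtime/pprof"
)

// splitMetric extracts the key and value from a "key:value|type" line.
// It returns sub-slices of the input, so no new strings are allocated.
func splitMetric(line []byte) (key, value []byte, ok bool) {
	colon := bytes.IndexByte(line, ':')
	if colon < 0 {
		return nil, nil, false
	}
	rest := line[colon+1:]
	pipe := bytes.IndexByte(rest, '|')
	if pipe < 0 {
		return nil, nil, false
	}
	return line[:colon], rest[:pipe], true
}

// profileCPU writes a CPU profile covering the execution of work to w.
func profileCPU(w io.Writer, work func()) error {
	if err := pprof.StartCPUProfile(w); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()
	work()
	return nil
}

func main() {
	f, err := os.Create("cpu.prof") // hypothetical output path
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	line := []byte("hits:12|c")
	err = profileCPU(f, func() {
		for i := 0; i < 5000000; i++ {
			splitMetric(line)
		}
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

The resulting file can then be inspected with `go tool pprof cpu.prof` to see where time is spent.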
Future of Go at Acquia
Statsgod was our first Go initiative within Acquia. Since then, our developers have embraced Go for a variety of projects. We find that Go is a good fit for applications that require concurrency and lower-level networking. We have also found Go to be very testable, providing a platform for writing safe, robust code. Over the last year, Go has quickly become a first-class language for Acquia engineering teams, joining PHP, Ruby, and Java as a preferred language for developers.
Statsgod contributors:
(BTW, Acquia is hiring for a number of positions, including Cloud Software Engineer.)