When I first started at Sauce, one group within the Engineering organization was called *Dev, referred to in conversation as "StarDev." *Dev was so named because they were the wildcard development group - some of their work touched on this part of the infrastructure, or this part of the product, or this part of the Web interface, and so on. As a result of this interdisciplinary approach, much of the product and operational development fell under their purview, but it was difficult to tell exactly what it was they were responsible for. What was even more difficult was that when it came time to undertake actual projects, there was no clear delineation of what role *Dev needed to play in their development.
When you see the emergence of a group like *Dev, it's a clear sign that your engineering organization has reached an evolutionary stage common to startups. That's the stage where you go from a handful of people sitting in a room together hacking out code, to trying to formalize responsibility for parts of the product while also engaging other stakeholders within the company, like Product Management. The tendency in these moments is to think in terms of silos - this group is responsible for operations, this one for the front-end interface, this one for backend infrastructure, this one for testing, and so on. The problem with this approach is that it almost always results in organizational mutations like *Dev, because the nature of today's software and technologies doesn't recognize such neat distributions of responsibility. This is especially true of PaaS, IaaS, and other cloud-based offerings, which are themselves hybrid service/product offerings. I saw this phenomenon occur when I took the reins at Lynda.com. Although the team had grown to more than 30 engineers, they could only work on one project at a time. The whole team "swarmed" on a given feature, since nobody owned any one piece of the service. The first change that I made was to organize the team into smaller Scrum teams that each owned a feature from end to end. This simple reorganization cleared the logjam that had been created by the swarm mentality, and allowed teams to work on multiple projects at once.
When I first became aware of the existence of *Dev at Sauce Labs, I knew that the company was ready to take the next step in its evolution. In its interdisciplinary nature you could see that there was a clear recognition of how to approach the development of our offerings, but the organizational concept was too rooted in traditional, hierarchical managerial structures. What we needed was to get back to those small handfuls of people who could focus on specific projects, but with some method for guiding ourselves toward specific goals and making sure we were on track.
This is the point at which introducing an Agile Scrum approach to development is the critical catalyst for the transformation from Engineering to DevOps. DevOps, like *Dev, is essentially interdisciplinary, but needs the project-oriented approach to getting things done that Scrum provides to keep it from losing its organizational coherence. By thinking of our work in terms of Epics that describe ongoing areas of activity, which in turn are made up of smaller Stories that are, in effect, composed by the teams that work on them, we have a conceptual framework for our activity that embraces the interdisciplinary demands of DevOps. By then tackling these stories in two-week sprints, we are able to define the specific tasks that are required to accomplish our goals, and measure our progress against them. While this may not immediately achieve "continuous development" or even "continuous integration", it's only by breaking down the tendency toward silos and monolithic organizational structures that we can then start to think about the changes in tools, systems, and processes that would be necessary to achieve those states.
We introduced Scrum at Sauce in the first quarter of 2016, and since then a few things have happened.
Code quality improved dramatically. We reorganized into small Scrum teams, each responsible for one area of the service. In the past, engineers were responsible for multiple areas of the code and were always working on multiple projects, so they often lost focus and lacked ownership of the code. Scrum teams were given ownership of a single part of the codebase, and were able to focus on one thing at a time. It was also made clear that teams were absolutely responsible for the quality of their code. All of this combined to create a sense of "pride of ownership." And as we all know, people treat their own cars MUCH better than rental cars. The result here was the same - markedly better quality.
We discovered a lot of problems! At first, this was startling to the team. But I assured them that this was normal, and exactly what we wanted. These problems were not new; they had always been there, but had been swept under the rug. Since everyone was so frenetic and unfocused prior to Scrum, we hadn't noticed all these issues. The clarity that Scrum gave us about these issues forced us to face them, rather than ignore them. Most of what we found was technical debt. We put all of these items into our team backlogs and started to figure out how to tackle them. We didn't like what we saw, but at least we knew what the reality was.
We realized how thinly spread we were. Before Scrum, all the developers were incredibly busy, but we were not getting things done at the pace that we expected. When we broke out into Scrum teams, each with a distinct responsibility, we saw what the problem was. We were trying to do way too much given our capacity. Putting all desired new functionality, enhancements, bug fixes, infrastructure work, security improvements, and technical debt into backlogs showed us that demand was significantly higher than supply. This forced us to prioritize, and to pare the work down to the most important features. That in turn forced us to listen even more closely to our customers so that we knew what was most important to them. And finally, it helped us decide what level of capacity we wanted so we could make better hiring decisions.
The move to Scrum created a much more satisfied team, all around. The DevOps team was happy because they now had a clearer focus on their responsibilities. I made it clear that we had to set a sustainable pace, because this is not a sprint (no Scrum pun intended!), nor is it even a marathon; this is something that we should be able to do indefinitely. Burned-out engineers only make things worse. And finally, the product and sales teams were happier because they saw that things were getting done, albeit in smaller increments. Overall, everyone felt that we were getting unstuck, and bringing clarity to our work that removed the "wildcard" factor that had previously dominated.
We have a long way to go on this journey to DevOps, but I think we're off to a great start.
Joe Alfaro is VP of Engineering at Sauce Labs. This is the second post in a series dedicated to chronicling our journey to transform Sauce Labs from Engineering to DevOps. Start from the beginning and read the first post here, or read the next installment in the series.