Amazon’s Away Teams laid bare: How AWS's hivemind of engineers develop and maintain their internal tech
Cloud giant's structure, staff practices revealed
The properties of Amazon’s service-oriented collaboration
Amazon’s principles for collaboration are differentiating and avoid some of the problems that routinely come up in other large organizations:
Months of begging may be required to get access to another team’s source code.
- A feature that radiates power by generating revenue with customers or enabling control of key decisions will be kept by a team rather than passed to the natural team.
- Getting the attention of management to make decisions about refactoring roadmaps may take months. Often such attention does not lead to true collaboration.
- A team may delay providing help to another until incentives are adjusted.
- It is not uncommon for local optimization of the team roadmap to take precedence over work that might be transformational for the business.
With Amazon’s system of service-oriented collaboration, many of these problems would never happen, and some would happen much less. It would be naive to say that any organization the size of Amazon was free of politics, but these principles tend to encourage alignment as does the genuinely held value of customer obsession.
Here are some of the positive properties of Amazon’s system that flow from the principles:
- The Away Team model and generally easy access to source code means that investment can easily cross service boundaries to enhance the power of the entire system of services. Teams with a vision for making their own service more powerful by improving other services are free to execute.
- Management time needed to resolve collaboration issues and refactor roadmaps is dramatically reduced. In Amazon’s model, roadmaps need to be rebalanced at times, but those events are minimized because successful services fund the Away Teams.
- Team autonomy also reduces the need for management input. The policy is to do whatever it takes to provide value to the customer and not worry about duplication or deviating from standards. There is no waiting while the perfect shared service is being developed. There is no friction from having to use the perfect shared service.
- The lack of transfer pricing means that teams are generally focused on making better services, not on keeping track of funny money. Resource usage is tracked by the finance team who looks for spikes and requests explanations or optimizations.
- The emphasis on data reduces ideological passion. I may be right or you may be right, but we don’t need to fight it out. We will see in the end because the data is always right.
- The beneficial effects of cross team scrutiny are built in. Your code will be visible. Your decisions may be subject to bar raisers. The code an Away Team writes must be accepted by another team. If you are going to be sloppy, it will become public.
- Over time, public AWS versions of services tend to be preferred because they are often higher quality and performance than internal services because of the customer obsession of Amazon, and more usage accelerates the improvement process.
The downside of Amazon’s model is that the architecture may contain duplication or many versions of the same service. A common motto is “two is better than none,” meaning do what you need to do. Another balancing motto is “none is better than five,” so don’t get carried away and create a new service each time something doesn’t exactly fit.
The larger worry is that product cohesion and consistency may suffer. This writer hasn’t yet spent time digging up differences among API patterns across AWS and I won’t be able to do so against internal APIs, but we are told they are there.
Of course, the view we’ve presented represents the theory and not ugly defects that can crop up in practice. As recounted in this blog by someone who was frustrated by Amazon’s system and quit after five months, the Away Team model, the bar raisers, and the standard development environment all can go wrong and cause problems. Amazon is a high pressure, competitive environment, which can be hard on staff and managers as postings to sites like Glassdoor attest.
“Business moves fast. We must move faster” is a chestnut saying often attributed to Bezos. The service-oriented collaboration model is optimized for speed both in the short and long term, using Kurzweil’s exponential lever and avoiding Von Hippel’s sticky information. In the end, Amazon is happy to use service-oriented collaboration to ensure AWS and internal services grow rapidly, however awkward the shape. ®