Culture, schmulture. DevOps, agile need to be software-first again
Decades of preaching about meatware complicated dev life
"The talks get a little repetitive, don't they?" she said as we were walking out of the elevator and through the lobby, escaping the latest two-day DevOpsDays nerd fest. Unable to resist the urge to mansplain, I meekly volunteered that most of the attendees are first-timers, so, you know, maybe it's new to them.
Upstairs someone had said they'd like to see more technical talks, and fewer, as they're called, "culture" talks. Of course, I hadn't attended any of the talks because, you know, a thought lord like myself goes to many of these and has seen "all the talks". Many years into DevOps, even I'm sick of all this culture stuff!
Everything was going well until the people showed up
This emphasis on "culture" is now well known to induce agenda and presentation nausea on the DevOps circuit. For example, the most fashionable architectural style of the moment starts with humans: one wants to do microservices to take advantage of how humans can't help but build systems that mimic how they organise themselves and, thus, communicate with one another. It's all people, the latest microservices deck-flipper will say.
And then there's handling failure: instead of (only) hardening systems so that they never fail, accept that they'll always fail, and rapidly learn from failure, even relishing and rewarding it. Failure is learning, comrade! This push to improve by failing brings about the "blameless postmortem", perhaps the most baffling concept for the sassy old-timers in the glasshouse.
In the tech industry, we're never really sure which is more important: the tool, or how people use the tool. There have always been at least two sets of humans involved, the builders and the users. The builders are the ones who create the software: developers, designers, operators, QA staff, product managers. And, of course, there are the people who actually use the software, the users, sometimes called "the customer", especially when it comes to consumer tech.
The Hyborian age of computing
Before DevOps, way before the recorded time of the web, The Mythical Man-Month by Fred Brooks emphasised the best way to organise developers, namely in something analogous to surgical teams – a sort of great man theory. Getting the right builders in place was key to great software. And, of course, much revived now, there was Conway's observation, drawn up into a "law" that (put slightly wrong) said software architecture will mirror the structure of the organisation that created it. Getting software to work well and do a job was something of a dancing bear for a long time: the quality of the bear's dancing was not the axis of judgement; the fact that the bear could dance at all was the point!
In response to this, you saw a horde of "usability" experts descend on the land. Here there were things like one-way mirrors and user-interaction testing festooned with cameras recording the user's every move. It was expensive, and slow. And in most cases, the results seemed trivial: this button's text should be bigger; no one understands this error message; the configuration wizard should probably have fewer than 30 panels.
Nonetheless, the cat was out of the bag. The technology was now good enough that we could pay attention to how well actual users – humans – can use this software to get things done.
Things get extreme
Around this time, in the 1990s, early notions of agile software development formed. Any history of agile is fraught with a parade of agilesplainers with talk of Bohemian spirals, roses, and wikis. That's fine, and delightful over some snifters, but let's simplify it. In 1999, Kent Beck's eXtreme Programming Explained described a method that integrated the builders and users together in a novel, just-crazy-enough-to-work way. It crystallised while working on Chrysler's HR system, so it certainly had "enterprise" chops: this wasn't some pizza-based method for creating new Space Quest episodes. It was for real jobby-jobs!
One of eXtreme Programming Explained's core insights is that we have no idea what our software should actually do, and especially how it should be implemented, until we start trying. Rather than imagining the requirements a priori, it's only through an ongoing conversation with the user that we'll discover the right features. To do this, you would slice down the release window to something like a week, incrementally co-innovating with the users, creating small pieces of functionality and asking them "whaddya make of that?" You'd conquer the unknown by shipping, and changing your approach as you learned more.
To do this, you had to do less each cycle, automate quality control with tests, and optimise the labour of the developers with pair programming. Even more bonkers, you'd pluck someone from "the business" – or even actual users! – to embed in the team to be the voice of reason and fight for the users, as they say.
These ideas ruffled the feathers of contemporary practitioners no end. They'd scoff and call agile people "cowboys" and other such derogatory grunts. "Agile" seemed bananas. Instead, people trusted their ability to predict what the software should do, confident that they could maximise requirements, fidelity, and quality far beyond what those absurd, short agile release loops could deliver.
Converting cowboys to suburbanites
Nonetheless, as failures continued to rack up with this "big upfront" approach, people kept returning to those tales of success from deep in the Wild West of agile. With a few revs and splattering on some enterprise seasoning, the precepts of agile slowly became what everyone was doing. At least, what people claimed they were doing: ongoing surveys of agile practices actually in use continue to show slow adoption, more than 20 years later. Everyone's agile in spirit!
Early on, in 2001, the agile manifesto codified a set of principles, all wonderful-sounding and terribly humane. For my money, the crowning achievement was the idea to value "responding to change over following a plan". In other words, as that hard-working, humble golden retriever put it: "I have no idea what I'm doing."
Among many competing agile thought-technologies, Scrum won out. There are many possible reasons why Scrum was so widely and commercially successful: perhaps because of its highly structured nature, perhaps its training and certification system, and maybe because it actually worked! Many organisations still eagerly tell me how many certified scrum masters they have as a metric of how improved they are.
Customers are people too
There's an oft-forgotten milepost at this point, a strange little book called The Cluetrain Manifesto from 1999. The cadre of authors posited that the web was rapidly breaking down any geographic barriers and asymmetric strategies that enterprises used to retain and cajole customers. Things like reviews on Amazon and using eBay to find anything you wanted across the world broke down long-cherished strategic controls companies relied on to maintain market share. It was a sort of pulling back of the wool, empowering customers to be smarter than the octopus global-nationals, as we called them back then.
Cluetrain concepts were much toyed with throughout the 2000s, with companies investing blood and treasure in capturing market share ahead of monetising.
Paying attention to what people were doing with your software and improving the software to keep hold of their eyeballs longer was a popular business, and it still is. There's an ever-growing pool of revenue in never-ending conversation markets. Last quarter alone, Facebook earned $9.3bn in revenue with $3.9bn in profit.
In the land of eyeballs, the profitless win
Getting to that kind of eye-popping profit required new thinking when it came to both builders and users. As companies like Google, Netflix, Facebook, and Amazon – along with the numerous others who lost to the buzzsaw of product/market fit – built out their businesses, often their only success metrics were user growth and retention. They had to create exceptional software.
To do this, these companies competed on features, on the exceptionalism of their software. They had to start releasing software every week, if not every day to compete. As one of the Agile Manifesto principles put it: "Deliver working software frequently."
Of course, having the software actually work most of the time was important, as Twitter showed early on, somehow surviving, perhaps as the world's first example of the "move fast and break things" boast. Faced with the need to release software on demand, often daily, the enterprise approach of doing monolithic, gut-wrenching releases wasn't cutting it. The developers had to start thinking about how their software was faring in production.
A common story from this era is the fateful day one of the programmers is selected to "run the servers". Shifting over to "ops", the programmer either goes mad, or starts doing what any competent programmer does when faced with a new problem: procrastinating and drinking. A few weeks later, they look at all that infrastructure as something to program, and start coding.
For me, a 2009 talk by Andrew Clay Shafer crystallised this thinking right around the time it was codified into DevOps. To a room full of agile lords and ladies, he proposed something wild and crazy: what if you were responsible for how your code ran in production? Perhaps you should start to understand, embrace, and improve that phase of your software's life.
This implied focusing on the people in the software development process and how they work together and behave. The people are just as much a part of the application as the software and the hardware.
The idea of a "blameless postmortem" is a good illustration: in innovation mode, things are going to break and go wrong as you charge into the unknown. Systems will go down catastrophically, but you can't simply give up, and punishing people just takes you back to the overly cautious state where software is released infrequently. So, as described by the Google SRE book, you instead celebrate failure, even telling the entire company the harrowing tales of what went wrong and, importantly, how you fixed it. Of course, once fixed, the key is understanding the problem well enough to put new policies, practices, and technology in place to prevent the problem from happening again.
As this type of navel-gazing continued, organisations once again discovered that most of the problems were caused by errors in the human systems they'd built, the meatware. Technology was an issue, to be sure, and there's a parallel story about how the evolution of what we now call "cloud" provided an ongoing arsenal for all this, with exciting distractions along the way with names like J2EE, rails, and WS-Deathstar.
People, though, were still the consistent problem. They just seemed to keep screwing up all this agile stuff, if they were actually doing it at all. Most still clung to the false comfort of big upfront planning and its illusory promise of hitting The Date.
You'd see the effects of this backsliding in instances like the US's rollout of healthcare.gov (saved by, ironically enough, a bunch of "cowboys" from out west). The private sector was, and is, no slouch at resisting agile either: they're just good at hiding it. The difference between them and the government is that enterprises can change more quickly when they're threatened. The "culture" at enterprises is more hopeful, perhaps, at least once backed into a corner.
Just as the goofy social companies of the 2000s had to compete on innovation, large enterprises now feel the pinch from the numerous ankle-biting disruptors that are having a good go at eating the incumbent's lunch.
You see this reflected in executive comments in numerous quarterly calls. Some of them toss up effortless word salads of "digital" and "omni-channel", but others have clearly considered their strategies and are applying a software-first approach to business. While they may not know exactly what to do, most executives know they need to start doing something. As JPMC's CEO said a few years back: "Silicon Valley is coming."
So that's where we are now: from Chrysler's HR system, to keeping Twitter up, streaming videos and sharing pictures of cats, to the very real need of old-school multinational, global enterprises to compete based on software.
Surveys show how shaken executives are, with many doubting that IT is up to the task of transforming to the point where it can reliably create, refine, and run software. They know from experience that outsourcing doesn't work, so they're looking at their people, organisations, and technology. There are early indicators that it's working – tales of insurance companies using this new software-defined business approach to cut the claims process from a week to less than a day and double the industry sales average – but there's a massive amount of work left. Hopefully, we won't backslide this time. ®
We'll be covering DevOps at our Continuous Lifecycle London 2018 event. Full details right here.