Seven years ago, Steve Souders and Jesse Robbins came to the realization that they both worked within “tribes” that, while ostensibly quite different, were talking about many of the same things. Front-end developers and engineers were figuring out how to make web pages faster and more reliable, and web operations folks were making deployments faster and more resilient.
And so goes the tale of how Velocity came to be: a conference that brought those tribes together and openly shared how to make the web faster and stronger. In those seven years, quite a lot has changed, and many ideas, terms, and technologies have come into being, some directly as a result of Velocity and others already in the works. DevOps, Chef, Puppet, Continuous Delivery, and HTTP Archive were the earlier forays. Soon to follow were AWS, Application Performance Monitoring (APM) products, many more monitoring tools, many more CDN vendors, WebPageTest, the explosion of the cloud, Chaos Monkey, mobile everything, Vagrant, Docker, and much, much more.
Out of the fire of Velocity came a New Way of doing things forged in a web-centric world. Along the way, something changed fundamentally about not just tech companies, but companies in general. As we looked around more recently, we realized it wasn’t just about the web and fast pages any more.
“Nordstrom went from optimizing for IT cost to optimizing for delivery speed.” — Courtney Kissler, VP of ecommerce and store technologies
IT is driving speed to market now — it is no longer merely a cost center. The companies that figured this out, notably companies like Netflix or Nordstrom, are thriving. Many more companies that didn’t have technology as their initial or primary focus are scrambling to hire Chief Digital Officers while CIOs sweat whether they’ll be relevant much longer. Open source software, new tooling, and the cloud have allowed many things to get out from under the CIO’s control. Velocity helped usher in the era of the Coded Business. (Hat tip to Chef for this particular turn of phrase.)
A Coded Business uses tight feedback loops to drive change quickly. Those using Agile practices or fans of cybernetics may already be familiar with the concept of feedback loops — it is certainly not a new idea. But the reality is that most organizations operate on slow, low-quality, or nonexistent feedback loops. Tight, intelligent feedback loops have certain core characteristics:
- The very nature of feedback is that it is instrumented — the output of any system or activity is measured and the results of those measurements are used to adjust behaviors.
- Tight feedback loops rely increasingly on automation. Automation once implied simply using computers to reduce headcount and cost; now it means freeing up people (and computing resources) to focus on more interesting, innovative things. As Tim O’Reilly recently said about Uber, “in the coming world of sensor-driven applications and devices, an increasing number of services will automatically infer sufficient context to take actions like payment on our behalf.”
- They exist in both technical systems and social systems. Or more precisely, those who employ this kind of feedback recognize that a successful organization cannot treat technical and social systems as separate.
Ultimately, what this means is that these kinds of feedback loops make it possible to create radically improved business services.
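To make the instrument, measure, adjust cycle concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a description of any particular company's system: the names `adjust_workers` and `run_loop`, the latency target, and the thresholds are hypothetical, and a hard-coded list of numbers stands in for real instrumentation.

```python
def adjust_workers(current_workers, measured_latency_ms, target_latency_ms,
                   min_workers=1, max_workers=20):
    """One pass through the loop: compare a measurement to the target, then adjust.

    The 10% and 50% thresholds are arbitrary values chosen for illustration.
    """
    if measured_latency_ms > target_latency_ms * 1.1:
        # Output is worse than the target: add capacity, up to a ceiling.
        return min(current_workers + 1, max_workers)
    if measured_latency_ms < target_latency_ms * 0.5:
        # Output is far better than the target: release capacity, down to a floor.
        return max(current_workers - 1, min_workers)
    # Close enough to the target: leave things alone.
    return current_workers


def run_loop(measurements, target_latency_ms=200, workers=2):
    """Feed each measurement back into the next adjustment and record the result."""
    history = []
    for latency_ms in measurements:
        workers = adjust_workers(workers, latency_ms, target_latency_ms)
        history.append(workers)
    return history


if __name__ == "__main__":
    # Hypothetical latency readings, in milliseconds.
    print(run_loop([500, 450, 210, 90, 80]))  # [3, 4, 4, 3, 2]
```

The point of the sketch is the shape of the loop, not the numbers: the measurement is taken automatically, and the adjustment follows from it without waiting for a meeting.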
As the stories of these kinds of companies are being more broadly told outside core tech media outlets, it is becoming clear that the way of doing things on the web that we’ve been discussing at Velocity has profoundly influenced how businesses (and other organizations) operate, online or otherwise. A fantastic example of this comes from a presentation by Mike Bracken, executive director of digital in the cabinet office for GOV.UK, on how government organizations need to move from an antiquated, broken policy-driven mindset to a service delivery model.
A service delivery model goes hand in hand with tight feedback loops. Smaller, composable, loosely coupled services are more easily instrumented, and can be adjusted more quickly. We need only look back on healthcare.gov to see the opposite principles in action.
The Velocity feedback loop
In response to what we’re seeing in the industry and feedback we’ve received, we’re changing the scope of the Velocity conference. For some time, we’ve been saying that organizations that view technology as an entity separate from other aspects of their core mission are at a significant disadvantage. Web-scale approaches have upended entire business models, as Uber has done to the cab market. There’s a vast number of organizations both small and large that need to understand this new reality in order to survive. And in some cases, the stakes are very high: government, social service providers, large enterprises responsible for critical utility and national infrastructure services; the list goes on. We need to reach a broader audience. Those who build the technical solutions need to better understand the business needs they are supporting, and vice versa. We want to carve out a space for both.
I want to emphasize how important this is for large enterprise organizations and government. Both are noticing what has happened in the web-first business arena, and they want that for themselves. They want shorter product cycles and the ability to react nearly immediately to changes in the market or public policy. Most importantly, they are realizing that their old methodologies and management styles don’t work any longer. They are realizing, like it or not, that they are fundamentally in the business of software (yes, even the government).
As such, we are reorganizing the conference around the themes we see as most important to helping modern, Coded Businesses thrive. Over the next few months I’ll be exploring each of these themes in much more detail here at Radar and through our weekly newsletter. For now, here is a sketch of what’s to come.
Optimization has been a core principle on the web for some time now, from Amazon’s early forays into A/B testing (which are now considered the norm) to Google’s intensive front-end efforts to wring every millisecond of performance out of web pages, to operational dashboards packed with metrics tracking upwards of hundreds of releases per day. These efforts are, fundamentally, tighter feedback loops. It is no coincidence that the Lean and Agile movements, which seek to tighten cultural and organizational feedback loops as well, were burgeoning at the same time.
Our notion of performance has evolved significantly over the past many years. It’s not just about reducing latency and round trips; we have to take a user-centered approach. Wringing every millisecond out of page load times leaves little room for design or marketing considerations. A user-centered approach means understanding what people want to see, and when, and balancing those needs against business requirements. More importantly, it requires that designers, developers, and product managers develop and use a common language for creating beautiful, compelling, performant products. In her just-released book, Designing for Performance, Lara Hogan explores these ideas and suggests some ways forward.
We are also looking back from the technical web front end and asking what are the performance and operational implications of our low-level network infrastructure. What happens when you hand off your packets to the network? There are many important questions here. How can you use new technologies like software-defined networks and network function virtualization to make your infrastructure more reliable? What roles do CDNs and other intermediaries play in delivering data to our users?
The notion of tuning technical implementations throughout the stack shouldn’t be novel at this point, but something unique came out of years of deeply technical discussions at Velocity. Call it DevOps, call it culture change — the labels, while in some cases distracting, are less important than the concept that technological change is impossible without cultural and organizational change. The cruel irony of more and more companies needing to embrace technology in order to stay competitive is that the specific tool or technology doesn’t matter. Ultimately, every software problem is fundamentally a people problem. Just as software architectures have been radically reinvented over the last decade of web-scale innovation, so too must organizational architectures be re-envisioned. Any organization — whether for profit or in the public sector — simply cannot move at the pace of current technological systems without adjusting their social systems.
But DevOps is only the first of many potential *Ops approaches that are breaking down silos within organizations — we’d like to see that mentality cut loose across even more silos. Businesses can’t adopt radical new technical architectures or infrastructures without changing their cultural structures as well.
IT drives business
Gone are the days of the “Department of No.” The age-old conflict between a development team that quickly pushes out features and products and an operations team that tries to rein in changes is in the rear-view mirror. Companies that re-envision traditional IT as a service provider are able to move more quickly, release more often, and adjust strategy as needed.
In a recent study by Forrester, only 17% of Fortune 1000 company IT leaders said they can execute on IT service delivery as fast as they’d like. Scattered around the Internet lie the fossils of Blockbuster, Circuit City, and quite possibly very soon, a huge number of cab companies, all unable to adjust quickly enough to the ideas and rapid progress of their competitors. Even more recently, as Apple unveiled its Apple Pay service (having released key aspects of it in smaller chunks over time), the Merchant Customer Exchange group, which represents a huge number of retailers, is by all accounts having issues releasing a purportedly competing product. We can’t know the full details of why, but it sounds like an inability to deliver quickly enough. I suspect the cause is outdated software development approaches, applied to a product that seems designed more to help retailers avoid credit card fees than to make purchasing easier for consumers.
Needing to out-maneuver competitors is nothing new to businesses, but the pace at which this is currently happening is. This pace has put pressure on organizations, many of which have responded by putting more power in the hands of developers themselves and breaking down organizational walls that in the past served only to slow everything down. O’Reilly’s Andrew Odewahn documented his thoughts and experiences with this in the Field Guide to the Distributed Developer’s Stack. The key ideas are that the cloud is the default platform, deployment of code is automated, the infrastructure is built and maintained in code, and monitoring is critical.
Lastly, there are two areas we’re watching that are undergoing similar transformations to the DevOps movement — networking and security. Networking appears to be one of the last untouched bastions when it comes to infrastructure as code, and virtualization has already come knocking on its door. Network administrators that have to deal with private cloud implementations now have new virtual topologies they can’t ignore. As for security, SecDevOps (or is it DevSecOps?) is already emerging as yet another abbreviated portmanteau, and for good reason. As it used to be with web operations before DevOps, security has typically functioned as a group that is (rightfully so or not) perceived as slowing everything down right before it should be deployed. As Dan Kaminsky said in his Velocity keynote in 2013: “Everyone else has adapted. Why not security?”
In less than a decade, we have moved from only being able to reach people via a small handful of browsers on personal computers to people having miniaturized personal computers in their phones, cars, houses, watches, and clothing. We’ve already passed the point where mobile device usage overtook desktop usage; the Internet is nearly everywhere and in increasingly fragmented modes of delivery. Users don’t care if something is on the web, or is web-like, or is native, so much as they care that it is blazing fast and available wherever and whenever they like. Take a website like the Guardian: this news outlet reaches more than 110 million users, on more than 7,000 devices. Achieving that feat goes well beyond merely implementing responsive design (which is no panacea, it turns out, especially when it comes to speed).
One more recent development in attempting to deal with the Internet of Everything is edge computing — moving compute resources, especially those dealing with data, closer to the end user. In some ways this is an evolution of both the cloud and traditional CDNs, but it is still a bit hard to separate the hype from the fog. Regardless, I’m particularly interested in what the infrastructure will look like as data center and networking demands continue to increase.
Web and app developers have generally dealt with unreliable or unavailable network access by carving out offline functionality that still works in the absence of a connection. It’s clearly an important part of delivering as ubiquitously as possible, but we’ve been watching developments in peer-to-peer (e.g., WebRTC) and wireless mesh networking (WMN) with even greater interest. WMN in particular isn’t anything new; it’s been around for some time, especially in military applications. The advantages for mobile developers are clear, but not easily achieved: WMN is an end-run around centralized ISPs, but it carries plenty of engineering and operational challenges. The initial batch of consumer apps seems to be (sadly) focused on chat, but applications ranging from disaster recovery to increasing highway safety are things we’re keeping an eye on.
Deliberately unstable systems
A recent Gartner report predicted that “By 2017, 70 percent of successful digital business models will rely on deliberately unstable processes designed to shift as customer needs shift.” The emphasis there is mine. This is a prediction that wouldn’t happen were it not for the ideas that were shared at Velocity over the past many years. In 2013, Jeremy Edberg discussed the role of the “Simian Army” at Netflix — a suite of tools, including Chaos Monkey, designed to specifically mimic outages in Amazon AWS ranging from a particular instance all the way up to an entire AWS region. The message was clear: expect failure, and design, build, and test for it.
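As a rough illustration of "expect failure, and design, build, and test for it," the sketch below injects a random instance failure into a toy, in-memory service and checks that requests are still served. This is not Netflix's Chaos Monkey and touches no real AWS API; the `Service` class, the `chaos_round` function, and the instance names are all hypothetical stand-ins invented for this example.

```python
import random


class Service:
    """A toy service backed by a set of named instances (all hypothetical)."""

    def __init__(self, instances):
        self.healthy = set(instances)

    def kill(self, instance):
        # Simulate an outage of a single instance.
        self.healthy.discard(instance)

    def handle_request(self):
        # Route to any healthy instance; fail loudly if none are left.
        if not self.healthy:
            raise RuntimeError("no healthy instances")
        return sorted(self.healthy)[0]


def chaos_round(service, rng):
    """Pick one healthy instance at random, take it down, and report which one."""
    victim = rng.choice(sorted(service.healthy))
    service.kill(victim)
    return victim


if __name__ == "__main__":
    service = Service(["web-1", "web-2", "web-3"])
    rng = random.Random()  # pass a seed here for a repeatable run
    victim = chaos_round(service, rng)
    print("killed", victim, "and the request was served by", service.handle_request())
```

In a real system the interesting part is what happens after the failure is injected: whether traffic reroutes, whether alerts fire, and whether anyone notices. The toy version only shows that the failure is deliberate and the check is automated.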
That’s all fine and good, but especially if you don’t have the money and engineering power behind Netflix, how do you go about anticipating failure and planning for it? Of course, the answer is: it’s hard. Many of the ideas and actions related to complex, unstable systems are inherently counterintuitive. (If right now you are thinking that your code, software, or system isn’t that complex, ask yourself: do you include any third-party services? If yes, how much do you know about them?) And as our industry moves to more distributed models, like microservices, the complexity will only increase.
The first step is to embrace monitoring. In a complex system, you can’t understand all the moving parts (hell, you don’t even know what all the parts are), but you can get information on them. The monitoring marketplace itself reflects how complex and distributed things have become. There has been an explosion of monitoring tools in the last two to three years. Gone are the monolithic monitoring stacks that were a nightmare to use (in particular with distributed systems); they have been replaced by a dizzying array of both proprietary and open source options, what Jason Dixon has dubbed “composable monitoring.” As I said earlier, which tool(s) you use doesn’t matter; just pick some and get started. If you don’t, you’ll be in the shoes that Mikey Dickerson wore when he showed up to help revive healthcare.gov and discovered that “There was no place to look to see if the site was up or not, except CNN.”
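Getting started can be very small. The following Python sketch shows one possible shape for a "place to look": a set of named probes, each of which either succeeds or raises, rolled up into a single answer. It is an assumption-laden toy, not a recommendation of any tool; the names `check`, `run_checks`, and `site_is_up`, and the example probes, are hypothetical, and real probes would call your own services.

```python
import time


def check(name, probe, clock=time.monotonic):
    """Run one probe (any callable that raises on failure) and record the outcome."""
    start = clock()
    try:
        probe()
        ok, error = True, None
    except Exception as exc:  # a failing probe should never crash the monitor
        ok, error = False, str(exc)
    return {"name": name, "ok": ok, "latency_s": clock() - start, "error": error}


def run_checks(probes):
    """Run every probe in a {name: callable} mapping and collect the results."""
    return [check(name, probe) for name, probe in probes.items()]


def site_is_up(results):
    """The single answer someone should be able to look up: is everything passing?"""
    return all(result["ok"] for result in results)


if __name__ == "__main__":
    def failing_probe():
        raise ConnectionError("timed out")

    # Hypothetical probes standing in for real calls to your services.
    results = run_checks({"homepage": lambda: None, "login": failing_probe})
    for result in results:
        print(result["name"], "ok" if result["ok"] else "FAILED: " + result["error"])
    print("site up:", site_is_up(results))
```

Because each probe is just a callable, checks can be added or swapped one at a time, which is roughly the spirit of composing monitoring from small pieces rather than adopting one monolithic stack.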
Next, you have to plan for what you’ll do when things do go wrong. Your systems will fail, people will make mistakes. How you respond when the inevitable happens can make the difference between a small amount of cash and millions of dollars lost; between a chance to make your brand shine and a full-scale PR disaster.
Ultimately, dealing with these kinds of systems means we need to get better at thinking about systems. It’s altogether too easy to focus on the individual pieces over which we (think we) have control. The very notion of control emphasizes a metaphor of humans commanding machines, when we in reality co-operate with them on a daily basis instead. Yes, we’ve started treating our servers like cattle instead of pets, but with that comes the risk of not knowing enough about what’s happening as the farm gets much, much bigger.
The landscape for the Coded organizations of the future is complex and constantly shifting, and there is no straight path through. The route is to instrument, optimize, learn, and repeat. I’ll be digging into many of these topics in more detail over the coming months, but I’m curious to hear how much of this rings true — is your organization doing any of these things, or possibly all? What are the obstacles to putting such concepts and technologies in practice in your daily work? Leave a comment, email me at firstname.lastname@example.org, or find me on Twitter.