Process Is Not a Four-Letter Word

Standardization done right can save your sanity and improve your culture

Capital-P “Process” ™ is something many software developers, operations engineers, system administrators, and even managers love to hate.

It is often considered a productivity-killing, innovation-stifling beast whose only useful domain is within the walls of some huge, hulking enterprise or sitting in a wiki nobody ever reads.

I have always found the distaste for process fascinating, and even more so now that configuration management and version control have become such core tenets of the DevOps movement. The main purpose of those tools is to give software development and operations a structure that increases the reproducibility, reliability, and standardization of those activities.

Standardization is one of the four “-ations” discussed in Is Your Team Instrument Rated?, a talk I gave at DevOpsDays exploring the lessons aviation has learned that are useful in a software development and operations context. In it, I looked at how the National Airspace System structures and describes some of the processes it uses to ensure that flying Delta across the country is pretty much like flying United (from a safety and operational perspective, anyway).

Taking the time to reexamine your organization’s approach to process and standardization is not just a tool for realizing the benefits big airlines enjoy; it’s a way to evaluate your engineering department’s culture, while recording and communicating what is considered important for engineers to think about and optimize for as they go about their work.

As such, it’s a useful exercise for beginning (or continuing) the transformation toward that ever-elusive DevOps culture.

Don’t Start With Procedures

When documenting an approach procedure into an airport, the Federal Aviation Administration (FAA) doesn’t tackle each one from first principles.

They have a document (lovingly referred to as the “TERPS”) that contains what I call operational primitives. The TERPS is a long, complex list of requirements that each approach procedure must fulfill. When a procedure does not meet the requirements, the approach description must explicitly note, for instance, that two navigation receivers are required or that it is only approved within a certain temperature range.

Distinguishing between your own operational primitives and your operational processes is a useful technique for beginning to record the engineering values that are important to your organization.

Take the common deployment problem of disparate artifact handling: one team uses RPMs, another tarballs, and yet another team zip files of .jars. Many of us have dealt with this exact anti-pattern and know the problems and complexities it causes. A possible operational primitive to tackle the issue might be along the lines of “Artifacts will be published in–and deployments will only manage–a single package format.” Note that this is different from “We’ll all use RPMs and a Yum repository.”

The distinction may seem semantic, but it’s not: it prompts a different type of thinking. When engineers are asked to solve a problem, we often scope the solution locally, making optimizations through a localized perspective. There’s nothing inherently wrong with that (in fact, it’s efficient), but locally optimized solutions can, over time, introduce system-wide inconsistencies and inefficiencies that can be quite costly.

Having a set of primitives helps prompt us, as engineers, to design our components with some uniformity, and helps us think about the various cases, at a systemic level, that we should address or account for. In the example above, it might make us think about the inputs and outputs of each “station” of the deployment pipeline, and may even raise exceptions, such as when a certain (commercial, perhaps) component can’t use the standard packaging format, due to licensing reasons.

Of course, just as the TERPS doesn’t answer how to get to a specific runway on a classic foggy San Francisco morning, defining these primitives doesn’t absolve your teams of sorting out the specifics of their own processes. But it can serve as a template for writing new processes, a guide on how to structure the concerns and specific steps of a process, and a good list of items to validate after it’s written, a “unit test” of sorts.
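To make the “unit test” idea concrete, here is a minimal Python sketch of the single-package-format primitive from the example above, including the licensing exception. Everything in it is an assumption made for illustration: the suffix table, the function names, and the idea of recording exemptions as a simple name-to-reason mapping are hypothetical, not a description of any particular tool.

```python
# Hypothetical suffix-to-format table; extend it to match your own pipeline.
KNOWN_FORMATS = {
    ".rpm": "rpm",
    ".deb": "deb",
    ".zip": "zip",
    ".jar": "jar",
    ".tar.gz": "tarball",
    ".tgz": "tarball",
}


def package_format(artifact_name):
    """Classify an artifact by file suffix; anything unrecognized is 'unknown'."""
    lowered = artifact_name.lower()
    for suffix, fmt in KNOWN_FORMATS.items():
        if lowered.endswith(suffix):
            return fmt
    return "unknown"


def formats_in_use(artifact_names, exemptions=None):
    """Return the set of formats used by artifacts with no documented exemption.

    `exemptions` (hypothetical) maps an artifact name to the recorded reason
    it deviates from the primitive.
    """
    exemptions = exemptions or {}
    return {package_format(name) for name in artifact_names if name not in exemptions}


def satisfies_single_format(artifact_names, exemptions=None):
    """The primitive holds when non-exempt artifacts share at most one format."""
    return len(formats_in_use(artifact_names, exemptions)) <= 1


if __name__ == "__main__":
    artifacts = ["api-1.2.rpm", "web-3.0.rpm", "vendor-agent.zip"]
    exemptions = {"vendor-agent.zip": "commercial license forbids repackaging"}
    print(satisfies_single_format(artifacts))              # False
    print(satisfies_single_format(artifacts, exemptions))  # True
```

Note that the check never names RPM: it asserts the primitive (one format), not the local decision about which format a team picked.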

In addition, not having each engineer create operational processes from first principles means those processes take less time to develop and will generally be safer, because they implicitly include institutional knowledge the organization has gained. And, of course, the forum in which you maintain and discuss the primitives is a great place for teams to share knowledge and collaborate, even when they’re in different buildings or countries.

Finally, establishing operational primitives allows engineers to express deviations from those base requirements: instead of having to use git blame six months later and curse the engineer listed (which, in my experience, tends to be me), this prompts engineers to document the reasons they deviated from the established operational primitives and norms when they make that decision. This makes it an effective tool for communicating intentions and the development context to future developers and ops engineers working with that code.

Planes, Trains, and Commits!

All of this may sound great in the context of aviation, but what areas of software development does it apply to? Some good areas to start looking at include:

  • Code line management: do you have a standard process for creating feature and release branches? Is there an expected naming convention? Are merge practices defined? Does your central repository validate that pushes/commits follow these guidelines?

In addition to helping keep your repositories clean and easy to understand and navigate, standards around code-line management allow other teams to easily find code their colleagues are working on: the QA and Ops staff will know where to look when collaborating with developers. They also allow you to write automation that can actually make (valid!) assumptions about where code lives.

Here, the primitive upon which your processes would be based is an acknowledgment that your repository is a sandbox everyone plays in, and if the goal is to work together to build a sand castle, there need to be some rules about how to accomplish that collaboratively! (Also: no throwing sand!)
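As one way to picture repository-side validation of a naming convention, here is a small Python sketch. The convention itself (feature branches named after a ticket, release branches named after a version) and the function names are assumptions invented for this example; your own rules will differ.

```python
import re

# Hypothetical convention, for illustration only:
#   feature/<TICKET>-<number>-<short-slug>   e.g. feature/OPS-123-faster-deploys
#   release/<major>.<minor>                  e.g. release/2.4
BRANCH_PATTERNS = [
    re.compile(r"feature/[A-Z]+-\d+-[a-z0-9-]+"),
    re.compile(r"release/\d+\.\d+"),
]


def is_valid_branch(name):
    """True when the whole branch name matches one of the agreed patterns."""
    return any(pattern.fullmatch(name) for pattern in BRANCH_PATTERNS)


def invalid_branches(names):
    """Return the branch names that break the convention, in input order."""
    return [name for name in names if not is_valid_branch(name)]


if __name__ == "__main__":
    pushed = ["feature/OPS-123-faster-deploys", "release/2.4", "my-cool-branch"]
    print(invalid_branches(pushed))  # ['my-cool-branch']
```

A check like this could be called from a server-side hook (Git's pre-receive hook, for example) so that the rules are enforced in one place rather than remembered by each engineer.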

  • Deployment requirements: can you describe what is required for any artifact to go into an environment, such as staging or production? (Is that list different? Should it be different?)

Such questions may be more familiar under the name “operational requirements,” and they are typical pain points: the knowledge is often scattered among developers and operations engineers, with no good way to communicate, collate, or record it. A standardization exercise is a great place to tackle this problem.

Items ripe for validation might include ensuring module or database schema versions (you’re versioning that stuff, right?) match expectations, or certain keys/configuration values are off (or on!) in specific environments.

Once these requirements are documented, you can write unit tests that validate your assumptions about your deployment process and, more importantly, raise a flag when those assumptions unexpectedly change.
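A rough Python sketch of what such a test might look like follows. The requirement names, the values, and the `load_deployed_config` helper are all hypothetical; in practice that helper would query whatever system actually holds your deployed configuration.

```python
import unittest

# Hypothetical documented requirements per environment.
REQUIREMENTS = {
    "staging": {"schema_version": 42, "debug_logging": True},
    "production": {"schema_version": 42, "debug_logging": False},
}


def violations(environment, actual):
    """Map each unmet requirement to an (expected, actual) pair."""
    expected = REQUIREMENTS[environment]
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


def load_deployed_config(environment):
    """Placeholder (an assumption): returns hardcoded values for the sketch."""
    return {"schema_version": 42, "debug_logging": environment == "staging"}


class DeploymentRequirementsTest(unittest.TestCase):
    def test_each_environment_meets_its_requirements(self):
        for environment in REQUIREMENTS:
            with self.subTest(environment=environment):
                deployed = load_deployed_config(environment)
                self.assertEqual(violations(environment, deployed), {})


if __name__ == "__main__":
    unittest.main()
```

Because the requirements live in one table, a change to either the table or the deployed values shows up as a failing test instead of a surprise in production.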

The base requirement here is an assertion that when it comes to certain organizational tasks, it’s not appropriate for each team to be a unique snowflake and that everyone buys into paying–and planning for–certain development costs.

A common complaint about undertaking this exercise is that it produces a list of “rules” that are often seen as static and too brittle to accommodate modern software development. But it doesn’t need to be that way. The FAA regularly revisits its operational primitives to account for new technology developments; approach procedures receive updates to account for changing airport and airspace conditions. And so can we.

Organizations that value standardization as part of their culture should plan to revisit their base requirements every, say, six months, or every 50 hires, whichever comes first. This ensures that process evolves with the organization and that it’s still serving the organization well.

Process Does Not Have To Be a Dirty Word

There’s no doubt about it: the software development world is full of bad capital-P Process. It’s the reason most of us sigh and roll our eyes when we hear the word.

But by taking a look at how other disciplines approach the creation, implementation, and most importantly acculturation of process in their own organizational ecosystems, we can gain a fresh new perspective on our own development of process within a software development and operations context.

Process and standardization are what allow planes to safely crisscross the world every day with an infinitesimally small error rate; they are what allow an airport to continue to operate in poor weather.

And one of the secrets of the software development shops that often serve as DevOps poster children: to them, process isn’t a bad word, and the practice of mindful standardization is how they scale their code and their engineering culture to win.

Of course, understanding your company’s operational primitives and standardizing your teams’ processes is of little value if they can’t be communicated; we’ll tackle how aviation has addressed that particular ‘-ation’ in the next column.
