"sysadmin" entries

Ghosts in the machines

The secret to successful infrastructure automation is people.

ghost_machine

“The trouble with automation is that it often gives us what we don’t need at the cost of what we do.” —Nicholas Carr, The Glass Cage: Automation and Us

Virtualization and cloud hosting platforms have pervasively decoupled infrastructure from its underlying hardware over the past decade. This has led to a massive shift towards what many are calling dynamic infrastructure, wherein infrastructure and the tools and services used to manage it are treated as code, allowing operations teams to adopt software approaches that have dramatically changed how they operate. But with automation comes a great deal of fear, uncertainty and doubt.

Common (mis)perceptions of automation tend to pop up at the extreme ends: It will either liberate your people to never have to worry about mundane tasks and details, running intelligently in the background, or it will make SysAdmins irrelevant and eventually replace all IT jobs (and beyond). Of course, the truth is very much somewhere in between, and relies on a fundamental rethinking of the relationship between humans and automation.
Read more…

The DevOps identity crisis

Why DevOps needs a manifesto after all, but may never get one.

Image: CC BY-SA 2.0 Libby Levi for opensource.com

DevOps is everywhere! The growth and mindshare of the movement is remarkable. But if you care deeply about DevOps, you might agree with me when I say that although its moment has “arrived,” DevOps is in serious trouble. The movement is fragmented and weakly defined, and is being usurped by those who care more about short-term opportunities than the long-term viability of DevOps.

They are the ninety-nine percent, and nobody cares

How bad could it be? Travel back in time. It is mid-November 2011, and Occupy Wall Street is occupying the headlines. One of the major reasons is that the protestors are targeting shipping ports on the West Coast, causing shutdowns and even violence. As things are getting out of hand, parts of the movement start condemning these actions as counter-productive, hurting the 99% instead of the intended 1%. Spokespeople for the movement are quoted in the media as saying the instigators “don’t represent the movement.”

Why did the Occupy movement become a footnote in history so fast? There were several reasons: there was no cohesive agreement on its identity, values, goals, and mission; in an effort to be unlike “them,” the OWS proponents avoided anything that looked like centralized leadership; and it seemed to be entirely negative, advocating nothing to replace what it wanted to remove.

I believe a similar thing is happening to DevOps right now, for many of the same reasons. Let’s talk about some of these problems.

Read more…

Chop wood, carry water

The daily work of building and deploying complex software.

axe

You are desperate to communicate, to edify or entertain, to preserve moments of grace or joy or transcendence, to make real or imagined events come alive. But you cannot will this to happen. It is a matter of persistence and faith and hard work. So you might as well just go ahead and get started. — Anne Lamott, bird by bird

Words like ‘persistence’ and ‘work’ rarely show up in the same sentence as DevOps. It is more likely to be characterized as that One Weird Trick that will suddenly make your entire software development and deployment pipeline work faster and without failure every time. There are DevOps consultants and entrepreneurs — people and companies promising jetpacks and hovercrafts delivered on schedule and with cost savings. This is the software equivalent of the legendary savant writer struck by divine genius, churning out perfect page after page without fail and swimming in millions from her bestselling novels.

In reality, DevOps is quite similar to writing — it requires concerted, daily effort — but instead of sitting down to write every day, the principles look something more like:

  • Release regularly
  • Release in small chunks
  • Test in production (or, less provocatively: do not expect to release something perfect all the time)
  • Collect system/performance data and user feedback
  • Refine, optimize, repeat

There is no mad genius about writing, just as there is no secret sauce of DevOps. Read more…

Four short links: 1 September 2014

Four short links: 1 September 2014

Sibyl, Bitrot, Estimation, and ssh

  1. Sibyl: Google’s System for Large Scale Machine Learning (YouTube) — keynote at DSN2014 acting as an intro to Sibyl. (via KD Nuggets)
  2. Bitrot from 1997That’s 205 failures, an actual link rot figure of 91%, not 57%. That leaves only 21 URLs as 200 OK and containing effectively the same content.
  3. What We Do And Don’t Know About Software Effort Estimation — nice rundown of research in the field.
  4. fabric — simple yet powerful ssh library for Python.

Exploring lightweight monitoring systems

Toward unifying customer behavior and operations metrics.

lightweight_systemsFor the last ten years I’ve had a foot in both the development and operations worlds. I stumbled into the world of IT operations as a result of having the most UNIX skills in the team shortly after starting at ThoughtWorks. I was fortunate enough to do so at a time when many of my ThoughtWorks colleagues and I where working on the ideas which were captured so well in Jez Humble and Dave Farley’s Continuous Delivery (Addison-Wesley).

During this time, our focus was on getting our application into production as quickly as possible. We were butting up against the limits of infrastructure automation and IaaS providers like Amazon were only in their earliest form.

Recently, I have spent time with operations teams who are most concerned with the longer-term challenges of looking after increasingly complex ecosystems of systems. Here the focus is on immediate feedback and knowing if they need to take action. At a certain scale, complex IT ecosystems can seem to exhibit emergent behavior, like an organism. The operations world has evolved a series of tools which allow these teams to see what’s happening *right now* so we can react, keep things running, and keep people happy.

At the same time, those of us who spend time thinking about how to quickly and effectively release our applications have become preoccupied with wanting to know if that software does what our customers want once it gets released. The Lean Startup movement has shown us the importance of putting our software in front of our customers, then working out how they actually use it so we can determine what to do next. In this world, I was struck by the shortcomings of the tools in this space. Commonly used web analytics tools, for example, might only help me understand tomorrow how my customers used my site today.

Read more…

5 unsung tools of DevOps

A Free Velocity Report

cover-5-unsung-toolsWhen I first started as a sysadmin many years ago, I quickly realized what a daunting task was before me. Like any good engineer, I took to finding the right tools to keep at hand to make light work out of the most difficult situations. This in itself was quite an endeavor, as over the years there has been a proliferation of tools and scripts. Many are of the artisanal, organic, hand-crafted variety, forged out of bash pipelines by our forefathers.

Much of that has changed now as the DevOps movement strengthens. With closer interaction between developers and operations, or even operations teams composed of developers, the tools have significantly improved. Treating infrastructure as code with automated configuration management and provisioning tools have freed many from the menial tasks of creating snowflake systems, and we’ve turned our attention to the more important matters of scaling and optimizing our systems.

In my Velocity report, 5 Unsung Tools of DevOps, I highlight a few of the tools that have gone unnoticed—or at least unrecognized—for some time. These are but a few of the tools that recognized needs early on and that successfully solve real-world problems that you’re likely to encounter today. Here is a brief synopsis:

Read more…

Early-Warning Is an Unknown Unknown

Velocity 2013 Speaker Series

In 2002, US Secretary of State Donald Rumsfeld told a reporter that not only don’t we know everything important, but sometimes we don’t even know what knowledge we lack:

There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.

One of the purposes of monitoring is to build early-warning systems to alert of problems before they become serious. But how can we recognize a failure in its early stages? It’s a thorny question.

Read more…

Velocity Report: Building a DevOps Culture

DevOps is as much about culture as it is about tools.

Operations professionals live in a wind tunnel. If you can imagine one of those game show glass boxes, where a contestant stands inside, the door shuts, and money blows around in a whirlwind, you’ve got a good idea of what Operations feels like much of the time. While you’re trying to grab one technology, another has forced itself across your eyes demanding attention.

The incredible growth of an industry that didn’t really even exist fifteen years ago has provided us with endless opportunity and innovations. It’s also required us to be on the forefront of many new technologies in a way other professions aren’t. The constant drive towards the next technology, the next platform, and the next idea has stratified our organizations, creating specializations in areas like networking, storage, security, data sciences, and a myriad of other functions that challenge our ability to work with our colleagues as a cohesive team.

Read more…

What's New in CFEngine 3: Making System Administration Even More Powerful

CFEngine is a surprisingly flexible and fast tool for distributed configuration management. A new version was released this week.

What’s New in CFEngine 3: Making System Administration Even More Powerful

CFEngine is a surprisingly flexible and fast tool for distributed configuration management. A new version was released this week.