Today, Jonathan Heiliger, VP of Operations at Facebook, and his team announced the Open Compute Project, releasing their data center hardware stack as open source. This is a revolutionary project, and I believe it’s one of the most important in infrastructure history. Let me explain why.
The way we operate systems and data centers at web scale is fundamentally different from the world most server vendors seem to design their products for.
Web-scale systems focus on the entire system as a whole. In our world, individual servers are not special, and treating them as special can be dangerous. We expect servers to fail and we increasingly rely on the software we write to manage those failures. In many cases, the most valuable thing we can do when a server fails is to simply provision a new one as quickly as possible. That means having enough spare capacity to do that, a way of programmatically managing the infrastructure, and an easy way to replace the failed components.
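To make that idea concrete, here is a minimal sketch in Python of what "replace, don't repair" can look like in software. It is purely illustrative: the `Fleet` class, the `replace_failed` method, and the server names are hypothetical, and a real system would sit on top of an actual provisioning API rather than an in-memory set.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Fleet:
    """Hypothetical in-memory model of a pool of interchangeable servers."""

    active: set[str] = field(default_factory=set)
    spares: list[str] = field(default_factory=list)

    def replace_failed(self, failed: set[str]) -> dict[str, str]:
        """Drop failed servers and promote spares; return {failed: replacement}."""
        replacements: dict[str, str] = {}
        for server in sorted(failed & self.active):
            # No server is special: a failed one is simply taken out of service.
            self.active.remove(server)
            if self.spares:
                # Spare capacity is what makes a fast replacement possible.
                spare = self.spares.pop(0)
                self.active.add(spare)
                replacements[server] = spare
        return replacements


fleet = Fleet(
    active={"web-01", "web-02", "web-03"},
    spares=["spare-01", "spare-02"],
)
swapped = fleet.replace_failed({"web-02"})
print(swapped)  # {'web-02': 'spare-01'}
```

The point of the sketch is that the failed machine is never diagnosed or nursed back to health in the critical path. It is removed, a spare takes its place, and the physical repair can happen later.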
The server vendors have been slow to make this transition because they have been focused on individual servers, rather than systems as a whole. What we want to buy is racks of machines, with power and networking preconfigured, which we can wheel in, bolt down, and plug in. For the most part we don’t care about logos, faceplates, and paint jobs. We won’t use complex integrated proprietary management interfaces, and we haven’t cared about video cards in a long time … although it is still very hard to buy a server without them.
This gap is what led Google to build their own machines optimized for their own applications in their own data centers. When Google did this, they gained a significant competitive advantage. Nobody else could deploy as much compute power as quickly and efficiently. To compete with Google’s developers you also must compete with their operations and data center teams. As Tim O’Reilly said: “Operations is the new secret sauce.”
When Jonathan and his team set out to build Facebook’s new data center in Oregon, they knew they would have to do something similar to achieve the needed efficiency. Jonathan says that the Prineville, Ore. data center uses 38% less energy to do the same work as Facebook’s existing facilities, while costing 24% less.
Facebook then took the revolutionary step of releasing the designs for most of the hardware in the data center under a Creative Commons license. They released everything from the power supply and battery backup systems to the rack hardware, motherboards, chassis, battery cabinets, and even their electrical and mechanical construction specifications.
This is a gigantic step for open source hardware, for the evolution of the web and cloud computing, and for infrastructure and operations in general. It extends to hardware a shift that began with open source software: away from a model of vendors and consumers and toward a participatory and collaborative one. Jonathan explains:
“The ultimate goal of the Open Compute Project, however, is to spark a collaborative dialogue. We’re already talking with our peers about how we can work together on Open Compute Project technology. We want to recruit others to be part of this collaboration — and we invite you to join us in this mission to collectively develop the most efficient computing infrastructure possible.”
At the announcement this morning, Graham Weston of Rackspace said that Rackspace would be participating in Open Compute, which is an ideal complement to the OpenStack cloud computing projects. Representatives from Dell and HP also spoke at the announcement and said that they would participate in this new project. The conversation has already begun.