DIY Appliances on the Web?

Or, My Enterprise is Appliancized, Why Isn’t Your Web?

I wrote a couple of posts a while back that covered task-optimized hardware. One was about a system that combined Field Programmable Gate Arrays (FPGAs) with a commodity CPU platform to provide the sheer number-crunching performance needed to break GSM encryption. The other looked at using task-appropriate, efficient processors to reduce power consumption in a weather-predicting supercomputer. In these two posts I sort of accidentally highlighted two of the three key selling points of task-specific appliances: sheer performance and energy efficiency (the third is security). The posts also heightened my awareness of the possibilities for specialized hardware. Some of my more recent explorations, which focused on the appliance market in particular, got me wondering if there might be a growing trend toward specialized appliances.

Of course, specialized devices have been working their way into the enterprise ever since the first router left its commodity Unix host for the task-specific richness of specialized hardware. Load balancers followed soon after, and then devices from companies like Layer 7 and DataPower (now IBM) took the next logical step and pushed the appliance up the stack to XML processing. These appliances aren’t just conveniently packaging intellectual property inside commodity 1U blister packs; they are specialized devices that process XML on purpose-built Application Specific Integrated Circuits (ASICs), accelerate encryption and decryption in hardware, and encapsulate most of an ESB inside a single tamper-proof box whose entire OS is in firmware. They are fast, use a lot less power than an equivalent set of commodity boxes, and are secure.

Specialization is also showing up in the realm of commodity database management systems. At last year’s Money:Tech, Michael Stonebraker described a column-oriented database designed to speed access to pricing history for backtesting and other financial applications. In this case the database is still implemented on commodity hardware. However, I think it’s interesting in the context of this conversation on specialized computing because it speaks to the inadequacy of commodity solutions for highly specific requirements.

A device from Netezza is also targeted at the shortcomings of the commodity DBMS. In this case the focus is on data warehousing, but it takes the concept further with an aggressive hardware design that is delivered as an appliance. It has PostgreSQL at its core, but it takes the rather radical step of coupling FPGAs directly to the storage devices. The result, for at least a certain class of query, is a boost in performance of multiple orders of magnitude. I think this device is noteworthy because it puts the appliance way up the stack and is perhaps a harbinger of further penetration of the appliance into application-layer territory.

While appliances are expanding their footprint in the enterprise, it seems like the exact opposite might be happening on the web. Maybe the idea of a closed appliance is anathema to the open source zeitgeist of the web, but in any case, the LAMP stack is still king. Even traditional appliance-like tasks such as load balancing seem to be trending toward open source software on commodity hardware (e.g. Perlbal).

I can’t help but wonder, though: at the sheer scale at which some web properties operate (and at the scale of the energy cost required to power them), can the performance and cost efficiency of specialized hardware appliances be ignored? Might there be a way to get the benefits of the appliance that is in keeping with the open source ethos of the web?

If you’ve ever uploaded a video to YouTube and waited for it to be processed, you have an idea of how processor-hungry video processing is on commodity hardware. I don’t know what Google’s hardware and energy costs are for that task, but they must be significant. The same goes for Flickr’s image processing server farm and, I would guess, for Google’s voice processing now that its new speech services have launched. If the combined hardware and electricity costs are high enough, maybe this is a good place to introduce specialized appliances to the web?

But how to do that in a way that is consistent with the prevailing open source ethos and that still lets a firm continue to innovate? I think an answer might be a sort of DIY writ large: a confluence of open source and open hardware that works like an undocumented joint venture based on the right to fork. Think Yahoo and the Hadoop community, or JP Morgan and friends with AMQP, but with hardware, and you get the idea. Such a community could collaborate on the design of the ASICs and the appliance(s) that host them, and even coordinate production runs in order to manage unit costs. Perhaps more importantly, specifying the components openly would let these companies share costs while still leaving each of them free to deploy the hardware as it sees fit and, ultimately, to innovate and find new uses for it.

There are probably a bunch of reasons why this is just silly speculation, but Google’s efforts with power supply efficiency might be seen as at least a bit of precedent for web firms dabbling in hardware and hardware specifications. In fact, Google’s entire stack, from its unique approach to commodity hardware to software infrastructure like GFS, might be thought of as a specialized appliance that suits the specific needs of search. It’s just a really, really big one that “ships” in a hundred-thousand-square-foot data center form factor.
