Questioning the Lambda Architecture

The Lambda Architecture has its merits, but alternatives are worth exploring.

Nathan Marz wrote a popular blog post describing an idea he called the Lambda Architecture (“How to beat the CAP theorem“). The Lambda Architecture is an approach to building stream processing applications on top of MapReduce and Storm or similar systems. This has proven to be a surprisingly popular idea, with a dedicated website and an upcoming book. Since I’ve been involved in building out the real-time data processing infrastructure at LinkedIn using Kafka and Samza, I often get asked about the Lambda Architecture. I thought I would describe my thoughts and experiences.

What is a Lambda Architecture and how do I become one?

The Lambda Architecture looks something like this:

Lambda_Architecture Read more…

Comments: 21

Revisiting “What is DevOps”

If all companies are software companies, then all companies must learn to manage their online operations.

DevOpsBirds

Two years ago, I wrote What is DevOps. Although that article was good for its time, our understanding of organizational behavior, and its relationship to the operation of complex systems, has grown.

A few themes have become apparent in the two years since that last article. They were latent in that article, I think, but now we’re in a position to call them out explicitly. It’s always easy to think of DevOps (or of any software industry paradigm) in terms of the tools you use; in particular, it’s very easy to think that if you use Chef or Puppet for automated configuration, Jenkins for continuous integration, and some cloud provider for on-demand server power, that you’re doing DevOps. But DevOps isn’t about tools; it’s about culture, and it extends far beyond the cubicles of developers and operators. As Jeff Sussna says in Empathy: The Essence of DevOps:

…it’s not about making developers and sysadmins report to the same VP. It’s not about automating all your configuration procedures. It’s not about tipping up a Jenkins server, or running your applications in the cloud, or releasing your code on Github. It’s not even about letting your developers deploy their code to a PaaS. The true essence of DevOps is empathy.

Read more…

Comments: 4

Open teaching stacks help us teach at scale

Use teaching stacks to drive growth.

Elliott Hauser is CEO of Trinket, a startup focused on creating open sourced teaching materials. He is also a Python instructor at UNC Chapel Hill.

Well-developed tools for teaching are crucial to the spread of open source software and programming languages. Stacks like those used by the Young Coders Tutorial and Mozilla Software Carpentry are having national and international impact by enabling more people to teach more often.

The spread of tech depends on teaching

Software won’t replace teachers. But teachers need great software for teaching. The success and growth of technical communities are largely dependent on the availability of teaching stacks appropriate to teaching their technologies. Resources like try git or interactivepython.org not only help students on their own but also equip instructors to teach these topics without also having to discover the best tools for doing so. In that way, they play the same function as open source Web stacks: getting us up and running quickly with time-tested and community-backed tools. Thank goodness I don’t need to write a database just to write a website; I can use open source software instead. As an instructor teaching others to code websites, what’s the equivalent tool set? That’s what I mean by Teaching Stack: a collection of open tools that help individual instructors teach technology at scale.

Elements of a great teaching stack

Here are some of the major components of a teaching stack for a hands-on technology course:

Read more…

Comment

A growing number of applications are being built with Spark

Many more companies want to highlight how they're using Apache Spark in production.

One of the trends we’re following closely at Strata is the emergence of vertical applications. As components for creating large-scale data infrastructures enter their early stages of maturation, companies are focusing on solving data problems in specific industries rather than building tools from scratch. Virtually all of these components are open source and have contributors across many companies. Organizations are also sharing best practices for building big data applications, through blog posts, white papers, and presentations at conferences like Strata.

These trends are particularly apparent in a set of technologies that originated from UC Berkeley’s AMPLab: the number of companies that are using (or plan to use) Spark in production1 has exploded over the last year. The surge in popularity of the Apache Spark ecosystem stems from the maturation of its individual open source components and the growing community of users. The tight integration of high-performance tools that address different problems and workloads, coupled with a simple programming interface (in Python, Java, Scala), make Spark one of the most popular projects in big data. The charts below show the amount of active development in Spark:

Apache Spark contributions

For the second year in a row, I’ve had the privilege of serving on the program committee for the Spark Summit. I’d like to highlight a few areas where Apache Spark is making inroads. I’ll focus on proposals2 from companies building applications on top of Spark.

Read more…

Comment

What it really means when people say “Everything in JavaScript is an object”

A new mantra for your next (programming) meditation session.

When you begin programming with JavaScript you might run across books, tutorials, and people who say “Everything in JavaScript is an object.” While it’s not 100% true (not *everything* is an object), it is *mostly* true. And sometimes this can be a bit surprising.

For instance, to most people functions and objects look and act completely different. And in many languages, functions and objects *are* completely different. However, in JavaScript, a function is an object. This can take a bit of concentrated attention to get your head around, but it’s an important concept because it’s the secret behind another big topic in JavaScript: functions as first class values.

Read more…

Comment

It’s the end of the web as we knew it

You might feel fine.

For the past 15 years, Google has enforced the classic “HTML as foundation” architecture at the heart of the Web. Content creators and the developers who support them had to present content and link information as part of their pages’ HTML if they wanted Google’s spidering bots to see them. Google effectively punished developers who made links or content available only through JavaScript (or images, or CSS), giving them low or non-existent search results.

Google did this to keep their processing simple, not because of a deep fondness for HTML. Even as Google’s bots stuck to a simple diet of HTML, other parts of Google were developing JavaScript-centric approaches, like AngularJS: a “Superheroic JavaScript MVW Framework” that “is what HTML would have been, had it been designed for building web-apps.”

Angular is far from alone. As JavaScript grew, more and more programmers wanted to build their apps as programs, not as pages. Or, as Jen Simmons summarized it at Fluent, “Dang that stupid HTML, I’m just going to load an empty page… then I’ll run the real program, I’ll run the JavaScript.” Read more…

Comment