It’s not just about Hadoop core anymore

For maximum business value, big data applications have to involve multiple Hadoop ecosystem components.

Data is deluging today’s enterprise organizations from ever-expanding sources and in ever-expanding formats. To gain insight from this valuable resource, organizations have been adopting Apache Hadoop with increasing momentum. Now, the most successful players in enterprise big data are no longer using only Hadoop “core” (i.e., batch processing with MapReduce); they are moving toward analyzing and solving real-world problems, often interactively, using the broader set of tools in an enterprise data hub, including components such as Impala, Apache Spark, Apache Kafka, and Search. With this new focus on workload diversity comes an increased demand for developers who are well-versed in using a variety of components across the Hadoop ecosystem.

Due to the size and variety of the data we’re dealing with today, a single use case or tool, no matter how robust, can camouflage the full, game-changing potential of Hadoop in the enterprise. Rather, developing end-to-end applications that incorporate multiple tools from the Hadoop ecosystem, not just the Hadoop core, is the first step toward activating the disparate use cases and analytic capabilities that an enterprise data hub offers. Whereas MapReduce code primarily leverages Java skills, developers who want to work on full-scale big data engineering projects need to be able to work with multiple tools, often simultaneously. An authentic big data applications developer can ingest and transform data using Kite SDK, write SQL queries with Impala and Hive, and create an application GUI with Hue. Read more…


Now available: Big Data Now, 2014 edition

Our wrap-up of important developments in the big data field.

In the four years we’ve been producing Big Data Now, our wrap-up of important developments in the big data field, we’ve seen tools and applications mature, multiply, and coalesce into new categories. This year’s free wrap-up of Radar coverage is organized around seven themes:

  • Cognitive augmentation: As data processing and data analytics become more accessible, jobs that can be automated will go away. But to be clear, there are still many tasks where the combination of humans and machines produces superior results.
  • Intelligence matters: Artificial intelligence is now playing a bigger and bigger role in everyone’s lives, from sorting our email to rerouting our morning commutes, from detecting fraud in financial markets to predicting dangerous chemical spills. The computing power and algorithmic building blocks to put AI to work have never been more accessible.
Read more…

Four short links: 26 January 2015

Coding in VR, Git Workflows, Programming as Bookkeeping, and Valuing People

  1. How Might We Code in VR? — caught my eye because I’m looking for ideas on how to think about interaction design in the holoculus world.
  2. Git Workflows for Pros — non-developers don’t understand how important this is to productivity.
  3. All Programming is Bookkeeping — approach programming as a bookkeeping problem: checks and balances.
  4. Why I Am Not a Maker (Deb Chachra) — The problem is the idea that the alternative to making is usually not doing nothing — it’s almost always doing things for and with other people, from the barista to the Facebook community moderator to the social worker to the surgeon. Describing oneself as a maker — regardless of what one actually or mostly does — is a way of accruing to oneself the gendered, capitalist benefits of being a person who makes products.

Designing on a system level

Andy Goodman on service design, embeddables, and predictive analytics.


I recently sat down with Andy Goodman, designer and group director of Fjord’s US studios. Goodman has been designing and managing design teams around the globe for the past 20 years and is a contributor to Designing for Emerging Technologies. Our conversation covers embeddables, wearables, and predictive analytics. To kick off the conversation, I asked Goodman to define “service design”:

“It’s well-known that if you ask a service designer to define ‘service design,’ you get 10 different answers. For me, it’s really about thinking on a system level about design … It’s thinking about how systems, and not just computer systems, but how human systems and computer systems and physical systems all interact with each other. You need to be thinking not about individual moments; you need to be thinking about journeys and flows, and thinking about how a human being will naturally, without even thinking about it, move from one context to another using different devices, using physical objects, being in physical spaces. For me, it was very appealing, this idea that you can design more than just interactions in a way, more than just interactions on a screen. You can actually design other things that are more about the way we live and work and play.”

Read more…


Bitcoin is just the first app to use blockchain technology

Understanding the value of the blockchain above and beyond bitcoin.


Editor’s note: Lorne Lantz is a program co-chair for our O’Reilly Radar Summit: Bitcoin & the Blockchain on January 27, 2015, in San Francisco. For more on the program and for registration information, visit the Bitcoin & the Blockchain event website.

I remember the first time I heard about bitcoin. It was June 2012, and I was invited to a bitcoin meetup. The whole time I was sitting there, I thought these were a bunch of computer geeks playing around with nerd money.

At the same time, I felt excited about the possibilities. If what the bitcoin believers were saying was true, it could become something very big. When I took a closer look, I realized why it could be so groundbreaking: decentralization.

Unlike other currencies and payment networks, bitcoin is not controlled by a bank, government, or financial institution. Instead, thousands of computers around the world verify transactions and manage a global decentralized ledger. This innovative technology is called the blockchain, and it provides a unique pathway that allows — for the first time — many computers that don’t trust each other to achieve consensus. In bitcoin’s case, they are achieving consensus on updates to the global ledger. Read more…

Bringing an end to synthetic biology’s semantic debate

The O’Reilly Radar Podcast: Tim Gardner on the synthetic biology landscape, lab automation, and the problem of reproducibility.

Editor’s note: this podcast is part of our investigation into synthetic biology and bioengineering. For more on these topics, download a free copy of the new edition of BioCoder, our quarterly publication covering the biological revolution. Free downloads for all past editions are also available.

Tim Gardner, founder of Riffyn, has recently been working with the Synthetic Biology Working Group of the European Commission Scientific Committees to define synthetic biology, evaluate risk assessment methodologies, and describe research areas. I caught up with Gardner for this Radar Podcast episode to talk about the synthetic biology landscape and issues in research and experimentation that he’s addressing at Riffyn.

Defining synthetic biology

Among the areas of investigation discussed at the EU’s Synthetic Biology Working Group was defining synthetic biology. The official definition reads: “SynBio is the application of science, technology and engineering to facilitate and accelerate the design, manufacture and/or modification of genetic materials in living organisms.” Gardner talked about the significance of the definition:

“The operative part there is the ‘design, manufacture, modification of genetic materials in living organisms.’ Biotechnologies that don’t involve genetic manipulation would not be considered synthetic biology, and more or less anything else that is manipulating genetic materials in living organisms is included. That’s important because it gets rid of this semantic debate of, ‘this is synthetic biology, that’s synthetic biology, this isn’t, that’s not,’ that often crops up when you have, say, a protein engineer talking to someone else who is working on gene circuits, and someone will claim the protein engineer is not a synthetic biologist because they’re not working with parts libraries or modularity or whatnot, and the boundaries between the two are almost indistinguishable from a practical standpoint. We’ve wrapped it all together and said, ‘It’s basically advances in the capabilities of genetic engineering. That’s what synthetic biology is.’”

Read more…