FEATURED STORY

Four short links: 3 July 2015

Storage Interference, Open Source SSL, Pub-Sub Reverse-Proxy, and Web Components Checklist

  1. The Storage Tipping Point — the performance optimization technologies of the last decade – log-structured file systems, coalesced writes, out-of-place updates and, soon, byte-addressable NVRAM – are conflicting with similar-but-different techniques used in SSDs and arrays. The software we use is written for dumb storage; we’re getting smart storage; but smart + smart = fragmentation, write amplification, and over-consumption.
  2. s2n — Amazon’s open source SSL/TLS implementation.
  3. pushpin — a reverse proxy server that makes it easy to implement WebSocket, HTTP streaming, and HTTP long-polling services. It communicates with backend web applications using regular, short-lived HTTP requests (the GRIP protocol), so backends can be written in any language and run on any web server. A minimal GRIP sketch follows this list.
  4. The Gold Standard Checklist for Web Components — This is a working draft of a checklist to define a “gold standard” for web components that aspire to be as predictable, flexible, reliable, and useful as the standard HTML elements.
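To make the GRIP idea in item 3 concrete, here is a minimal sketch of a backend sitting behind Pushpin, using only Python’s standard library. The backend answers Pushpin’s short-lived proxy request with headers telling Pushpin to hold the client connection open as an HTTP stream; the header names (Grip-Hold, Grip-Channel), the channel name, and the port numbers are assumptions drawn from the Pushpin documentation as I recall it, so treat this as an illustration rather than a definitive recipe and check the project docs for your version.

```python
# Hypothetical minimal GRIP backend behind Pushpin (Python standard library only).
# Pushpin forwards each client request here as a regular, short-lived HTTP request;
# the response headers instruct Pushpin to hold the client connection open.
from http.server import BaseHTTPRequestHandler, HTTPServer

class StreamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # GRIP instructions (assumed header names; verify against the Pushpin docs):
        self.send_header("Grip-Hold", "stream")      # keep the client connection open
        self.send_header("Grip-Channel", "updates")  # subscribe it to the "updates" channel
        self.end_headers()
        self.wfile.write(b"[stream opened]\n")       # initial body the client sees

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), StreamHandler).serve_forever()
```

Pushing data to everyone subscribed to the channel is then a separate HTTP POST to Pushpin’s publish endpoint (port 5561 by default, if memory serves) with a JSON body naming the channel and the content; the backend itself never holds a socket open, which is what lets it stay an ordinary request/response application.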
Comment

To create the future we want, we need more moonshots

The O'Reilly Radar Podcast: Tim O'Reilly and Astro Teller talk about technology and society, and the importance of moonshots.


Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.

In this week’s Radar Podcast episode, Tim O’Reilly sits down with Google X’s Astro Teller. Their wide-ranging conversation covers moonshots, the relationship between technology and society, the learning process for hardware, and more. What follows are some snippets of their conversation to whet your appetite — you can listen to the entire interview in the SoundCloud player below, or download the podcast through Stitcher, TuneIn, or iTunes.

Technology doesn’t create net losses for the economy

Tim O’Reilly: The policy makers, I think, need to stop talking about creating jobs and start talking about the work we need to do in the world, because if you do that work, you do create jobs. I was struck by this when I went to Mount Vernon, George Washington’s home. He was really into scientific agriculture, as was Thomas Jefferson. He had this vision that America could feed the world. There was that economic vision: there is something that needs doing. One of the things I love about Google X is it’s driven by solving problems, and those problems actually often do create new opportunities for work.

Astro Teller: I completely agree with you about the problems. In addition, when you look at the history of technology — its introduction, and what happened in society afterward — technology has functioned in every case in the past as a lever for the human mind or for the human body. Things like the introduction of spreadsheets destroyed the business, the profession of bookkeeping — but because we trained people, we as society trained people, they became accountants, they became analysts. As many jobs as were lost were created, and more work, more productivity was created in the process. The bulldozer took away, in a very analogous way, a lot of jobs from people who were digging with shovels, but because we trained them to do things like build the bulldozers, drive the bulldozers, maintain the bulldozers, it wasn’t a net loss for the economy.

I believe that the failure mode we are currently in, to the extent that there’s a failure mode, is not the introduction of new technologies but the failure of our society to train the young people of the world so that they will be prepared to use these more and more sophisticated levers.

Read more…

Comment

Why data preparation frameworks rely on human-in-the-loop systems

The O'Reilly Data Show Podcast: Ihab Ilyas on building data wrangling and data enrichment tools in academia and industry.


As I’ve written in previous posts, data preparation and data enrichment are exciting areas for entrepreneurs, investors, and researchers. Startups like Trifacta, Tamr, Paxata, Alteryx, and CrowdFlower continue to innovate and attract enterprise customers. I’ve also noticed that companies that don’t specialize in these areas are increasingly eager to highlight data preparation capabilities in their products and services.

During a recent episode of the O’Reilly Data Show Podcast, I spoke with Ihab Ilyas, professor at the University of Waterloo and co-founder of Tamr. We discussed how he started working on data cleaning tools, academic database research, and training computer science students for positions in industry.

Academic database research in data preparation

Given the importance of data integrity, it’s no surprise that the database research community has long been interested in data preparation and data wrangling. Ilyas explained how his work in probabilistic databases led to research projects in data cleaning:

In the database theory community, these problems of handling, dealing with data inconsistency, and consistent query answering have been a celebrated area of research. However, it has also been difficult to communicate these results to industry. And database practitioners, if you like, they were more into the well-structured data and assuming a lot of good properties around this data, [and they were also] more interested in indexing this data, storing it, moving it from one place to another. And now, dealing with this large amount of diverse heterogeneous data with tons of errors, siloed across all business units in the same enterprise, became a necessity. You cannot really avoid that anymore. And that triggered a new line of research for pragmatic ways of doing data cleaning and integration. … The acquisition layer in that stack has to deal with large sets of formats and sources. And you will hear about things like adapters and source adapters. And it became a market on its own, how to get access and tap into these sources, because these are kind of the long tail of data.

The way I came into this subject was also funny because we were talking about the subject called probabilistic databases and how to deal with data uncertainty. And that morphed into trying to find data sets that have uncertainty. And then we were shocked by how dirty the data is and how data cleaning is a task that’s worth looking at.

Read more…

Comment

Four short links: 2 July 2015

Mathematical Thinking, Turing on Imitation Game, Retro Gaming in Javascript, and Effective Retros

  1. How Not to be Wrong: The Power of Mathematical Thinking (Amazon) — Ellenberg chases mathematical threads through a vast range of time and space, from the everyday to the cosmic, encountering, among other things, baseball, Reaganomics, daring lottery schemes, Voltaire, the replicability crisis in psychology, Italian Renaissance painting, artificial languages, the development of non-Euclidean geometry, the coming obesity apocalypse, Antonin Scalia’s views on crime and punishment, the psychology of slime molds, what Facebook can and can’t figure out about you, and the existence of God. (via Pam Fox)
  2. What Turing Himself Said About the Imitation Game (IEEE) — fascinating history. The second myth is that Turing predicted a machine would pass his test around the beginning of this century. What he actually said on the radio in 1952 was that it would be “at least 100 years” before a machine would stand any chance with (as Newman put it) “no questions barred.”
  3. Impossible Mission in Javascript — an homage to the original, and beautiful to see. I appear to have lost all my skills in playing it in the intervening 32 years.
  4. Running Effective Retrospectives — Each change to the team’s workflow is treated as a scientific experiment, whereby a hypothesis is formed, data collected, and expectations compared with actual results.
Comment

BioBuilder: Rethinking the biological sciences as engineering disciplines

Moving biology out of the lab will enable new startups, new business models, and entirely new economies.


Buy “BioBuilder: Synthetic Biology in the Lab,” by Natalie Kuldell, PhD, Rachel Bernstein, Karen Ingram, and Kathryn M. Hart.

What needs to happen for the revolution in biology and the life sciences to succeed? What are the preconditions?

I’ve compared the biorevolution to the computing revolution several times. One of the most important changes was that computers moved out of the lab, out of the machine room, out of that sacred space with raised floors, special air conditioning, and exotic fire extinguishers, into the home. Computers stopped being things that were cared for by an army of priests in white lab coats (and that broke several times a day), and started being things that people used. Somewhere along the line, software developers stopped being people with special training and advanced degrees; children, students, non-professionals — all sorts of people — started writing code. And enjoying it.

Biology is now in a similar place. But to take the next step, we have to look more carefully at what’s needed for biology to come out of the lab. Read more…

Comment

“Internet of Things” is a temporary term

The O'Reilly Radar Podcast: Pilgrim Beart on the scale, challenges, and opportunities of the IoT.


Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.

In this week’s Radar Podcast, O’Reilly’s Mary Treseler chatted with Pilgrim Beart about co-founding his company, AlertMe, and about why the scale of the Internet of Things creates as many challenges as it does opportunities. He also talked about the “gnarly problems” emerging from consumer wants and behaviors.

Read more…

Comment