- How Not to be Wrong: The Power of Mathematical Thinking (Amazon) — Ellenberg chases mathematical threads through a vast range of time and space, from the everyday to the cosmic, encountering, among other things, baseball, Reaganomics, daring lottery schemes, Voltaire, the replicability crisis in psychology, Italian Renaissance painting, artificial languages, the development of non-Euclidean geometry, the coming obesity apocalypse, Antonin Scalia’s views on crime and punishment, the psychology of slime molds, what Facebook can and can’t figure out about you, and the existence of God. (via Pam Fox)
- What Turing Himself Said About the Imitation Game (IEEE) — fascinating history. The second myth is that Turing predicted a machine would pass his test around the beginning of this century. What he actually said on the radio in 1952 was that it would be “at least 100 years” before a machine would stand any chance with (as Newman put it) “no questions barred.”
- Running Effective Retrospectives — Each change to the team’s workflow is treated as a scientific experiment, whereby a hypothesis is formed, data collected, and expectations compared with actual results.
The O'Reilly Data Show Podcast: Ihab Ilyas on building data wrangling and data enrichment tools in academia and industry.
As I’ve written in previous posts, data preparation and data enrichment are exciting areas for entrepreneurs, investors, and researchers. Startups like Trifacta, Tamr, Paxata, Alteryx, and CrowdFlower continue to innovate and attract enterprise customers. I’ve also noticed that companies that don’t specialize in these areas are increasingly eager to highlight data preparation capabilities in their products and services.
During a recent episode of the O’Reilly Data Show Podcast, I spoke with Ihab Ilyas, professor at the University of Waterloo and co-founder of Tamr. We discussed how he started working on data cleaning tools, academic database research, and training computer science students for positions in industry.
Academic database research in data preparation
Given the importance of data integrity, it’s no surprise that the database research community has long been interested in data preparation and data wrangling. Ilyas explained how his work in probabilistic databases led to research projects in data cleaning:
In the database theory community, these problems of handling, dealing with data inconsistency, and consistent query answering have been a celebrated area of research. However, it has been also difficult to communicate these results to industry. And database practitioners, if you like, they were more into the well-structured data and assuming a lot of good properties around this data, [and they were also] more interested in indexing this data, storing it, moving it from one place to another. And now, dealing with this large amount of diverse heterogeneous data with tons of errors, siloed across all business units in the same enterprise became a necessity. You cannot really avoid that anymore. And that triggered a new line of research for pragmatic ways of doing data cleaning and integration. … The acquisition layer in that stack has to deal with large sets of formats and sources. And you will hear about things like adapters and source adapters. And it became a market on its own, how to get access and tap into these sources, because these are kind of the long tail of data.
The way I came into this subject was also funny because we were talking about the subject called probabilistic databases and how to deal with data uncertainty. And that morphed into trying to find data sets that have uncertainty. And then we were shocked by how dirty the data is and how data cleaning is a task that’s worth looking at.
Moving biology out of the lab will enable new startups, new business models, and entirely new economies.
Buy “BioBuilder: Synthetic Biology in the Lab,” by Natalie Kuldell, PhD, Rachel Bernstein, Karen Ingram, and Kathryn M. Hart.
What needs to happen for the revolution in biology and the life sciences to succeed? What are the preconditions?
I’ve compared the biorevolution to the computing revolution several times. One of the most important changes was that computers moved out of the lab, out of the machine room, out of that sacred space with raised floors, special air conditioning, and exotic fire extinguishers, into the home. Computers stopped being things that were cared for by an army of priests in white lab coats (and that broke several times a day), and started being things that people used. Somewhere along the line, software developers stopped being people with special training and advanced degrees; children, students, non-professionals — all sorts of people — started writing code. And enjoying it.
Biology is now in a similar place. But to take the next step, we have to look more carefully at what’s needed for biology to come out of the lab.
The O'Reilly Radar Podcast: Pilgrim Beart on the scale, challenges, and opportunities of the IoT.
In this week’s Radar Podcast, O’Reilly’s Mary Treseler chatted with Pilgrim Beart about co-founding his company, AlertMe, and about why the scale of the Internet of Things creates as many challenges as it does opportunities. He also talked about the “gnarly problems” emerging from consumer wants and behaviors.
Insight and analysis on the Internet of Things and the new hardware movement.
Practitioners, entrepreneurs, academics, and analysts came together in San Francisco this week to discuss the Internet of Things and the new hardware movement at the O’Reilly 2015 Solid Conference. Below we’ve assembled notable keynotes and interviews from the event.
Lock in, lock out: DRM in the real world
Author and activist Cory Doctorow uses his Solid keynote to passionately explain how computers are already entwined in our lives and our bodies, which means laws that support lock-in are much more than inconveniences. Doctorow also discusses Apollo 1201, a project from the Electronic Frontier Foundation that aims to eradicate digital rights management (DRM).
How we make cars is a bigger environmental issue than how we fuel them.
Around two billion cars have been built over the last 115 years; twice that number will be built over the next 35-40 years. The environmental and health impacts will be enormous. Some think the solution is electric cars or other low- or zero-emission vehicles. The truth is, if you look at the emissions of a car over its total life, you quickly discover that tailpipe emissions are just the tip of the iceberg.
An 85 kWh electric SUV may not have a tailpipe, but it has an enormous impact on our environment and health. A far greater share of a car’s total emissions comes from the materials and energy required to make it (mining, processing, manufacturing, and disposal) than from its operation. As leading environmental economist and vice chair of the National Academy of Sciences Maureen Cropper notes, “Whether we are talking about a conventional gasoline-powered automobile, an electric vehicle, or a hybrid, most of the damages are actually coming from stages other than just the driving of the vehicle.” If business continues as usual, we could triple the total global pollution generated by automobiles as we go from two billion to six billion vehicles manufactured.
The conclusion from this is straightforward: how we make our cars is actually a bigger environmental issue than how we fuel our cars. We need to dematerialize, that is, dramatically reduce the material and energy required to build cars, and we need to do it now.