- Google Dev Apologies After Photos App Tags Black People as Gorillas (Ars Technica) — this is how you recover from a unequivocally horrendous mistake.
- IRS Finally Agrees to Release Non-Profit Records (BoingBoing) — Today, the IRS released a statement saying they’re going to do what we’ve been hoping for, saying they are going to release e-file data and this is a “priority for the IRS.” Only took $217,000 in billable lawyer hours (pro bono, thank goodness) to get there.
- Time Series Database Requirements — classic paper, laying out why time-series databases are so damn weird. Their access patterns are so unique because of the way data is over-gathered and pushed ASAP to the store. It’s mostly recent, mostly never useful, and mostly needed in order. (via Thoughts on Time-Series Databases)
- Compiler Errors for Humans — it’s so important, and generally underbaked in languages. A decade or more ago, I was appalled by Python’s errors after Perl’s very useful messages. Today, appreciating Go’s generally handy errors. How a system handles the operational failures that will inevitably occur is part and parcel of its UX.
"open data" entries
Personal wellness data should be shared as freely as water and air.
Register for the free webcast, “Life Streams, Walled Gardens, and the Internet of Living Things.” Brigitte Piniewski and Hagen Finley will discuss the Internet of Living Things, what makes sensoring and monitoring data emanating from our bodies unique, and why we should elect to participate in this seemingly Orwellian mistake of open-sourcing our personal health data.
We are at a threshold in the history of personal data. Sensors and apps are making it possible to generate digital data signatures of important aspects of healthy living, such as movement, nutrition, and sleep. However, we are rapidly losing the opportunity to erect a Linux-like open “living-well” data system steeped in open commons principles. We can either join together to ensure enlightened open source and crowdsourced discovery practices become the norm for our living-well data footprints, or we can passively allow this data to be sequestered into one of the walled gardens offered by health systems, funded research, or big business.
Why this is important?
Living-well data provides the map by which vast amounts of preventable human suffering can be prevented. Everyone can benefit from the health journeys of those who lived before us because our modern societies are no longer “accidentally well.” Decades ago, parents had no need to question the nutrition a child was offered or concern themselves with how much activity a child engaged in. No deliberate use of devices was needed to track these important health contributors. Reasonable access to whole foods (farm foods) and reasonable amounts of activity were provided, as it were, by default — in other words, by accident. This resulted in remarkably low rates of chronic disease. Today, communities cannot take those healthy choices for granted — we are no longer accidentally well. Read more…
If you really want to understand the effect data is having, you need the models.
Writing my post about AI and summoning the demon led me to re-read a number of articles on Cathy O’Neil’s excellent mathbabe blog. I highlighted a point Cathy has made consistently: if you’re not careful, modelling has a nasty way of enshrining prejudice with a veneer of “science” and “math.”
Cathy has consistently made another point that’s a corollary of her argument about enshrining prejudice. At O’Reilly, we talk a lot about open data. But it’s not just the data that has to be open: it’s also the models. (There are too many must-read articles on Cathy’s blog to link to; you’ll have to find the rest on your own.)
You can have all the crime data you want, all the real estate data you want, all the student performance data you want, all the medical data you want, but if you don’t know what models are being used to generate results, you don’t have much. Read more…
Recreation.gov should be a platform, not a silo.
President Obama’s well-publicized national open data policy (pdf) makes it clear that government data is a valuable public resource for which the government should be making efforts to maximize access and use. This policy was based on lessons from previous government open data success stories, such as weather data and GPS, which form the basis for countless commercial services that we take for granted today and that deliver enormous value to society. (You can see an impressive list of companies reliant on open government data via GovLab’s Open Data 500 project.)
Based on this open data policy, I’ve been encouraging entrepreneurs to invest their time and ingenuity to explore entrepreneurial opportunities based on government data. I’ve even invested (through O’Reilly AlphaTech Ventures) in one such start-up, Hipcamp, which provides user-friendly interfaces to making reservations at national and state parks.
A better system is sorely needed. The current reservation system is clunky and difficult to use. Hipcamp changes all that, making it a breeze to reserve camping spots. Read more…