ENTRIES TAGGED "data science"

A different take on data skepticism

Our tools should make common cases easy and safe, but that's not the reality today.

Recently, the Mathbabe (aka Cathy O’Neil) vented some frustration about the pitfalls in applying even simple machine learning (ML) methods like k-nearest neighbors. As data science is democratized, she worries that naive practitioners will shoot themselves in the foot because these tools can offer very misleading results. Maybe data science is best left to the pros? Mike…

Data skepticism

If data scientists aren't skeptical about how they use and analyze data, who will be?

A couple of months ago, I wrote that “big data” is heading toward the trough of a hype curve as a result of oversized hype and promises. That’s certainly true. I see more expressions of skepticism about the value of data every day. Some of the skepticism is a reaction against the hype; a lot of it arises…

Hacking robotic arms, predicting flight arrival times, manufacturing in America, tracking Disney customers (industrial Internet links)

The next wave of manufacturing will be highly automated--and American. Also, a hardware hacking collective rehabilitated a pair of cast-off industrial robots.

Flight Quest (GE, powered by Kaggle) — Last November GE, Alaska Airlines, and Kaggle announced the Flight Quest competition, which invites data scientists to build models that can accurately predict when a commercial airline flight touches down and reaches its gate. Since the leaderboard for the competition was activated on December 18, 2012, entrants have already beaten the…

Need speed for big data? Think in-memory data management

We're launching an investigation into in-memory data technologies.

By Ben Lorica and Roger Magoulas

In a forthcoming report we will highlight technologies and solutions that take advantage of the decline in prices of RAM, the popularity of distributed and cloud computing systems, and the need for faster queries on large, distributed data stores. Established technology companies have had interesting offerings, but what initially caught our attention…

Four short links: 1 January 2013

Silicon Beats Meat, Workers against Machines, Quora Design Notes, and Free Data Science Books

  1. Robots Will Take Our Jobs (Wired) — I agree with Kevin Kelly that (in my words) software and hardware are eating wetware, but disagree that “This is not a race against the machines. If we race against them, we lose. This is a race with the machines. You’ll be paid in the future based on how well you work with robots. Ninety percent of your coworkers will be unseen machines. Most of what you do will not be possible without them. And there will be a blurry line between what you do and what they do. You might no longer think of it as a job, at least at first, because anything that seems like drudgery will be done by robots.” Civilizations which depend on specialization reward work and penalize idleness. We already have more people than work for them, and if we’re not to be creating a vast disconnected former workforce then we (society) need to get a hell of a lot better at creating jobs and not destroying them.
  2. Why Workers are Losing the War Against Machines (The Atlantic) — There is no economic law that says that everyone, or even most people, automatically benefit from technological progress.
  3. Early Quora Design Notes — I love reading post-mortems and learning from what other people did. Picking a starting point is important because it will be the axis the rest of the design revolves around — but it’s tricky and not always the first page in the flow. Ideally, you should start with the page that serves the most significant goals of the product.
  4. Free Data Science Books — I don’t mean free as in some guy paid for a PDF version of an O’Reilly book and then posted it online for others to use/steal, but I mean genuine published books with a free online version sanctioned by the publisher. That is, “the publisher has graciously agreed to allow a full, free version of my book to be available on this site.” (via Stein Debrouwere)

New data competition tackles airline delays

Airlines face a very costly data problem. A new competition looks to crack it.

The scenario is familiar: a flight leaves the gate in New York on time, sits in…

Solving the Wanamaker problem for health care

Data science and technology give us the tools to revolutionize health care. Now we have to put them to use.

By Tim O’Reilly, Julie Steele, Mike Loukides and Colin Hill

“The best minds of my generation are thinking about how to make people click ads.” — Jeff Hammerbacher, early Facebook employee

“Work on stuff that matters.” — Tim O’Reilly

In the early days of the 20th century, department…

A grisly job for data scientists

Matching the missing to the dead involves reconciling two national databases.

Javier Reveron went missing from Ohio in 2004. His wallet turned up in New York City, but he was nowhere to be found. By the time his parents arrived…

StrataRx: Data science and health(care)

A call for data scientists, technologists, health professionals, and business leaders to convene.

By Mike Loukides and Jim Stogdill

We are launching a conference at the intersection of health, health care, and data. Why? Our health care system is in crisis. We are experiencing epidemic levels of obesity, diabetes, and other preventable conditions while at the same time…

Data Jujitsu: The art of turning data into product

Smart data scientists can make big problems small.

Having worked in academia, government and industry, I’ve had a unique opportunity to build products in each sector. Much of this product development has been around building data products. Just as methods for general product development have steadily improved, so have the ideas for developing data products. Thanks to large investments in the general area of…