ENTRIES TAGGED "data"

Four short links: 16 December 2013

Four short links: 16 December 2013

Data Pipeline, Data Driven Education, Crowdsourced Proofreading, and 3D Printed Shoes

  1. Suro (Github) — Netflix data pipeline service for large volumes of event data. (via Ben Lorica)
  2. NIPS Workshop on Data Driven Education — lots of research papers around machine learning, MOOC data, etc.
  3. Proofist — crowdsourced proofreading game.
  4. 3D-Printed Shoes (YouTube) — LeWeb talk from founder of the company, Continuum Fashion). (via Brady Forrest)
Comment
Four short links: 12 December 2013

Four short links: 12 December 2013

Bluetooth LE, Keyboard Design, Dataset API, and State Machines

  1. iBeacons — Bluetooth LE enabling tighter coupling of physical world with digital. I’m enamoured with the interaction possibilities: The latest Apple TV software brought a fantastically clever workaround. You just tap your iPhone to the Apple TV itself, and it passes your Wi-Fi and iTunes credentials over and sets everything up instantaneously.
  2. Better and Better Keyboards (Jesse Vincent) — It suffered from the same problem as every other 3D-printed keyboard I’d made to date – When I showed it to someone, they got really excited about the fact that I had a 3D printer. In contrast, whenever I showed someone one of the layered acrylic prototype keyboards I’d built, they got excited about the keyboard.
  3. Bamboo.io — open source modular web service for dataset storage and retrieval.
  4. state.jsOpen source JavaScript state machine supporting most UML 2 features.
Comment

Design, Math, and Data

Lessons from the design community for developing data-driven applications

By Dean Malmgren

When you hear someone say, “that is a nice infographic” or “check out this sweet dashboard,” many people infer that they are “well-designed.” Creating accessible (or for the cynical, “pretty”) content is only part of what makes good design powerful. The design process is geared toward solving specific problems. This process has been formalized in many ways (e.g., IDEO’s Human Centered Design, Marc Hassenzahl’s User Experience Design, or Braden Kowitz’s Story-Centered Design), but the basic idea is that you have to explore the breadth of the possible before you can isolate truly innovative ideas. We, at Datascope Analytics, argue that the same is true of designing effective data science tools, dashboards, engines, etc — in order to design effective dashboards, you must know what is possible.

Read more…

Comment
Four short links: 4 December 2013

Four short links: 4 December 2013

Zombie Drones, Algebra Through Code, Data Toolkit, and Crowdsourcing Antibiotic Discovery

  1. Skyjack — drone that takes over other drones. Welcome to the Malware of Things.
  2. Bootstrap Worlda curricular module for students ages 12-16, which teaches algebraic and geometric concepts through computer programming. (via Esther Wojicki)
  3. Harvestopen source BSD-licensed toolkit for building web applications for integrating, discovering, and reporting data. Designed for biomedical data first. (via Mozilla Science Lab)
  4. Project ILIAD — crowdsourced antibiotic discovery.
Comment: 1
Four short links: 28 November 2013

Four short links: 28 November 2013

Data Tool, Arduino-like Board, Learn to Code via Videogames, and Creative Commons 4.0 Out

  1. OpenRefine — (edited: 7 Dec 2013) Google abandoned Google bought Freebase’s GridWorks, turned it into the excellent Refine tool for working with data sets, now picked up and developed by open source community.
  2. Intel’s Arduino-Compatible Board — launched at MakerFaire Rome. (via Wired UK)
  3. Game Maven — learn to code by writing casual videogames. (via Greg Linden)
  4. CC 4.0 OutThe 4.0 licenses are extremely well-suited for use by governments and publishers of public sector information and other data, especially for those in the European Union. This is due to the expansion in license scope, which now covers sui generis database rights that exist there and in a handful of other countries.
Comment: 1

23andMe flap at FDA indicates fundamental dilemma in health reform

We must go beyond hype for incentives to provide data to researchers

The FDA order stopping 23andM3 from offering its genetic test kit strikes right into the heart of the major issue in health care reform: the tension between individual care and collective benefit. Health is not an individual matter. As I will show, we need each other. And beyond narrow regulatory questions, the 23andMe issue opens up the whole goal of information sharing and the funding of health care reform.

Read more…

Comments: 2

Handling Data at a New Particle Accelerator

Unlocking Scientific Data with Python

dust_accelerator

Most people working on complex software systems have had That Moment, when you throw up your hands and say “If only we could start from scratch!” Generally, it’s not possible. But every now and then, the chance comes along to build a really exciting project from the ground up.

In 2011, I had the chance to participate in just such a project: the acquisition, archiving and database systems which power a brand-new hypervelocity dust accelerator at the University of Colorado.

Read more…

Comment
Four short links: 6 November 2013

Four short links: 6 November 2013

Warrant Canary, Polluted Statistics, Dollars for Deathbots, and Protocol Madness

  1. Apple Transparency Report (PDF) — contains a warrant canary, the statement Apple has never received an order under Section 215 of the USA Patriot Act. We would expect to challenge an order if served on us which will of course be removed if one of the secret orders is received. Bravo, Apple, for implementing a clever hack to route around excessive secrecy. (via Boing Boing)
  2. You’re Probably Polluting Your Statistics More Than You Think — it is insanely easy to find phantom correlations in random data without obviously being foolish. Anyone who thinks it’s possible to draw truthful conclusions from data analysis without really learning statistics needs to read this. (via Stijn Debrouwere)
  3. CyPhy Funded (Quartz) — the second act of iRobot co-founder Helen Greiner, maker of the famed Roomba robot vacuum cleaner. She terrified ETech long ago—the audience were expecting Roomba cuteness and got a keynote about military deathbots. It would appear she’s still in the deathbot niche, not so much with the cute. Remember this when you build your OpenCV-powered recoil-resistant load-bearing-hoverbot and think it’ll only ever be used for the intended purpose of launching fertiliser pellets into third world hemp farms.
  4. User-Agent String History — a light-hearted illustration of why the formal semantic value of free-text fields is driven to zero in the face of actual use.
Comments: 3
Four short links: 5 November 2013

Four short links: 5 November 2013

Time Series Database, Cluster Schedulers, Structural Search-and-Replace, and TV Data

  1. Influx DBopen-source, distributed, time series, events, and metrics database with no external dependencies.
  2. Omega (PDF) — flexible, scalable schedulers for large compute clusters. From Google Research.
  3. GraspJSSearch and replace your JavaScript code based on its structure rather than its text.
  4. Amazon Mines Its Data Trove To Bet on TV’s Next Hit (WSJ) — Amazon produced about 20 pages of data detailing, among other things, how much a pilot was viewed, how many users gave it a 5-star rating and how many shared it with friends.
Comment: 1
Four short links: 28 October 2013

Four short links: 28 October 2013

The Internot of Things, Explainy Learning, Medical Microcontroller Board, and Coder Sutra

  1. A Cyber Attack Against Israel Shut Down a RoadThe hackers targeted the Tunnels’ camera system which put the roadway into an immediate lockdown mode, shutting it down for twenty minutes. The next day the attackers managed to break in for even longer during the heavy morning rush hour, shutting the entire system for eight hours. Because all that is digital melts into code, and code is an unsolved problem.
  2. Random Decision Forests (PDF) — “Due to the nature of the algorithm, most Random Decision Forest implementations provide an extraordinary amount of information about the final state of the classifier and how it derived from the training data.” (via Greg Borenstein)
  3. BITalino — 149 Euro microcontroller board full of physiological sensors: muscles, skin conductivity, light, acceleration, and heartbeat. A platform for healthcare hardware hacking?
  4. How to Be a Programmer — a braindump from a guru.
Comment