- EtherCalc — open source web-based spreadsheet.
- Dynamics of Correlated Novelties (Nature) — paper on “the adjacent possible”. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological, or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya’s urn, predicts statistical laws for the rate at which novelties happen (Heaps’ law) and for the probability distribution on the space explored (Zipf’s law), as well as signatures of the process by which one novelty sets the stage for another. (via Steven Strogatz)
- On The Media Interview with OKCupid CEO — relevant to the debate on ethics of A/B tests. I preferred this to Tim Carmody’s rant.
- CRDTs as Alternative to APIs — when using CRDTs to tie your system together, you don’t need to resort to using impoverished representations that simply never come anywhere near the representational power of the data structures you use in your programs at runtime. See also this paper on Convergent and Commutative Replicated Data Types.
Lessons from the design community for developing data-driven applications
When you hear someone say, “that is a nice infographic” or “check out this sweet dashboard,” many people infer that they are “well-designed.” Creating accessible (or for the cynical, “pretty”) content is only part of what makes good design powerful. The design process is geared toward solving specific problems. This process has been formalized in many ways (e.g., IDEO’s Human Centered Design, Marc Hassenzahl’s User Experience Design, or Braden Kowitz’s Story-Centered Design), but the basic idea is that you have to explore the breadth of the possible before you can isolate truly innovative ideas. We, at Datascope Analytics, argue that the same is true of designing effective data science tools, dashboards, engines, etc — in order to design effective dashboards, you must know what is possible.
Zombie Drones, Algebra Through Code, Data Toolkit, and Crowdsourcing Antibiotic Discovery
- Skyjack — drone that takes over other drones. Welcome to the Malware of Things.
- Bootstrap World — a curricular module for students ages 12-16, which teaches algebraic and geometric concepts through computer programming. (via Esther Wojicki)
- Harvest — open source BSD-licensed toolkit for building web applications for integrating, discovering, and reporting data. Designed for biomedical data first. (via Mozilla Science Lab)
- Project ILIAD — crowdsourced antibiotic discovery.
Coding for Unreliability, AirBnB JS Style, Category Theory, and Text Processing
- Quantitative Reliability of Programs That Execute on Unreliable Hardware (MIT) — As MIT’s press release put it: Rely simply steps through the intermediate representation, folding the probability that each instruction will yield the right answer into an estimation of the overall variability of the program’s output. (via Pete Warden)
- Category Theory for Scientists (MIT Courseware) — Scooby snacks for rationalists.
- Textblob — Python open source text processing library with sentiment analysis, PoS tagging, term extraction, and more.