GitHub for Data, Open Laptop, Crowdsourced Analysis, and Open Source Scraping
- dat — GitHub-like tool for data, still v. early. It’s overdue. (via Nelson Minar)
- Novena Open Laptop — Bunnie Huang’s laptop goes on sale.
- Crowd Forecasting (NPR) — How is it possible that a group of average citizens doing Google searches in their suburban town homes can outpredict members of the United States intelligence community with access to classified information?
- Portia — open source visual web scraping tool.
Fault-Tolerant Resilient Yadda Yadda, Tour Tips, Punch Cards, and Public Credit
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (PDF) — Berkeley research paper behind Apache Spark. (via Nelson Minar)
- Angular Tour — trivially add tour tips (“This is the widget basket, drag and drop for widget goodness!” type of thing) to your Angular app.
- Punchcard — generate GitHub-style punch card charts “with ease”.
- Where Credit Belongs for Hack (Bryan O’Sullivan) — public credit for individual contributors in a piece of corporate open source is a sign of confidence in your team, that building their public reputation isn’t going to result in them leaving for one of the many job offers they’ll receive. And, of course, of caring for your individual contributors. Kudos Facebook.
Unimaginative Vehicular Connectivity, Data Journalism, VR and Gender, and Open Data Justice
- Connected for a Purpose (Jim Stogdill) — At a recent conference, an executive at a major auto manufacturer described his company’s efforts to digitize their line-up like this: “We’re basically wrapping a two-ton car around an iPad.” Eloquent critique of the Internet of Shallow Things.
- Why Nate Silver Can’t Explain It All — Data extrapolation is a very impressive trick when performed with skill and grace, like ice sculpting or analytical philosophy, but it doesn’t come equipped with the humility we should demand from our writers. Would be a shame for Nate Silver to become Malcolm Gladwell: nice stories but they don’t really hold up.
- Gender and VR (danah boyd) — Although there was variability across the board, biological men were significantly more likely to prioritize motion parallax. Biological women relied more heavily on shape-from-shading. In other words, men are more likely to use the cues that 3D virtual reality systems relied on. Great article, especially notable for the observation that there are more sex hormones on the retina than anywhere else in the body except for the gonads.
- Even The Innocent Should Worry About Sex Offender Apps (Quartz) — And when data becomes compressed by third parties, when it gets flattened out into one single data stream, your present and your past collide with potentially huge ramifications for your future. When it comes to personal data—of any kind—we not only need to consider what it will be used for but how that data will be represented, and what such representation might mean for us and others. Data policies are like justice systems: either you suffer a few innocent people being wrongly condemned (bad uses of open data), or your system permits some wrongdoers to escape (mould grows in the dark).
Game Patterns, What Next, GPU vs CPU, and Privacy with Sensors
- Game Programming Patterns — a book in progress.
- Search for the Next Platform (Fred Wilson) — Mobile is now the last thing. And all of these big tech companies are looking for the next thing to make sure they don’t miss it. And they will pay real money (to you and me) for a call option on the next thing.
- Debunking the 100X GPU vs. CPU Myth — in Pete Warden’s words, “in a lot of real applications any speed gains on the computation side are swamped by the time it takes to transfer data to and from the graphics card.”
- Privacy in Sensor-Driven Human Data Collection (PDF) — see especially the section “Attacks Against Privacy”. More generally, it is often the case that the data released by researchers is not the source of privacy issues, but the unexpected inferences that can be drawn from it. (via Pete Warden)
- Mining the World’s Data by Selling Street Lights and Farm Drones (Quartz) — Depending on what kinds of sensors the light’s owners choose to install, Sensity’s fixtures can track everything from how much power the lights themselves are consuming to movement under the post, ambient light, and temperature. More sophisticated sensors can measure pollution levels, radiation, and particulate matter (for air quality levels). The fixtures can also support sound or video recording. Bring these lights onto city streets and you could isolate the precise location of a gunshot within seconds.
- An Investor’s Guide to Hardware Startups — good to know if you’re thinking of joining one, too.
- WebScaleSQL — a MySQL downstream patchset built for “large scale” (aka Google, Facebook type loads).
Understanding Image Processing, Sharing Data, Fixing Bad Science, and Delightful Dashboard
- 2D Image Post-Processing Techniques and Algorithms (DIY Drones) — understanding how automated image matching and processing tools work means you can also get a better understanding how to shoot your images and what to prevent to get good matches.
- Scientists Need to Learn to Share — despite science’s reputation for rigor, sloppiness is a substantial problem in some fields. You’re much more likely to check your work and follow best data-handling practices when you know someone is going to run your code and parse your data.
- METRICS — Meta-Research Innovation Center at Stanford. John Ioannidis has a posse: connecting researchers into weak science, running conferences, creating a “journal watch”, and engaging policy makers. (says The Economist)
- Grafana — elegant dashboard for Graphite (the realtime data graphing engine).
- brick — uncompressed versions of popular web fonts. The difference between compressed and uncompressed is noticeable.
- micio.js — clever hack to communicate between Arduino and mobile phones via the microphone jack.
- Exponentially Weighted Moving Averages for Go — Go implementation of algorithm useful for dealing with streams of data.
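For readers who haven't met the technique in the last link: an exponentially weighted moving average keeps one running value and folds each new sample into it, so a stream never has to be stored. What follows is a minimal sketch of the general idea, not the linked library's API. The `EWMA` type, its methods, and the smoothing factor `alpha` are hypothetical names chosen for illustration, and seeding the average with the first sample is an assumption (other implementations warm up differently).

```go
package main

import "fmt"

// EWMA is a hypothetical illustration type, not the linked library's API.
// It tracks an exponentially weighted moving average of a stream.
type EWMA struct {
	alpha  float64 // smoothing factor in (0, 1]; higher weights recent samples more
	value  float64 // current average
	primed bool    // false until the first sample has been seen
}

// NewEWMA returns an average with the given smoothing factor.
func NewEWMA(alpha float64) *EWMA {
	return &EWMA{alpha: alpha}
}

// Add folds one sample into the average.
// Assumption: the first sample seeds the average directly.
func (e *EWMA) Add(x float64) {
	if !e.primed {
		e.value = x
		e.primed = true
		return
	}
	e.value = e.alpha*x + (1-e.alpha)*e.value
}

// Value returns the current average (zero before any sample is added).
func (e *EWMA) Value() float64 {
	return e.value
}

func main() {
	avg := NewEWMA(0.5)
	for _, x := range []float64{10, 20, 30} {
		avg.Add(x)
		fmt.Printf("sample=%v average=%v\n", x, avg.Value())
	}
}
```

The appeal for streams is constant memory: only `value` is kept, and `alpha` decides how quickly old samples fade.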
Super Gamers, Game Developers, Erlang+LLVM, and Git Visualised
- Meet the Super-Taskers (Psychology Today) — As part of the Nissan GT Academy challenge, the top 10 players of the car-racing game Gran Turismo are given the chance to race real automobiles in competition. They’re very good—too good, in fact. A graduate racing a real car in the British GT in 2012 was so fast that he could keep up with the professionals in what was supposed to be an amateur event. In 2013, GT Academy graduates were banned from such races in the UK. Instead, they have to compete against the pros.
- A View of Game Developers From The Future (Ian Bogost) — A new arms race commenced—for virtual attention, which the Patrons converted into financial instrument. While historians agree that ancient works like Civilization and chess still provided inspiration, games primarily became a specialized form of banking. As long as there has been advertising, there has been an attention economy: you advertise where people pay attention—whether it’s on the walls of buildings or above urinals.
- ErLLVM — providing multiple back ends for the High Performance Erlang (HiPE) with the use of the LLVM infrastructure. Making the very-lightweight-multithreading Erlang less of a closed world fruitcake deployment can only be good.
- Explain Git with D3 (GitHub) — visualisations of common git operations.
Google Flu, Embeddable JS, Data Analysis, and Belief in the Browser
- The Parable of Google Flu (PDF) — We explore two issues that contributed to [Google Flu Trends]’s mistakes—big data hubris and algorithm dynamics—and offer lessons for moving forward in the big data age. Overtrained and underfed?
- Principles of Good Data Analysis (Greg Reda) — Once you’ve settled on your approach and data sources, you need to make sure you understand how the data was generated or captured, especially if you are using your own company’s data. Treble so if you are using data you snaffled off the net, riddled with collection bias and untold omissions. (via Stijn Debrouwere)
PHP++, Planning, Bitcoin, and Concurrency
- Hack — PHP with types, generics, collections, lambdas. From Facebook.
- Solve Hard Things Early — Build great habits around communication and decision-making when everyone still knows each other well.
- Marginally Useful (Paul Ford) — The last two decades have suggested a post-scarcity economy, where infinite copies of attractive digital things have a price approaching $0. Maybe that was merely a passing moment that we will look back upon with wonder once limited coins enforce scarcity—once the owner of a piece of digital art can look upon it with satisfaction and know with total, cryptographic certainty that because he paid for it, it belongs to him and no one else.
- Go Pipelines and Cancellation — Go’s fascinating me, as an example of a language designed for concurrency and syntactic familiarity.
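To give a flavour of the pipelines-and-cancellation idea in the last link, here is a small sketch, assuming the common pattern of a shared `done` channel that every stage watches. The stage names `generate` and `square` and the helper `firstN` are hypothetical, picked for this example rather than taken from the linked article.

```go
package main

import "fmt"

// generate emits nums on a channel, stopping early if done is closed.
func generate(done <-chan struct{}, nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			select {
			case out <- n:
			case <-done:
				return // downstream gave up; stop sending
			}
		}
	}()
	return out
}

// square reads from in and emits squares, stopping early if done is closed.
func square(done <-chan struct{}, in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			select {
			case out <- n * n:
			case <-done:
				return
			}
		}
	}()
	return out
}

// firstN runs the pipeline and keeps at most n results (n is assumed positive).
// Closing done on return tells every stage still running to exit.
func firstN(n int, nums ...int) []int {
	done := make(chan struct{})
	defer close(done)

	var got []int
	for v := range square(done, generate(done, nums...)) {
		got = append(got, v)
		if len(got) >= n {
			break
		}
	}
	return got
}

func main() {
	fmt.Println(firstN(2, 1, 2, 3, 4))
}
```

Without the `done` channel, breaking out of the loop early would leave both stage goroutines blocked forever on a send nobody receives. Closing one channel is a cheap way to broadcast "stop" to any number of stages.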