- Pervasive Monitoring is an Attack (Tim Bray) — if your application doesn’t support privacy, that’s probably a bug in your application.
- Reconciling Mozilla’s Mission and the W3C EME — essentially, “we don’t want to put a closed source bolus of evil into our open source unicorn, but you won’t be able to watch House of Cards with Firefox if we don’t.”
- The Financial Future of Game Developers (Raph Koster) — Today, a console is really just a hardware front end to a digital publisher/distribution network/storefront. […] Any structure that depends solely on blockbusters is not long for this world, because there is a significant component of luck in what drives popularity, so every release is literally a gamble. […] The median game uploaded to the App Store makes zero dollars. It starts great and just gets better. Koster is on fire! He scores again! GOOOOOOOOOOOOOOOAL!
- Notes on Distributed Systems for Young Bloods — “It’s slow” is the hardest problem you’ll ever debug.
Time Series Database, Cluster Schedulers, Structural Search-and-Replace, and TV Data
- Influx DB — open-source, distributed, time series, events, and metrics database with no external dependencies.
- Omega (PDF) — ﬂexible, scalable schedulers for large compute clusters. From Google Research.
- Amazon Mines Its Data Trove To Bet on TV’s Next Hit (WSJ) — Amazon produced about 20 pages of data detailing, among other things, how much a pilot was viewed, how many users gave it a 5-star rating and how many shared it with friends.
Amen Break, MySQL Scale, Spooky Source, and Graph Analytics Engine
- The Amen Break (YouTube) — fascinating 20m history of the amen break, a handful of bars of drum solo from a forgotten 1969 song which became the origin of a huge amount of popular music from rap to jungle and commercials, and the contested materials at the heart of sample-based music. Remix it and weep. (via Beta Knowledge)
- The MySQL Ecosystem at Scale (PDF) — nice summary of how MySQL is used on massive users, and where the sweet spots have been found.
- Lab41 (Github) — open sourced code from a spook hacklab in Silicon Valley.
- Fanulus — open sourced Hadoop-based graph analytics engine for analyzing graphs represented across a multi-machine compute cluster. A breadth-first version of the graph traversal language Gremlin operates on graphs stored in the distributed graph database Titan, in any Rexster-fronted graph database, or in HDFS via various text and binary formats.
Audio Visualization, 3D Printed Toys, Data Center Computing, and Downloding Not Yet Beaten
- github realtime activity — audio triggered by github activity, built with choir.io.
- Makies Hit Shelves at Selfridges — 3d printing business gaining mainstream distribution. Win!
- The Datacenter as Computer — we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. We hope it will be useful to architects and programmers of today’s WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today’s WSCs on a single board. (via Mike Loukides)
- Illegal Downloads Not Erased By Simultaneous Release — Data gathered by TorrentFreak throughout the day reveals that most early downloaders, a massive 16.1%, come from Australia. Down Under the show aired on the pay TV network Foxtel, but it appears that many Aussies prefer to download a copy instead. The same is true for the United States and Canada, with 16% and 9.6% of the total downloads respectively, despite the legal offerings. Unclear whether this represents greater or less downloading than would have happened without simultaneous release.
DEFCON Doco, Global-Scale Networks, Media Goblin, and TCP/IP Legos
- DEFCON Documentary — free download, I’m looking forward to watching it on the flight back to NZ.
- Global-Scale Systems — botnets as example of the scale of networks and systems we’ll have to build but don’t have experience in.
- MediaGoblin — GNU project to build a decentralized alternative to Flickr, YouTube, SoundCloud, etc.
- Teaching TCP/IP Headers with Legos — genius. (via BoingBoing)
A new look at Yahoo's traffic, the challenge of scaling Tumblr, and a host of visualization guidelines.
In this week's data news: Yahoo visualizes its front page traffic and demographics, why Tumblr is tougher to scale than Twitter, and a look at what you need to consider as you build visualizations.