- Hardening Android for Security and Privacy — a brilliant project! prototype of a secure, full-featured, Android telecommunications device with full Tor support, individual application firewalling, true cell network baseband isolation, and optional ZRTP encrypted voice and video support. ZRTP does run over UDP which is not yet possible to send over Tor, but we are able to send SIP account login and call setup over Tor independently.
- The Great Smartphone War (Vanity Fair) — “I represented [the Swedish telecommunications company] Ericsson, and they couldn’t lie if their lives depended on it, and I represented Samsung and they couldn’t tell the truth if their lives depended on it.” That’s the most catching quote, but interesting to see Samsung’s patent strategy described as copying others, delaying the lawsuits, settling before judgement, and in the meanwhile ramping up their own innovation. Perhaps the other glory part is the description of Samsung employees shredding and eating incriminating documents while stalling lawyers out front. An excellent read.
- socketcluster — highly scalable realtime WebSockets based on Engine.io. They have screenshots of 100k messages/second on an 8-core EC2 m3.2xlarge instance.
- Machine Learning on a Board — everything good becomes hardware, whether in GPUs or specialist CPUs. This one has a “Machine Learning Co-Processor”. Interesting idea, to package up inputs and outputs with specialist CPU, but I wonder whether it’s a solution in search of a problem. (via Pete Warden)
Time Series Database, Cluster Schedulers, Structural Search-and-Replace, and TV Data
- InfluxDB — open-source, distributed, time series, events, and metrics database with no external dependencies.
- Omega (PDF) — flexible, scalable schedulers for large compute clusters. From Google Research.
- Amazon Mines Its Data Trove To Bet on TV’s Next Hit (WSJ) — Amazon produced about 20 pages of data detailing, among other things, how much a pilot was viewed, how many users gave it a 5-star rating and how many shared it with friends.
Amen Break, MySQL Scale, Spooky Source, and Graph Analytics Engine
- The Amen Break (YouTube) — fascinating 20m history of the amen break, a handful of bars of drum solo from a forgotten 1969 song which became the origin of a huge amount of popular music from rap to jungle and commercials, and the contested materials at the heart of sample-based music. Remix it and weep. (via Beta Knowledge)
- The MySQL Ecosystem at Scale (PDF) — nice summary of how MySQL is used by massive users, and where the sweet spots have been found.
- Lab41 (GitHub) — open sourced code from a spook hacklab in Silicon Valley.
- Faunus — open sourced Hadoop-based graph analytics engine for analyzing graphs represented across a multi-machine compute cluster. A breadth-first version of the graph traversal language Gremlin operates on graphs stored in the distributed graph database Titan, in any Rexster-fronted graph database, or in HDFS via various text and binary formats.
Audio Visualization, 3D Printed Toys, Data Center Computing, and Downloading Not Yet Beaten
- GitHub realtime activity — audio triggered by GitHub activity, built with choir.io.
- Makies Hit Shelves at Selfridges — 3D printing business gaining mainstream distribution. Win!
- The Datacenter as Computer — we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. We hope it will be useful to architects and programmers of today’s WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today’s WSCs on a single board. (via Mike Loukides)
- Illegal Downloads Not Erased By Simultaneous Release — Data gathered by TorrentFreak throughout the day reveals that most early downloaders, a massive 16.1%, come from Australia. Down Under the show aired on the pay TV network Foxtel, but it appears that many Aussies prefer to download a copy instead. The same is true for the United States and Canada, with 16% and 9.6% of the total downloads respectively, despite the legal offerings. Unclear whether this represents greater or less downloading than would have happened without simultaneous release.
DEFCON Doco, Global-Scale Networks, Media Goblin, and TCP/IP Legos
- DEFCON Documentary — free download, I’m looking forward to watching it on the flight back to NZ.
- Global-Scale Systems — botnets as example of the scale of networks and systems we’ll have to build but don’t have experience in.
- MediaGoblin — GNU project to build a decentralized alternative to Flickr, YouTube, SoundCloud, etc.
- Teaching TCP/IP Headers with Legos — genius. (via BoingBoing)
A new look at Yahoo's traffic, the challenge of scaling Tumblr, and a host of visualization guidelines.
In this week's data news: Yahoo visualizes its front page traffic and demographics, why Tumblr is tougher to scale than Twitter, and a look at what you need to consider as you build visualizations.