Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.
The Dark Market for Personal Data (NYTimes) — you can buy lists of victims of sexual assault, of impulse buyers, of people with sexually transmitted diseases, etc. The cost of a false positive when those lists are used for marketing is less than the cost of a false positive when banks use the lists to decide whether you’re a credit risk. The lists fall between the cracks in privacy legislation; essentially, the compilation and use of lists of people are unregulated territory.
Collaborative Filtering at LinkedIn (PDF) — This paper presents LinkedIn’s horizontal collaborative filtering infrastructure, known as browsemaps. Great lessons learned, including: context and presentation of browsemaps or any recommendation is paramount for a truly relevant user experience. That is, design and presentation represent the largest ROI, with data engineering second and algorithms last. (via Greg Linden)
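For readers who haven't met the idea, a browsemap is roughly a "people who viewed this also viewed" list. The following is a minimal sketch of that idea using plain co-occurrence counts, not LinkedIn's actual infrastructure; the function name `build_browsemap`, the `sessions` data, and the `top_k` parameter are all made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_browsemap(sessions, top_k=3):
    """Count how often two items are viewed in the same session and
    return, per item, its top_k most co-viewed items."""
    co_counts = defaultdict(Counter)
    for viewed in sessions:
        # Deduplicate so repeat views within one session count once.
        for a, b in combinations(sorted(set(viewed)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return {
        item: [other for other, _ in counts.most_common(top_k)]
        for item, counts in co_counts.items()
    }


# Hypothetical browsing sessions: each list is the profiles one member viewed.
sessions = [
    ["alice", "bob", "carol"],
    ["alice", "bob"],
    ["bob", "dave"],
]
browsemap = build_browsemap(sessions)
```

Here `browsemap["alice"]` ranks "bob" first because the two profiles were co-viewed in two sessions. In line with the paper's lesson, how such a list is presented probably matters more than the counting itself.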
Solar Hits Parity in 10 States, 47 by 2016 (Bloomberg) — The reason solar-power generation will increasingly dominate: it’s a technology, not a fuel. As such, efficiency increases and prices fall as time goes on. The price of Earth’s limited fossil fuels tends to go the other direction.
Facebook’s Top Open Data Problems (Facebook Research) — even if you’re not interested in Facebook’s Very First World Problems, this is full of factoids like: Facebook’s social graph store TAO, for example, provides access to tens of petabytes of data, but answers most queries by checking a single page in a single machine. (via Greg Linden)
LittleBits Adds Functionality (MakeZine) — That next big idea might come from one of the latest bits in the littleBits catalog, the cloudBit. The piece enables wi-fi control of your circuit in various configurations — from the Internet to the bit, from the bit to the internet, or from bit to bit.
Big Data’s Big Ideas (Ben Lorica) — this is a lot of what’s on the O’Reilly radar at the moment. Excellent short summary, with links.
Rodney Brooks and Robotics (Boston Magazine) — [The robot] Baxter’s LCD eyes will look at the spot where it’s about to reach, making its movements, from a human perspective, more predictable. “If you want a machine to be able to interact with people,” Brooks says, “it better not do things that are surprising to people.”
FUZIX — new open source OS from Alan Cox. Runs on Z80s, mostly runs on 6502s, and in theory if it’s got 8 bits and banked RAM you can probably run Fuzix OS on it. (via Alan Cox)
A Critique of the Balancing Metaphor in Privacy and Security — The arguments presented by this paper are built on two underlying assertions. The first is that the assessment of surveillance measures often entails a judgement of whether any loss in privacy is legitimised by a justifiable increase in security. However, one fundamental difference between privacy and security is that privacy has two attainable end-states (absolute privacy through to the absolute absence of privacy), whereas security has only one attainable end-state (while the absolute absence of security is attainable, absolute security is a desired yet unobtainable goal). The second assertion, which builds upon the first, holds that because absolute security is desirable, new security interventions will continuously be developed, each potentially trading a small measure of privacy for a small rise in security. When assessed individually each intervention may constitute a justifiable trade-off. However, when combined together, these interventions will ultimately reduce privacy to zero. (via Alistair Croll)
ISP Interconnection and its Impact on Consumer Internet Performance (Measurement Lab) — In researching our report, we found clear evidence that interconnection between major U.S. access ISPs (AT&T, Comcast, CenturyLink, Time Warner Cable, and Verizon) and transit ISPs Cogent, Level 3, and potentially XO was correlated directly with degraded consumer performance throughout 2013 and into 2014 (in some cases, ongoing as of publication). Degraded performance was most pronounced during peak use hours, which points to insufficient capacity and congestion as a causal factor. Further, by noting patterns of performance degradation for access/transit ISP pairs that were synchronized across locations, we were able to conclude that in many cases degradation was not the result of major infrastructure failures at any specific point in a network, but rather connected with the business relationships between ISPs.
TweetNLP — CMU open source natural language parsing tools for making sense of Tweets.
Interview with Google X Life Science’s Head (Medium) — I will have been here two years this March. In nineteen months we have been able to hire more than a hundred scientists to work on this. We’ve been able to build customized labs and get the equipment to make nanoparticles and decorate them and functionalize them. We’ve been able to strike up collaborations with MIT and Stanford and Duke. We’ve been able to initiate protocols and partnerships with companies like Novartis. We’ve been able to initiate trials like the baseline trial. This would be a good decade somewhere else. The power of focus and money.
Schooloscope Open Data Post-Mortem — The case of Schooloscope and the wider question of public access to school data challenges the belief that sunlight is the best disinfectant, that government transparency would always lead to better government, better results. It challenges the sentiments that see data as value-neutral and its representation as devoid of politics. In fact, access to school data exposes a sharp contrast between the private interest of the family (best education for my child) and the public interest of the government (best education for all citizens).
M-Lab Observatory — explorable data on the data experience (RTT, upload speed, etc) across different ISPs in different geographies over time.
Build Quality In — an e-book collection of Continuous Delivery and DevOps experience reports from the wild. Work in progress, and a collection of accumulated experience in the new software engineering practices can’t be a bad thing.
Designing for Large-Screen Cellphones (Luke Wroblewski) — In his analysis of 1,333 observations of smartphones in use, Steven Hoober found about 75% of people rely on their thumb and 49% rely on a one-handed grip to get things done on their phones. On large screens (over four inches) those kinds of behaviors can stretch people’s thumbs well past their comfort zone as they try to reach controls positioned at the top of their device. Design advice to create interactions that don’t strain tendons or gray matter.
fastsocket (GitHub) — a highly scalable socket and its underlying networking implementation of the Linux kernel. With the straight linear scalability, Fastsocket can provide extremely good performance in multicore machines.
Content Moderation (Wired) — “content moderators” are the people paid to weed out beheadings, pornography, etc. from photo and video sites. By at least one estimate, the number of content moderators scrubbing the world’s social media sites, mobile apps, and cloud storage services runs to “well over 100,000”—that is, about twice the total head count of Google and nearly 14 times that of Facebook.
PaGMO — Parallel Global Multiobjective Optimizer […] a generalization of the island model paradigm working for global and local optimization algorithms. Its main parallelization approach makes use of multiple threads, but MPI is also implemented and can be mixed in with multithreading. PaGMO can be used to solve in a parallel fashion, global optimization tasks.
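To give a feel for the island model, here is a minimal pure-Python sketch, not PaGMO's API: several populations ("islands") evolve independently and periodically pass their best individual to a neighbour. It runs sequentially rather than in threads or over MPI, and every name in it (`sphere`, `evolve`, `migrate`, `island_model`) and every parameter value is an assumption chosen for illustration.

```python
import random


def sphere(x):
    """Toy objective to minimise: sum of squares, optimum at the origin."""
    return sum(v * v for v in x)


def evolve(population, rng, sigma=0.1):
    """One generation: mutate each individual, keep the child if it is no worse."""
    next_pop = []
    for ind in population:
        child = [v + rng.gauss(0, sigma) for v in ind]
        next_pop.append(child if sphere(child) <= sphere(ind) else ind)
    return next_pop


def migrate(islands):
    """Ring migration: each island's best replaces the next island's worst."""
    bests = [min(pop, key=sphere) for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[i - 1]  # island i receives from island i-1 (wraps around)
        worst = max(range(len(pop)), key=lambda j: sphere(pop[j]))
        pop[worst] = list(incoming)


def island_model(n_islands=4, pop_size=10, dim=3, generations=200,
                 migrate_every=20, seed=0):
    rng = random.Random(seed)
    islands = [
        [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    initial_best = min(sphere(ind) for pop in islands for ind in pop)
    for gen in range(1, generations + 1):
        # In a real parallel setup each island would evolve in its own thread or process.
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            migrate(islands)
    final_best = min(sphere(ind) for pop in islands for ind in pop)
    return initial_best, final_best


initial_best, final_best = island_model()
```

Because each island only touches its own population between migrations, the per-island step is the natural unit to hand to separate threads or processes.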
Avoiding the Tragedy of the Anticommons — Many people talk about “open source biology.” Mike Loukides pulls apart open source and biology to see what the relationship might be. I’m still chewing on what devops for bio would be. Modern software systems throw off gigabytes of data, and we have built tools to monitor those systems, archive their data, and automate much of the analysis. There are free and commercial packages for logging and monitoring, and it continues to be a very active area of software development, as anyone who’s attended O’Reilly’s Velocity conference knows.
peppytides (MakeZine) — 3D-printed, super-accurate, scaled 3D model of a polypeptide chain that can be folded into all the basic protein structures, like α-helices, β-sheets, and β-turns. (via Lenore Edman)
London Data Store — dashboard and open data catalogue for London’s data release efforts.