SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
Living Light — 3D printed cephalopods filled with bioluminescent bacteria. PAGING CORY DOCTOROW, YOUR ORGASMATRON HAS ARRIVED. (via Sci Blogs)
Repacking Lego Batteries with a CNC Mill — check out the video. Patrick programmed a CNC machine to drill out the rivets holding the Mindstorms battery pack together. Coding away a repetitive task like this is gorgeous to see at every scale. We don’t have to teach our kids a particular programming language, but they should know how to automate cruft.
My Thoughts on Google+ (YouTube) — when your fans make hatey videos like this one protesting Google putting the pig of Google Plus onto the lipstick that was YouTube, you are Doin’ It Wrong.
The Amen Break (YouTube) — fascinating 20m history of the amen break, a handful of bars of drum solo from a forgotten 1969 song which became the origin of a huge amount of popular music from rap to jungle and commercials, and the contested materials at the heart of sample-based music. Remix it and weep. (via Beta Knowledge)
Lab41 (Github) — open sourced code from a spook hacklab in Silicon Valley.
Fanulus — open sourced Hadoop-based graph analytics engine for analyzing graphs represented across a multi-machine compute cluster. A breadth-first version of the graph traversal language Gremlin operates on graphs stored in the distributed graph database Titan, in any Rexster-fronted graph database, or in HDFS via various text and binary formats.
Researchers Can Slip an Undetectable Trojan into Intel’s Ivy Bridge CPUs (Ars Technica) — The exploit works by severely reducing the amount of entropy the RNG normally uses, from 128 bits to 32 bits. The hack is similar to stacking a deck of cards during a game of Bridge. Keys generated with an altered chip would be so predictable an adversary could guess them with little time or effort required. The severely weakened RNG isn’t detected by any of the “Built-In Self-Tests” required for the P800-90 and FIPS 140-2 compliance certifications mandated by the National Institute of Standards and Technology.
rethinkdb — open-source distributed JSON document database with a pleasant and powerful query language.
Teach Kids Programming — a collection of resources. I start on Scratch much sooner, and 12+ definitely need the Arduino, but generally I agree with the things I recognise, and have a few to research …
No Managers — If we could find a way to replace the function of the managers and focus everyone on actually producing for our Students (customers) then it would actually be possible to be a #NoManager company. In my future posts I’ll explain how we’re doing this at Treehouse.
The 20 Smartest Things Jeff Bezos Has Ever Said (Motley Fool) — I feel like the 219th smartest thing Jeff Bezos has ever said is still smarter than the smartest thing most business commentators will ever say. (He says, self-referentially) “Invention requires a long-term willingness to be misunderstood.”
Sparkey — Spotify’s open-sourced simple constant key/value storage library, for read-heavy systems with infrequent large bulk inserts.
The Truth of Fact, The Truth of Feeling (Ted Chiang) — story about what happens when lifelogs become searchable. Now with Remem, finding the exact moment has become easy, and lifelogs that previously lay all but ignored are now being scrutinized as if they were crime scenes, thickly strewn with evidence for use in domestic squabbles. (via BoingBoing)
intention.js — manipulates the DOM via HTML attributes. The methods for manipulation are placed with the elements themselves, so flexible layouts don’t seem so abstract and messy.
F1: A Distributed SQL Database That Scales — a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency. Synchronous replication implies higher commit latency, but we mitigate that latency by using a hierarchical schema model with structured data types and through smart application design. F1 also includes a fully functional distributed SQL query engine and automatic change tracking and publishing.
Looking Inside The (Drop)Box (PDF) — This paper presents new and generic techniques, to reverse engineer frozen Python applications, which are not limited to just the Dropbox world. We describe a method to bypass Dropbox’s two factor authentication and hijack Dropbox accounts. Additionally, generic techniques to intercept SSL data using code injection techniques and monkey patching are presented. (via Tech Republic)
blinkdb — The current version of BlinkDB supports a slightly constrained set of SQL-style declarative queries and provides approximate results for standard SQL aggregate queries, specifically queries involving COUNT, AVG, SUM and PERCENTILE and is being extended to support any User-Defined Functions (UDFs). Queries involving these operations can be annotated with either an error bound, or a time constraint, based on which the system selects an appropriate sample to operate on.
China Plans to Become a Leader in Robotics (Quartz) — The ODCCC too funds high risk research initiatives through the Thousand Talent Project (TTP), a three-year term project with possible extension. The goal of the TTP is to recruit thousands of foreign researchers with strong expertise in hardware and software to help develop innovation in China. There are already more than 100 foreign researchers working in China since 2008, the year TTP started.
AppScale (GitHub) — open source implementation of Google App Engine.