ENTRIES TAGGED "databases"
Flexible Data, Google's Bottery, GPU Assist Deep Learning, and Open Sourcing
- Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
- Hebel (GitHub) — GPU-Accelerated Deep Learning Library in Python.
- What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.
- SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
- madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
- Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
- Disguise Detection — using Raspberry Pi, Arduino, and Python.
Squid in the Dark, Beautiful Automation, Fan Criticism, and Petabyte Queries
- Living Light — 3D printed cephalopods filled with bioluminescent bacteria. PAGING CORY DOCTOROW, YOUR ORGASMATRON HAS ARRIVED. (via Sci Blogs)
- Repacking Lego Batteries with a CNC Mill — check out the video. Patrick programmed a CNC machine to drill out the rivets holding the Mindstorms battery pack together. Coding away a repetitive task like this is gorgeous to see at every scale. We don’t have to teach our kids a particular programming language, but they should know how to automate cruft.
- My Thoughts on Google+ (YouTube) — when your fans make hatey videos like this one protesting Google putting the pig of Google Plus onto the lipstick that was YouTube, you are Doin’ It Wrong.
- Presto: Interacting with Petabytes of Data at Facebook — a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. For details, see the Facebook post about its launch.
Time Series Database, Cluster Schedulers, Structural Search-and-Replace, and TV Data
- Influx DB — open-source, distributed, time series, events, and metrics database with no external dependencies.
- Omega (PDF) — flexible, scalable schedulers for large compute clusters. From Google Research.
- Amazon Mines Its Data Trove To Bet on TV’s Next Hit (WSJ) — Amazon produced about 20 pages of data detailing, among other things, how much a pilot was viewed, how many users gave it a 5-star rating and how many shared it with friends.
USB in Cars, Capture Presentations, Amazon Redshift, and Polytweeting
- Hyundai Replacing Cigarette Lighters with USB Ports (Quartz) — sign of the times. (via Julie Starr)
- Freeseer — free, open source, cross-platform application that captures or streams your desktop—designed for capturing presentations. Would you like freedom with your screencast?
- Amazon Redshift: What You Need to Know — good write-up of experience using Amazon’s column database.
- GroupTweet — Allow any number of contributors to Tweet from a group account safely and securely. (via Jenny Magiera)
Amen Break, MySQL Scale, Spooky Source, and Graph Analytics Engine
- The Amen Break (YouTube) — fascinating 20m history of the amen break, a handful of bars of drum solo from a forgotten 1969 song which became the origin of a huge amount of popular music from rap to jungle and commercials, and the contested materials at the heart of sample-based music. Remix it and weep. (via Beta Knowledge)
- The MySQL Ecosystem at Scale (PDF) — nice summary of how MySQL is used by massive users, and where the sweet spots have been found.
- Lab41 (GitHub) — open sourced code from a spook hacklab in Silicon Valley.
- Faunus — open sourced Hadoop-based graph analytics engine for analyzing graphs represented across a multi-machine compute cluster. A breadth-first version of the graph traversal language Gremlin operates on graphs stored in the distributed graph database Titan, in any Rexster-fronted graph database, or in HDFS via various text and binary formats.
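Breadth-first traversal, mentioned in the graph analytics item above, expands a whole frontier of vertices one hop at a time instead of following a single path to its end. The following is a minimal single-machine sketch in plain Python, not code from that engine or from Gremlin; the function name and the toy graph are hypothetical.

```python
from collections import deque


def breadth_first(graph, start):
    """Return vertices reachable from `start`, grouped by hop distance."""
    seen = {start}
    levels = []
    frontier = deque([start])
    while frontier:
        # Record the whole frontier before moving one hop further out.
        levels.append(list(frontier))
        next_frontier = deque()
        for vertex in frontier:
            for neighbour in graph.get(vertex, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return levels


if __name__ == "__main__":
    # Hypothetical toy graph stored as an adjacency list.
    toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(breadth_first(toy, "a"))
```

Because each hop only needs the previous frontier, this style of traversal tends to suit batch processing across many machines, which is presumably why a breadth-first variant is used on a compute cluster.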
Insecure Hardware, Doc Database, Kids Programming, and Ad-Blocking AP
- Researchers Can Slip an Undetectable Trojan into Intel’s Ivy Bridge CPUs (Ars Technica) — The exploit works by severely reducing the amount of entropy the RNG normally uses, from 128 bits to 32 bits. The hack is similar to stacking a deck of cards during a game of Bridge. Keys generated with an altered chip would be so predictable an adversary could guess them with little time or effort required. The severely weakened RNG isn’t detected by any of the “Built-In Self-Tests” required for the SP 800-90 and FIPS 140-2 compliance certifications mandated by the National Institute of Standards and Technology.
- rethinkdb — open-source distributed JSON document database with a pleasant and powerful query language.
- Teach Kids Programming — a collection of resources. I start on Scratch much sooner, and 12+ definitely need the Arduino, but generally I agree with the things I recognise, and have a few to research …
- Raspberry Pi as Ad-Blocking Access Point (Adafruit) — functionality sadly lacking from my off-the-shelf AP.
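The entropy figures in the Ivy Bridge item above are easier to appreciate as arithmetic. Here is a minimal sketch, assuming a hypothetical attacker who can test one billion candidate values per second; that rate is an illustrative assumption, not a figure from the article.

```python
# Compare brute-force effort for a 128-bit versus a 32-bit entropy source.
# GUESSES_PER_SECOND is an assumed, illustrative attacker speed.
GUESSES_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct values a source with `bits` bits of entropy can emit."""
    return 2 ** bits


def worst_case_seconds(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Time to try every possible value at `rate` guesses per second."""
    return keyspace(bits) / rate


if __name__ == "__main__":
    weak_seconds = worst_case_seconds(32)
    strong_years = worst_case_seconds(128) / SECONDS_PER_YEAR
    print(f"32-bit search:  {weak_seconds:.1f} seconds")
    print(f"128-bit search: {strong_years:.2e} years")
    print(f"reduction factor: 2**{128 - 32}")
```

Under that assumed rate the weakened 32-bit space is exhausted in a few seconds, while the full 128-bit space stays far out of reach, which is what "so predictable an adversary could guess them" amounts to in practice.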
No Managers, Bezos Pearls, Visualising History, and Scalable Key-Value Store
- No Managers — If we could find a way to replace the function of the managers and focus everyone on actually producing for our Students (customers) then it would actually be possible to be a #NoManager company. In my future posts I’ll explain how we’re doing this at Treehouse.
- The 20 Smartest Things Jeff Bezos Has Ever Said (Motley Fool) — I feel like the 219th smartest thing Jeff Bezos has ever said is still smarter than the smartest thing most business commentators will ever say. (He says, self-referentially) “Invention requires a long-term willingness to be misunderstood.”
- Putting Time in Perspective — nifty representations of relative timescales and history. (via BoingBoing)
- Sophia — BSD-licensed small C library implementing an embeddable key-value database “for a high-load environment”.
Constant KV Store, Google Me, Learned Bias, and DRM-Stripping Lego Robot
- Sparkey — Spotify’s open-sourced simple constant key/value storage library, for read-heavy systems with infrequent large bulk inserts.
- The Truth of Fact, The Truth of Feeling (Ted Chiang) — story about what happens when lifelogs become searchable. Now with Remem, finding the exact moment has become easy, and lifelogs that previously lay all but ignored are now being scrutinized as if they were crime scenes, thickly strewn with evidence for use in domestic squabbles. (via BoingBoing)
- Algorithms Magnifying Misbehaviour (The Guardian) — when the training set embodies biases, the machine will exhibit biases too.
- Lego Robot That Strips DRM Off Ebooks (BoingBoing) — so. damn. cool. If it had been controlled by a C64, Cory would have hit every one of my geek erogenous zones with this find.
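The point in the Guardian item above, that a biased training set yields a biased machine, can be shown with a deliberately tiny model. This is a minimal sketch under stated assumptions: the "model" simply memorises the most common past outcome per group, and the group names, outcomes, and counts are invented for illustration.

```python
from collections import Counter, defaultdict


def train(records):
    """Learn the most common outcome per group from (group, outcome) pairs."""
    counts = defaultdict(Counter)
    for group, outcome in records:
        counts[group][outcome] += 1
    # The learned rule is just the majority outcome seen for each group.
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical, skewed history: past decisions favoured one group.
    history = (
        [("north", "approve")] * 80 + [("north", "reject")] * 20
        + [("south", "approve")] * 30 + [("south", "reject")] * 70
    )
    model = train(history)
    print(model)
```

Nothing in the code mentions bias; the skew comes entirely from the historical records, and the learned rule reproduces it for every future member of each group.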
Flexible Layouts, Web Components, Distributed SQL Database, and Reverse-Engineering Dropbox Client
- intention.js — manipulates the DOM via HTML attributes. The methods for manipulation are placed with the elements themselves, so flexible layouts don’t seem so abstract and messy.
- F1: A Distributed SQL Database That Scales — a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency. Synchronous replication implies higher commit latency, but we mitigate that latency by using a hierarchical schema model with structured data types and through smart application design. F1 also includes a fully functional distributed SQL query engine and automatic change tracking and publishing.
- Looking Inside The (Drop)Box (PDF) — This paper presents new and generic techniques, to reverse engineer frozen Python applications, which are not limited to just the Dropbox world. We describe a method to bypass Dropbox’s two factor authentication and hijack Dropbox accounts. Additionally, generic techniques to intercept SSL data using code injection techniques and monkey patching are presented. (via Tech Republic)