"databases" entries

Four short links: 12 December 2014

Tech Ethics, Yahoo's KVS, Biology Inside, and Smart Luggage

by Nat Torkington | @gnat | +Nat Torkington | December 12, 2014

Do Artifacts Have Ethics? — 41 questions to ask yourself about the technology you create.
MDBM — Yahoo’s fast key-value store, in use for over a decade. Super-fast, using mmap and passing around (gasp) raw pointers.
The Revolution in Biology is Here, Now (Mike Loukides) — I’ve been asked plenty of times (and I’ve asked plenty of times), “what’s the killer product for synthetic biology?” BioFabricate convinced me that that’s the wrong question. We may never have some kind of biological iPod. That isn’t the right way to think. What I saw, instead, was real products that you might never notice. Bricks made from sand that are held together by microbes designed to excrete the binder. Bricks and packing material made from fungus (mycelium). Plastic excreted by bacteria that consume waste methane from sewage plants. You wouldn’t know, or care, whether your plastic Lego blocks are made from petroleum or from bacteria, but there’s a huge ecological difference.
Bluesmart — Indiegogo campaign for a “connected carry-on,” aka a smart suitcase. From the mobile app you can track it, learn when it’s close (or too far away), (un)lock, weigh…and you can plug your devices in and recharge from the built-in battery. Sweet!

Four short links: 17 November 2014

Tut Tut ISPs, Distributing Old Datastores, Secure Containers, and Design Workflow

by Nat Torkington | @gnat | +Nat Torkington | November 17, 2014

ISPs Remove Their Customers’ Email Encryption (EFF) — ISPs have apparently realised that man-in-the-middle is their business model.
Dynomite (Netflix) — a sharding and replication layer. Dynomite can make existing non-distributed datastores, such as Redis or Memcached, into a fully distributed & multi-datacenter replicating datastore.
After Docker — smaller, easier to manage, more secure containers via unikernels and immutable infrastructure.
Pixelapse — something between Dropbox and Github for the design workflow and artifacts.
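
The sharding-and-replication idea in the Dynomite item above can be sketched in a few lines. This is a minimal illustration of hash-ring placement in general, not Dynomite's actual implementation or API; the `Ring` class, the `nodes_for` method, and the node names are hypothetical, and MD5 is used only as a convenient stable hash.

```python
import bisect
import hashlib


def _token(name: str) -> int:
    """Map a string to a position on the ring (stable across runs)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical hash ring: each key is owned by `replicas` consecutive nodes."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        # Sort nodes by their token so a key can be placed by binary search.
        self._ring = sorted((_token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def nodes_for(self, key):
        """Return the nodes that should store `key`, primary first."""
        # First node clockwise from the key's token, wrapping around at the end.
        start = bisect.bisect_right(self._tokens, _token(key)) % len(self._ring)
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Assumed node names, standing in for independent Redis or Memcached instances.
ring = Ring(["redis-a", "redis-b", "redis-c"], replicas=2)
owners = ring.nodes_for("user:42")
```

A client or proxy would write `user:42` to every node in `owners` and could read from any of them. A real system also has to handle node failure, rebalancing, and cross-datacenter replication, which this sketch leaves out.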

Four short links: 12 November 2014

Material Design, Inflatable Robots, Printable Awesome, and Graph Modelling

by Nat Torkington | @gnat | +Nat Torkington | November 12, 2014

CSS and React to Implement Material Design — as I said earlier, it will be interesting to see if Material Design becomes a common UI style for the web.
Current State of Inflatable Robots — I’d missed the amazing steps forward in control that were made in pneumatic robots. Check out the OtherLab tentacle!
Dinosaur Skull Showerhead — 3D-printable add-on to your shower. (via Archie McPhee)
Data Modelling in Graph Databases — how to build the graph structure by working back from the questions you’ll ask of it.

Four short links: 11 November 2014

High-Volume Logs, Regulated Broadband, Oculus Web, and Personal Data Vacuums

by Nat Torkington | @gnat | +Nat Torkington | November 11, 2014

Infrastructure for Data Streams — describing the high-volume log data use case for Apache Kafka, and how it plays out in storage and infrastructure.
Obama: Treat Broadband and Mobile as Utility (Ars Technica) — In short, Obama is siding with consumer advocates who have lobbied for months in favor of reclassification while the telecommunications industry lobbied against it.
MozVR — a website, and the tools that made it, designed to be seen through the Oculus Rift.
All Cameras are Police Cameras (James Bridle) — how the slippery slope is ridden: When the Wall was initially constructed, the public were informed that this [automatic license plate recognition] data would only be held, and regularly purged, by Transport for London, who oversee traffic matters in the city. However, within less than five years, the Home Secretary gave the Metropolitan Police full access to this system, which allowed them to take a complete copy of the data produced by the system. This permission to access the data was granted to the Police on the sole condition that they only used it when National Security was under threat. But since the data was now in their possession, the Police reclassified it as “Crime” data and now use it for general policing matters, despite the wording of the original permission. As this data is not considered to be “personal data” within the definition of the law, the Police are under no obligation to destroy it, and may retain their ongoing record of all vehicle movements within the city for as long as they desire.

Four short links: 13 October 2014

Angular Style, Consensus Filters, BASE Banks, and Browser Performance

by Nat Torkington | @gnat | +Nat Torkington | October 13, 2014

Angular JS Style Guide — I love style guides, to the point of having posted (I think) three for Angular. Reading other people’s style guides is like listening to them make up after arguments: you learn what’s important to them, and what they regret.
Consensus Filters — filtering out misreads and other errors to allow all agents, or robots, in the network to arrive at the same value asymptotically by only communicating with their neighbours.
Why Banks are BASE not ACID — Consistency it turns out is not the Holy Grail. What trumps consistency is: Auditing, Risk Management, Availability.
perfmap — front-end performance heatmap.
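
For the Consensus Filters item above, here is a minimal sketch of the simplest variant, plain average consensus, in which each agent repeatedly nudges its value toward those of its immediate neighbours. It is an illustration under stated assumptions, not the linked article's algorithm: the function names, the ring topology, the step size `epsilon`, and the readings are all made up for the example. Note that plain averaging dilutes a misread across the network rather than rejecting it.

```python
def consensus_step(values, neighbours, epsilon):
    """One synchronous round: each agent moves toward its neighbours' values."""
    return [
        x + epsilon * sum(values[j] - x for j in neighbours[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbours, epsilon=0.25, steps=200):
    """Repeat the local update; epsilon must stay below 1 / (max neighbour count)."""
    for _ in range(steps):
        values = consensus_step(values, neighbours, epsilon)
    return values


# Hypothetical data: five agents on a ring; agent 2 holds a misread.
readings = [20.1, 19.8, 35.0, 20.3, 19.9]
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}

final = run_consensus(readings, ring)
```

Because every link is two-way, the sum of the values is preserved at each round, so all agents converge on the mean of the initial readings (about 23.02 here) without any agent ever seeing the whole network.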

Four short links: 29 August 2014

Delivery Drones, Database Readings, Digital Govt, and GitHub Reviews

by Nat Torkington | @gnat | +Nat Torkington | August 29, 2014

Inside Google’s Secret Drone Delivery Program (The Atlantic) — passed proof-of-concept in Western Australia, two years into development.
Readings in Databases — A list of papers essential to understanding databases and building new data systems. (via Hacker News)
Todd Park Recruiting for Govt Digital Corps (Wired) — “America needs you!” he said to the crowd. “Not a year from now! But Right. The. Fuck. Now!”
Review Ninja — a lightweight code review tool that works with GitHub, providing a more structured way to use pull requests for code review. ReviewNinja dispenses with elaborate voting systems, and supports hassle-free committing and merging for acceptable changes.

Four short links: 27 August 2014

Discourse 1.0, Programmable Matter, Versioned Databases, and What Humans Learned About Machine Learning

by Nat Torkington | @gnat | +Nat Torkington | August 27, 2014

Discourse turns 1.0 — community/forum software that doesn’t suck.
Programmable Matter (IEEE Spectrum) — recap of where research is going in this area.
Liquibase — source control for your database. Apache 2.0 licensed.
A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.

How Flash changes the design of database storage engines

High-performing memory throws many traditional decisions overboard

by Andy Oram | @praxagora | +Andy Oram | August 22, 2014

Over the past decade, solid-state drives (SSDs, popularly known as Flash) have radically changed computing at both the consumer level — where USB sticks have effectively replaced CDs for transporting files — and the server level, where they offer a price/performance ratio radically different from both RAM and disk drives. But databases have just started to catch up during the past few years. Most still depend on internal data structures and storage management fine-tuned for spinning disks.

Citing price and performance, one author advised a wide range of database vendors to move to Flash. Certainly, a database administrator can speed up old databases just by swapping out disk drives and inserting Flash, but doing so captures just a sliver of the potential performance improvement promised by Flash. For this article, I asked several database experts — including representatives of Aerospike, Cassandra, FoundationDB, RethinkDB, and Tokutek — how Flash changes the design of storage engines for databases. The various ways these companies have responded to its promise in their database designs are instructive to readers designing applications and looking for the best storage solutions.

Four short links: 18 August 2014

Space Trading, Robot Capitalism, Packet Injection, and CAP Theorem

by Nat Torkington | @gnat | +Nat Torkington | August 18, 2014

Oolite — open-source clone of Elite, the classic space trading game from the 80s.
Who Owns the Robots Rules The World (PDF) — interesting finding: As companies substitute machines and computers for human activity, workers need to own part of the capital stock that substitutes for them to benefit from these new “robot” technologies. Workers could own shares of the firm, hold stock options, or be paid in part from the profits. Without ownership stakes, workers will become serfs working on behalf of the robots’ overlords. Governments could tax the wealthy capital owners and redistribute income to workers, but that is not the direction societies are moving in. Workers need to own capital rather than rely on government income redistribution policies. (via Robotenomics)
Schrodinger’s Cat Video and the Death of Clear-Text (Morgan Marquis-Boire) — report, based on leaked information, about the use of network injection appliances to target unencrypted pages from major providers. Compromising a target becomes as simple as waiting for the user to view unencrypted content on the Internet.
CAP 12 Years Later: How the Rules Have Changed — a rundown of strategies available to deal with partitions (“outages”) in a distributed system.

Four short links: 6 August 2014

Mesa Database, Thumbstoppers, Impressive Research, and Microsoft Development

by Nat Torkington | @gnat | +Nat Torkington | August 6, 2014

Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (PDF) — paper by Googlers on the database holding G’s ad data. Trillions of rows, petabytes of data, point queries with 99th percentile latency in the hundreds of milliseconds and overall query throughput of trillions of rows fetched per day, continuous updates on the order of millions of rows updated per second, strong consistency and repeatable query results even if a query involves multiple datacenters, and no SPOF. (via Greg Linden)
Thumbstopping (Salon) — The prime goal of a Facebook ad campaign is to create an ad “so compelling that it would get people to stop scrolling through their news feeds,” reports the Times. This is known, in Facebook land, as a “thumbstopper.” And thus, the great promise of the digital revolution is realized: The best minds of our generation are obsessed with manipulating the movement of your thumb on a smartphone touch-screen.
om3d — pose a model based on its occurrence in a photo, then update the photo after rotating and re-rendering the model. Research is doing some sweet things these days—this comes hot on the heels of recovering sounds from high-speed video of things like chip bags.
Microsoft’s Development Practices (Ars Technica) — they get the devops religion but call it “combined engineering”. They get the idea of shared code bases, but call it “open source”. At least when they got the agile religion, they called it that. Check out the horror story of where they started: a two-year development process in which only about four months would be spent writing new code. Twice as long would be spent fixing that code. MSFT’s waterfall was the equivalent of American football, where there’s 11 minutes of actual play in the average 3h 12m game.