"databases" entries

Four short links: 10 February 2015

Four short links: 10 February 2015

Speech Recognition, Predictive Analytic Queries, Video Chat, and Javascript UI Library

  1. The Uncanny Valley of Speech Recognition (Zach Holman) — I’m reminded of driving up US-280 in 2003 or so with @raelity, a Kiwi and a South African trying every permutation of American accent from Kentucky to Yosemite Sam in order to get TellMe to stop giving us the weather for zipcode 10000. It didn’t recognise the swearing either. (Caution: features similarly strong language.)
  2. TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries (PDF) — an integrated PAQ [Predictive Analytic Queries] planning architecture that combines advanced model search techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching. The resulting system, TUPAQ, solves the PAQ planning problem with comparable accuracy to exhaustive strategies but an order of magnitude faster, and can scale to models trained on terabytes of data across hundreds of machines.
  3. p2pvc — point-to-point video chat. In an 80×25 terminal window.
  4. Sortable — nifty UI library.
Comment
Four short links: 14 January 2015

Four short links: 14 January 2015

IoT and Govt, Exactly Once, Random Database Subset, and UX Checking

  1. Internet of Things: Blackett Review — the British Government’s review of Internet of Things opportunities around government. Government and others can use expert commissioning to encourage participants in demonstrator programmes to develop standards that facilitate interoperable and secure systems. Government as a large purchaser of IoT systems is going to have a big impact if it buys wisely. (via Matt Webb)
  2. Exactly Once Semantics with Kafka — designing for failure means it’s easier to ensure that things get done than it is to ensure that things get done exactly once.
  3. rdbms-subsetter — open source tool to generate a random sample of rows from a relational database that preserves referential integrity – so long as constraints are defined, all parent rows will exist for child rows. (via 18F)
  4. UXcheck — a browser extension to help you do a quick UX check against Nielsen’s 10 principles.
Comment
Four short links: 12 December 2014

Four short links: 12 December 2014

Tech Ethics, Yahoo's KVS, Biology Inside, and Smart Luggage

  1. Do Artifacts Have Ethics? — 41 questions to ask yourself about the technology you create.
  2. MDBM — Yahoo’s fast key-value store, in use for over a decade. Super-fast, using mmap and passing around (gasp) raw pointers.
  3. The Revolution in Biology is Here, Now (Mike Loukides) — I’ve been asked plenty of times (and I’ve asked plenty of times), “what’s the killer product for synthetic biology?” BioFabricate convinced me that that’s the wrong question. We may never have some kind of biological iPod. That isn’t the right way to think. What I saw, instead, was real products that you might never notice. Bricks made from sand that are held together by microbes designed to excrete the binder. Bricks and packing material made from fungus (mycelium). Plastic excreted by bacteria that consume waste methane from sewage plants. You wouldn’t know, or care, whether your plastic Lego blocks are made from petroleum or from bacteria, but there’s a huge ecological difference.
  4. Bluesmart — Indiegogo campaign for a “connected carry-on,” aka a smart suitcase. From the mobile app you can track it, learn when it’s close (or too far away), (un)lock, weigh…and you can plug your devices in and recharge from the built-in battery. Sweet!
Comment
Four short links: 17 November 2014

Four short links: 17 November 2014

Tut Tut ISPs, Distributing Old Datastores, Secure Containers, and Design Workflow

  1. ISPs Remove Their Customers’ Email Encryption (EFF) — ISPs have apparently realised that man-in-the-middle is their business model.
  2. Dynomite (Netflix) — a sharding and replication layer. Dynomite can make existing non-distributed datastores, such as Redis or Memcached, into a fully distributed & multi-datacenter replicating datastore.
  3. After Dockersmaller, easier to manage, more secure containers via unikernels and immutable infrastructure.
  4. Pixelapse — something between Dropbox and Github for the design workflow and artifacts.
Comment: 1
Four short links: 12 November 2014

Four short links: 12 November 2014

Material Design, Inflatable Robots, Printable Awesome, and Graph Modelling

  1. CSS and React to Implement Material Design — as I said earlier, it will be interesting to see if Material Design becomes a common UI style for the web.
  2. Current State of Inflatable Robots — I’d missed the amazing steps forward in control that were made in pneumatic robots. Check out the OtherLab tentacle!
  3. Dinosaur Skull Showerhead — 3D-printable add-on to your shower. (via Archie McPhee)
  4. Data Modelling in Graph Databases — how to build the graph structure by working back from the questions you’ll ask of it.
Comment
Four short links: 11 November 2014

Four short links: 11 November 2014

High-Volume Logs, Regulated Broadband, Oculus Web, and Personal Data Vacuums

  1. Infrastructure for Data Streams — describing the high-volume log data use case for Apache Kafka, and how it plays out in storage and infrastructure.
  2. Obama: Treat Broadband and Mobile as Utility (Ars Technica) — In short, Obama is siding with consumer advocates who have lobbied for months in favor of reclassification while the telecommunications industry lobbied against it.
  3. MozVR — a website, and the tools that made it, designed to be seen through the Oculus Rift.
  4. All Cameras are Police Cameras (James Bridle) — how the slippery slope is ridden: When the Wall was initially constructed, the public were informed that this [automatic license plate recognition] data would only be held, and regularly purged, by Transport for London, who oversee traffic matters in the city. However, within less than five years, the Home Secretary gave the Metropolitan Police full access to this system, which allowed them to take a complete copy of the data produced by the system. This permission to access the data was granted to the Police on the sole condition that they only used it when National Security was under threat. But since the data was now in their possession, the Police reclassified it as “Crime” data and now use it for general policing matters, despite the wording of the original permission. As this data is not considered to be “personal data” within the definition of the law, the Police are under no obligation to destroy it, and may retain their ongoing record of all vehicle movements within the city for as long as they desire.
Comment
Four short links: 13 October 2014

Four short links: 13 October 2014

Angular Style, Consensus Filters, BASE Banks, and Browser Performance

  1. Angular JS Style Guide — I love style guides, to the point of having posted (I think) three for Angular. Reading other people’s style guides is like listening to them make-up after arguments: you learn what’s important to them, and what they regret.
  2. Consensus Filters — filtering out misreads and other errors to allow all agents, or robots, in the network to arrive at the same value asymptotically by only communicating with their neighbours.
  3. Why Banks are BASE not ACIDConsistency it turns out is not the Holy Grail. What trumps consistency is: Auditing, Risk Management, Availability.
  4. perfmap — front-end performance heatmap.
Comment
Four short links: 29 August 2014

Four short links: 29 August 2014

Delivery Drones, Database Readings, Digital Govt, and GitHub Reviews

  1. Inside Google’s Secret Drone Delivery Program (The Atlantic) — passed proof-of-concept in Western Australia, two years into development.
  2. Readings in DatabasesA list of papers essential to understanding databases and building new data systems. (via Hacker News)
  3. Todd Park Recruiting for Govt Digital Corps (Wired) — “America needs you!” he said to the crowd. “Not a year from now! But Right. The. Fuck. Now!”
  4. Review Ninjaa lightweight code review tool that works with GitHub, providing a more structured way to use pull requests for code review. ReviewNinja dispenses with elaborate voting systems, and supports hassle-free committing and merging for acceptable changes.
Comments: 2
Four short links: 27 August 2014

Four short links: 27 August 2014

Discourse 1.0, Programmable Matter, Versioned Databases, and What Humans Learned About Machine Learning

  1. Discourse turns 1.0 — community/forum software that doesn’t suck.
  2. Programmable Matter (IEEE Spectrum) — recap of where research is going in this area.
  3. Liquibasesource control for your database. Apache 2.0 licensed.
  4. A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.
Comments: 2

How Flash changes the design of database storage engines

High-performing memory throws many traditional decisions overboard

supermicro_storage

Over the past decade, SSD drives (popularly known as Flash) have radically changed computing at both the consumer level — where USB sticks have effectively replaced CDs for transporting files — and the server level, where it offers a price/performance ratio radically different from both RAM and disk drives. But databases have just started to catch up during the past few years. Most still depend on internal data structures and storage management fine-tuned for spinning disks.

Citing price and performance, one author advised a wide range of database vendors to move to Flash. Certainly, a database administrator can speed up old databases just by swapping out disk drives and inserting Flash, but doing so captures just a sliver of the potential performance improvement promised by Flash. For this article, I asked several database experts — including representatives of Aerospike, Cassandra, FoundationDB, RethinkDB, and Tokutek — how Flash changes the design of storage engines for databases. The various ways these companies have responded to its promise in their database designs are instructive to readers designing applications and looking for the best storage solutions.

Read more…

Comments: 2