ENTRIES TAGGED "databases"

Four short links: 18 August 2014

Space Trading, Robot Capitalism, Packet Injection, and CAP Theorem

  1. Oolite — open-source clone of Elite, the classic space trading game from the 80s.
  2. Who Owns the Robots Rules The World (PDF) — interesting finding: As companies substitute machines and computers for human activity, workers need to own part of the capital stock that substitutes for them to benefit from these new “robot” technologies. Workers could own shares of the firm, hold stock options, or be paid in part from the profits. Without ownership stakes, workers will become serfs working on behalf of the robots’ overlords. Governments could tax the wealthy capital owners and redistribute income to workers, but that is not the direction societies are moving in. Workers need to own capital rather than rely on government income redistribution policies. (via Robotenomics)
  3. Schrodinger’s Cat Video and the Death of Clear-Text (Morgan Marquis-Boire) — report, based on leaked information, about the use of network injection appliances that target unencrypted pages from major providers. Compromising a target becomes as simple as waiting for the user to view unencrypted content on the Internet.
  4. CAP 12 Years Later: How the Rules Have Changed — a rundown of strategies available to deal with partitions (“outages”) in a distributed system.

Four short links: 6 August 2014

Mesa Database, Thumbstoppers, Impressive Research, and Microsoft Development

  1. Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing (PDF) — paper by Googlers on the database holding G’s ad data. Trillions of rows, petabytes of data, point queries with 99th percentile latency in the hundreds of milliseconds and overall query throughput of trillions of rows fetched per day, continuous updates on the order of millions of rows updated per second, strong consistency and repeatable query results even if a query involves multiple datacenters, and no SPOF. (via Greg Linden)
  2. Thumbstopping (Salon) — The prime goal of a Facebook ad campaign is to create an ad “so compelling that it would get people to stop scrolling through their news feeds,” reports the Times. This is known, in Facebook land, as a “thumbstopper.” And thus, the great promise of the digital revolution is realized: The best minds of our generation are obsessed with manipulating the movement of your thumb on a smartphone touch-screen.
  3. om3d — pose a model based on its occurrence in a photo, then update the photo after rotating and re-rendering the model. Research is doing some sweet things these days—this comes hot on the heels of recovering sounds from high-speed video of things like chip bags.
  4. Microsoft’s Development Practices (Ars Technica) — they get the devops religion but call it “combined engineering”. They get the idea of shared code bases, but call it “open source”. At least when they got the agile religion, they called it that. Check out the horror story of where they started: a two-year development process in which only about four months would be spent writing new code. Twice as long would be spent fixing that code. MSFT’s waterfall was the equivalent of American football, where there’s 11 minutes of actual play in the average 3h 12m game.

Graph tools forge path to new solutions

Find emergent properties and solutions to new computing problems with graphs

Graph databases haven’t made the news much because, I think, they don’t fit in convenient categories. They certainly aren’t the relational databases we’re all familiar with, nor are they the arbitrary keys and values provided by many NoSQL stores. But in a highly connected world–where it’s not what you know but whom you know–it makes intuitive sense to arrange our knowledge as nodes and edges.

Ted Nelson, inventor of the hyperlink, recognized the power of viewing life in graphs. After the implosion of his historic Xanadu project, he embarked on a graph database tool called ZigZag. The most modern instantiations of graphs–the Neo4j store and the Alchemy.js tool for interactively visualizing graphs–were well represented this year at O’Reilly’s Open Source convention.

Four short links: 22 July 2014

English lint, Scalable Replicated Datastore, There's People in my Software, and Sci-Fi for Ethics

  1. write-good — a naive “lint” for English prose.
  2. cockroachdb — a scalable, geo-replicated, transactional datastore from a team that includes the person who built Spanner for Google. Spanner requires atomic clocks, cockroach does not (which has corresponding performance consequences). (via Wired)
  3. The Deep Convergence of Networks, Software, and People — as we wire up our digital products increasingly with interconnected networks, their nature is increasingly a product of the responses that come back from those networks. The experience cannot be wholly represented in mock prototypes that are coded to respond in predictable ways, or even using a set of preset random responses. The power of the application is seeing the emergent behaviour of the system, and recognizing that you are a participant in that emergent behaviour. (via Tim O’Reilly)
  4. An Ethics Class for Inventors, via Sci-Fi — “Reading science fiction is kind of like ethics class for inventors,” says Brueckner. Traditionally, technology schools ask ‘how do we build it?’ This class asks a different question: ‘should we?’

Four short links: 9 June 2014

SQL against Text, Fake Social Networks, Hidden Biases, and Versioned Data

  1. textql — execute SQL against structured text like CSV or TSV.
  2. Social Network Structure of Fake Friends — author bought 4,000 Twitter followers and studied their relationships.
  3. Hidden Biases in Big Data — with every big data set, we need to ask which people are excluded. Which places are less visible? What happens if you live in the shadow of big data sets? (via Quinn Norton)
  4. CoreObject — a version-controlled object database for Objective-C that supports powerful undo, semantic merging, and real-time collaborative editing.
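
To make item 1 concrete, here is a minimal sketch of running SQL over CSV text. It is not textql's implementation or command-line interface; it only illustrates the idea using Python's standard csv and sqlite3 modules. The function name query_csv, the table name tbl, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table="tbl"):
    """Load CSV text (first row is the header) into an in-memory
    SQLite table, run one SQL statement, and return the rows.

    Hypothetical helper for illustration only; header names are
    assumed to be simple identifiers without double quotes.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    # Values are bound as parameters, so the data itself is never
    # spliced into the SQL string.
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)

    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Made-up sample data.
SAMPLE = "name,city\nada,London\ngrace,New York\nalan,London\n"

if __name__ == "__main__":
    print(query_csv(
        SAMPLE,
        "SELECT city, COUNT(*) FROM tbl GROUP BY city ORDER BY city",
    ))
```

Every column is loaded as text here, so numeric comparisons would need an explicit CAST in the query.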

Four short links: 3 June 2014

Machine Learning Mistakes, Recommendation Bandits, Droplet Robots, and Plain English

  1. Machine Learning Done Wrong — [M]ost practitioners pick the modeling algorithm they are most familiar with rather than pick the one which best suits the data. In this post, I would like to share some common mistakes (the don’t-s).
  2. Bandits for Recommendations — A common problem for internet-based companies is: which piece of content should we display? Google has this problem (which ad to show), Facebook has this problem (which friend’s post to show), and RichRelevance has this problem (which product recommendation to show). Many of the promising solutions come from the study of the multi-armed bandit problem.
  3. Droplets — the Droplet is almost spherical, can self-right after being poured out of a bucket, and has the hardware capabilities to organize into complex shapes with its neighbors due to accurate range and bearing. Droplets are available open-source and use cheap vibration motors and a 3D printed shell. (via Robohub)
  4. Apple’s App Store Approval Guidelines — some of the plainest English I’ve seen, especially the Introduction. I can only aspire to that clarity. If your App looks like it was cobbled together in a few days, or you’re trying to get your first practice App into the store to impress your friends, please brace yourself for rejection. We have lots of serious developers who don’t want their quality Apps to be surrounded by amateur hour.
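
For readers new to the multi-armed bandit problem in item 2, here is a minimal sketch of one textbook strategy, epsilon-greedy. It is not necessarily the approach the linked post describes; the class name, the simulate helper, and the click rates are assumptions made up for illustration.

```python
import random


class EpsilonGreedy:
    """Epsilon-greedy bandit: mostly show the best-looking item,
    but explore a random item with probability epsilon."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # times each arm was shown
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        # Exploit: arm with the highest estimated reward so far
        # (ties go to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        # Incremental mean: new = old + (reward - old) / n
        self.values[arm] += (reward - self.values[arm]) / n


def simulate(click_rates, rounds=5000, seed=0):
    """Simulate showing content whose true click rates are given.
    The click rates are hypothetical inputs, not real data."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(len(click_rates), epsilon=0.1, seed=seed)
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    result = simulate([0.05, 0.5])
    print(result.counts, result.values)
```

A fixed epsilon keeps exploring forever; other strategies, such as UCB or Thompson sampling, reduce exploration as evidence accumulates.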

Four short links: 30 May 2014

Video Transparency, Software Traffic, Distributed Database, and Open Source Sustainability

  1. Video Quality Report — transparency is a great way to indirectly exert leverage.
  2. Control Your Traffic Flows with Software — using BGP to balance traffic. Will be interesting to see how the more extreme traffic managers deploy SDN in the data center.
  3. Cockroach — a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services.
  4. Linux Foundation Providing for Core Infrastructure Projects — press release, but interested in how they’re tackling sustainability—they’re taking on identifying worthies (glad I’m not the one who says “you’re not worthy” to a project) and being the non-profit conduit for the dosh. Interesting: implies they think the reason companies weren’t supporting necessary open source projects was some combination of being unsure who to support (projects you use, surely?) and how to get them money (ask?). (Sustainability of open source projects is a pet interest of mine)

Four short links: 24 January 2014

Floating Point, Secure Distributed FS, Cloud Robotics, and Domestic Sensors

  1. What Every Computer Scientist Should Know About Floating Point Arithmetic — in short, “it will hurt you.”
  2. Ori — a distributed file system built for offline operation and empowers the user with control over synchronization operations and conflict resolution. We provide history through light weight snapshots and allow users to verify the history has not been tampered with. Through the use of replication instances can be resilient and recover damaged data from other nodes.
  3. RoboEarth — a Cloud Robotics infrastructure, which includes everything needed to close the loop from robot to the cloud and back to the robot. RoboEarth’s World-Wide-Web style database stores knowledge generated by humans – and robots – in a machine-readable format. Data stored in the RoboEarth knowledge base include software components, maps for navigation (e.g., object locations, world models), task knowledge (e.g., action recipes, manipulation strategies), and object recognition models (e.g., images, object models).
  4. Mother — domestic sensors and an app with an appallingly presumptuous name. (Also, wasn’t “Mother” the name of the ship computer in Alien?) (via BoingBoing)
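
A small illustration of how floating point "will hurt you" (item 1), in Python. The variable names are made up for this sketch; the behaviour is standard IEEE 754 double arithmetic, which cannot represent 0.1, 0.2, or 0.3 exactly.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

# Exact comparison fails because of rounding error.
naive_equal = (total == 0.3)

# Comparing within a tolerance is the usual workaround.
tolerant_equal = math.isclose(total, 0.3)

# Decimal arithmetic on string inputs stays exact for these values.
decimal_equal = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# Rounding errors accumulate in naive sums; math.fsum tracks the
# lost low-order bits and returns a correctly rounded result.
naive_sum = sum([0.1] * 10)
careful_sum = math.fsum([0.1] * 10)

if __name__ == "__main__":
    print(naive_equal, tolerant_equal, decimal_equal)  # False True True
    print(naive_sum == 1.0, careful_sum == 1.0)        # False True
```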

Four short links: 10 December 2013

Flexible Data, Google's Bottery, GPU Assist Deep Learning, and Open Sourcing

  1. ArangoDB — open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient sql-like query language or JavaScript extensions.
  2. Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
  3. Hebel (Github) — GPU-Accelerated Deep Learning Library in Python.
  4. What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.

Four short links: 3 December 2013

  1. SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
  2. madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
  3. Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
  4. Disguise Detection — using Raspberry Pi, Arduino, and Python.