"data" entries

Four short links: 7 May 2015

Predicting Hits, Pricing Strategies, Quis Custodiet Shifty Custodes, Docker Security

Predicting a Billboard Music Hit (YouTube) — Shazam VP of Music and Platforms at Strata London. With relative accuracy, we can predict 33 days out what song will go to No. 1 on the Billboard charts in the U.S.
Psychological Pricing Strategies — a handy wrap-up of evil^wuseful pricing strategies to know.
What Two Programmers Have Revealed So Far About Seattle Police Officers Who Are Still in Uniform — through their shrewd use of Washington’s Public Records Act, the two Seattle residents are now the closest thing the city has to a civilian police-oversight board. In the last year and a half, they have acquired hundreds of reports, videos, and 911 calls related to the Seattle Police Department’s internal investigations of officer misconduct between 2010 and 2013. And though they have only combed through a small portion of the data, they say they have found several instances of officers appearing to lie, use racist language, and use excessive force—with no consequences. In fact, they believe that the Office of Professional Accountability (OPA) has systematically “run interference” for cops. In the aforementioned cases of alleged officer misconduct, all of the involved officers were exonerated and still remain on the force.
Understanding Docker Security and Best Practices — explanation of container security and a benchmark for security practices, though email addresses will need to be surrendered in exchange for the good info.

Four short links: 27 April 2015

Living Figures, Design vs Architecture, Faceted Browsing, and Byzantine Comedy

by Nat Torkington | @gnat | +Nat Torkington | April 27, 2015

‘Living Figures’ Make Their Debut (Nature) — In July last year, neurobiologist Björn Brembs published a paper about how fruit flies walk. Nine months on, his paper looks different: another group has fed its data into the article, altering one of the figures. The update — to figure 4 — marks the debut of what the paper’s London-based publisher, Faculty of 1000 (F1000), is calling a living figure, a concept that it hopes will catch on in other articles. Brembs, at the University of Regensburg in Germany, says that three other groups have so far agreed to add their data, using software he wrote that automatically redraws the figure as new data come in.
Strategies Against Architecture (Seb Chan and Aaron Straup Cope) — the story of the design of the Cooper Hewitt’s clever “pen,” which visitors to the design museum use to collect the info from their favourite exhibits. (Visit the Cooper Hewitt when you’re next in NYC; it’s magnificent.)
Two Way Street — an independent explorer for The British Museum collection, letting you browse by year acquired, year created, type of object, etc. I note there are more things from a place called “Brak” than there are from the USA. Facets are awesome. (via Courtney Johnston)
The Saddest Moment (PDF) — “How can you make a reliable computer service?” the presenter will ask in an innocent voice before continuing, “It may be difficult if you can’t trust anything and the entire concept of happiness is a lie designed by unseen overlords of endless deceptive power.” The presenter never explicitly says that last part, but everybody understands what’s happening. Making distributed systems reliable is inherently impossible; we cling to Byzantine fault tolerance like Charlton Heston clings to his guns, hoping that a series of complex software protocols will somehow protect us from the oncoming storm of furious apes who have somehow learned how to wear pants and maliciously tamper with our network packets. Hilarious. (via Tracy Chou)

Four short links: 13 April 2015

Occupation Changes, Country Data, Cultural Analytics, and Dysfunctional Software Engineering Organisations

by Nat Torkington | @gnat | +Nat Torkington | April 13, 2015

The Great Reversal in the Demand for Skill and Cognitive Tasks (PDF) — The only difference with more conventional models of skill-biased technological change is our modelling of the fruits of cognitive employment as creating a stock instead of a pure flow. This slight change causes technological change to generate a boom and bust cycle, as is common in most investment models. We also incorporated into this model a standard selection process whereby individuals sort into occupations based on their comparative advantage. The selection process is the key mechanism that explains why a reduction in the demand for cognitive tasks, which are predominantly filled by higher educated workers, can result in a loss of employment concentrated among lower educated workers. While we do not claim that our model is the only structure that can explain the observations we present, we believe it gives a very simple and intuitive explanation to the changes pre- and post-2000.
provinces — state and province lists for (some) countries.
Cultural Analytics — the use of computational and visualization methods for the analysis of massive cultural data sets and flows. Interesting visualisations as well as automated understandings.
The Code is Just the Symptom — The engineering culture was a three-layer cake of dysfunction, where everyone down the chain had to execute what they knew to be an impossible task, at impossible speeds, perfectly. It was like the games of Simon Says and Telephone combined to bad effect. Most engineers will have flashbacks at these descriptions. Trigger warning: candid descriptions of real immature software organisations.

Four short links: 26 March 2015

GPU Graph Algorithms, Data Sharing, Build Like Google, and Distributed Systems Theory

by Nat Torkington | @gnat | +Nat Torkington | March 26, 2015

gunrock — a CUDA library for graph primitives that refactors, integrates, and generalizes best-of-class GPU implementations of breadth-first search, connected components, and betweenness centrality into a unified code base useful for future development of high-performance GPU graph primitives. (via Ben Lorica)
How to Share Data with a Statistician — some instruction on the best way to share data to avoid the most common pitfalls and sources of delay in the transition from data collection to data analysis.
Bazel — a build tool, i.e. a tool that will run compilers and tests to assemble your software, similar to Make, Ant, Gradle, Buck, Pants, and Maven. Google’s build tool, to be precise.
You Can’t Have Exactly-Once Delivery — not about the worst post office ever. FLP and the Two Generals Problem are not design complexities, they are impossibility results.
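A common practical response to that impossibility result is to accept at-least-once delivery and make the receiver idempotent, so that a redelivered message has no further effect. The sketch below is only an illustration of that idea, not code from the linked post; the names `IdempotentConsumer` and `deliver_at_least_once` are hypothetical, and a real system would also need to record the message ID and apply the effect atomically.

```python
class IdempotentConsumer:
    """Applies each message's effect once, even if it is delivered repeatedly."""

    def __init__(self):
        self.seen_ids = set()  # IDs of messages already applied
        self.balance = 0       # example state mutated by messages

    def handle(self, message_id, amount):
        """Apply the message unless its ID has been seen; return True if applied."""
        if message_id in self.seen_ids:
            return False  # duplicate delivery: already applied, so ignore it
        self.seen_ids.add(message_id)
        self.balance += amount
        return True


def deliver_at_least_once(consumer, message_id, amount, attempts=3):
    """Simulate a sender that retries because it cannot know whether an ack was lost."""
    for _ in range(attempts):
        consumer.handle(message_id, amount)


consumer = IdempotentConsumer()
deliver_at_least_once(consumer, "m1", 10)
deliver_at_least_once(consumer, "m2", 5)
print(consumer.balance)
```

Each message is sent three times here, yet its amount is added only once. Delivery is still at-least-once; it is the processing that behaves as if each message arrived a single time.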

Four short links: 23 March 2015

Agricultural Robots, Business Model Design, Simulations, and Interoperable JSON

by Nat Torkington | @gnat | +Nat Torkington | March 23, 2015

Swarmfarm Robotics — His previous weed sprayer weighed 21 tonnes, measured 36 metres across its spray unit, guzzled diesel by the bucketload and needed a paid driver who would only work limited hours. Two robots working together on Bendee effortlessly sprayed weeds in a 70ha mung-bean crop last month. Their infra-red beams picked up any small weeds among the crop rows and sent a message to the nozzle to eject a small chemical spray. Bate hopes to soon use microwave or laser technology to kill the weeds. Best of all, the robots do the work without guidance. They work 24 hours a day. They have in-built navigation and obstacle detection, making them robust and able to decide if an area of a paddock should not be traversed. Special swarming technology means the robots can detect each other and know which part of the paddock has already been assessed and sprayed.
Route to Market (Matt Webb) — The route to market is not what makes the product good. […] So the way you design the product to best take it to market is not the same process to make it great for its users.
Explorable Explanations — points to many sweet examples of interactive explorable simulations/explanations.
I-JSON (Tim Bray) — I-JSON is just a note saying that if you construct a chunk of JSON and avoid the interop failures described in RFC 7159, you can call it an “I-JSON Message.” If any known JSON implementation creates an I-JSON message and sends it to any other known JSON implementation, the chance of software surprises is vanishingly small.
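One interop failure of the kind Bray alludes to is an object that repeats a member name, where parsers can disagree about which value wins. The following is a minimal sketch, assuming Python's standard `json` module, of rejecting such input; the helper names `reject_duplicate_keys` and `loads_strict` are hypothetical, and the check covers only this one constraint, not everything I-JSON asks for.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a member name."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate member name: {key!r}")
        result[key] = value
    return result


def loads_strict(text):
    """Parse JSON text, rejecting any object (at any depth) with duplicate names."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


# Python's default parser silently keeps the last value for a repeated name.
lenient = json.loads('{"a": 1, "a": 2}')

try:
    loads_strict('{"a": 1, "a": 2}')
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

Because `object_pairs_hook` is called for every object the parser decodes, the same check applies to nested objects without any extra code.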

Four short links: 11 March 2015

Working Manager, Open Source Server Chassis, Data Context, and Coevolved Design & Users

by Nat Torkington | @gnat | +Nat Torkington | March 11, 2015

As a Working Manager (Ian Bicking) — I look forward to every new entry in Ian’s diary, and this one didn’t disappoint. But I’m a working manager. Is now the right time to investigate that odd log message I’m seeing, or to think about who I should talk to about product opportunities? There’s no metric to compare the priority of two tasks that are so far apart. If I am going to find time to do development I am a bit worried I have two options: (1) Keep doing programming after hours; (2) Start dropping some balls as a manager.
Introducing Yosemite (Facebook) — a modular chassis that contains high-powered system-on-a-chip (SoC) processor cards.
The Joyless World of Data-Driven Startups — There is so much invisible, fluid context wrapped around a data point that we are usually unable to fully comprehend exactly what that data represents or means. We often think we know, but we rarely do. But we really WANT it to mean something, because using data in our work is scientific. It’s not our decision that was wrong — we used the data that was available. Data is the ultimate scapegoat.
History of the Urban Dashboard — the dashboard and its user had to evolve in response to one another. The increasing complexity of the flight dashboard necessitated advanced training for pilots — particularly through new flight simulators — and new research on cockpit design.

Four short links: 3 March 2015

Wearable Warning, Time Series Data, App Cards, and Secure Comms

by Nat Torkington | @gnat | +Nat Torkington | March 3, 2015

You Guys Realize the Apple Watch is Going to Flop, Right? — leaving aside the “guys” assumption of its readers, you can take this either as a list of the challenges Apple will inevitably overcome or bypass when they release their watch, or (as intended) a list of the many reasons that it’s too damn soon for watches to be useful. The Apple Watch is Jonathan Ive’s new Newton. It’s a potentially promising form that’s being built about 10 years before Apple has the technology or infrastructure to pull it off in a meaningful way. As a result, the novel interactions that could have made the Apple watch a must-have device aren’t in the company’s launch product, nor are they on the immediate horizon. And all Apple can sell the public on is a few tweets and emails on their wrists—an attempt at a fashion statement that needs to be charged once or more a day.
InfluxDB, Now With Tags and More Unicorns — The combination of these new features [tagging, and the use of tags in queries] makes InfluxDB not just a time series database, but also a database for time series discovery. It’s our solution for making the problem of dealing with hundreds of thousands or millions of time series tractable.
The End of Apps as We Know Them — It may be very likely that the primary interface for interacting with apps will not be the app itself. The app is primarily a publishing tool. The number one way people use your app is through this notification layer, or aggregated card stream. Not by opening the app itself. To which one grumpy O’Reilly editor replied, “cards are the new walled garden.”
Signal 2.0 — Signal uses your existing phone number and address book. There are no separate logins, usernames, passwords, or PINs to manage or lose. We cannot hear your conversations or see your messages, and no one else can either. Everything in Signal is always end-to-end encrypted, and painstakingly engineered in order to keep your communication safe.

Four short links: 17 February 2015

Matthew Effects, Office Dashboards, Below the API, and Robot Economies

by Nat Torkington | @gnat | +Nat Torkington | February 17, 2015

Matthew Effects in Reading (PDF) — Walberg, following Merton, has dubbed those educational sequences where early achievement spawns faster rates of subsequent achievement “Matthew effects,” after the Gospel according to Matthew: “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath” (XXV:29). (via 2015 Troubling Trends and Possibilities in K-12)
Real Time Dashboard for Office Plumbing (Flowing Data) — this is awesome.
Working Below the API is a Dead End (Forbes) — Drivers are opting into a dichotomous workforce: the worker bees below the software layer have no opportunity for on-the-job training that advances their career, and compassionate social connections don’t pierce the software layer either. The skills they develop in driving are not an investment in their future. Once you introduce the software layer between ‘management’ (Uber’s full-time employees building the app and computer systems) and the human workers below the software layer (Uber’s drivers, Instacart’s delivery people), there’s no obvious path upwards. In fact, there’s a massive gap and no systems in place to bridge it. (via John Robb)
The Real Robot Economy and the Bus Ticket Inspector (Guardian) — None of the cinematic worries about machines that take decisions about healthcare or military action are at play here. Hidden in these everyday, mundane interactions are different moral or ethical questions about the future of AI: if a job is affected but not taken over by a robot, how and when does the new system interact with a consumer? Is it ok to turn human social intelligence – managing a difficult customer – into a commodity? Is it ok that a decision lies with a handheld device, while the human is just a mouthpiece? Here, “robots” is the usual shorthand for technology that replaces manual work. (via Dan Hill)

Four short links: 30 January 2015

FAA Rules, Sports UAVs, Woodcut Data, and Concurrent Programming

by Nat Torkington | @gnat | +Nat Torkington | January 30, 2015

FAA to Regulate UAVs? (Forbes) — and the Executive Order will segment the privacy issues related to drones into two categories — public and private. For public drones (that is, drones purchased with federal dollars), the President’s order will establish a series of privacy and transparency guidelines. See also How ESPN is Shooting the X Games with Drones (Popular Mechanics)—it’s all fun and games until someone puts out their eye with a quadrocopter. The tough part will be keeping within the tight restrictions the FAA gave them. Because drones can’t be flown above a crowd, Calcinari says, “We basically had to build a 500-foot radius around them, where the public can’t go.” The drones will fly over sections of the course that are away from the crowds, where only ESPN production employees will be. That rule is part of why we haven’t seen drones at college football games.
Milestones for SaaS Companies — “Getting from $0-1m is impossible. Getting from $1-10m is unlikely. And getting from $10-100m is inevitable.” —Jason Lemkin, ex-CEO of Echosign. The article proposes some significant milestones, and they ring true. Making money is generally hard. The nature of the hard changes with the amount of money you have and the amount you’re trying to make, but if it were easy, then we’d structure our society on something else.
Woodcut Data Visualisation — Recently, I learned how to operate a laser cutter. It’s been a whole lot of fun, and I wanted to share my experiences creating woodcut data visualizations using just D3. I love it when data visualisations break out of the glass rectangle.
Why is Concurrent Programming Hard? — on the one hand there is not a single concurrency abstraction that fits all problems, and on the other hand the various different abstractions are rarely designed to be used in combination with each other. We are due for a revolution in programming, something to help us make sense of the modern systems made of more moving parts than our feeble grey matter can model and intuit about.

Four short links: 28 January 2015

Note and Vote, Gaming Behaviour, Code Search, and Immutabilate All The Things

by Nat Torkington | @gnat | +Nat Torkington | January 28, 2015

Note and Vote (Google Ventures) — nifty meeting hack to surface ideas and identify popular candidates to a decision maker.
Applying Psychology to Improve Online Behaviour — an online game runs massive experiments (with researchers to validate findings) to improve the behaviour of its players. Some of Riot’s experiments are causing the game to evolve. For example, one product is a restricted chat mode that limits the number of messages abusive players can type per match. It’s a temporary punishment that has led to a noticeable improvement in player behavior afterward—on average, individuals who went through a period of restricted chat saw 20 percent fewer abuse reports filed by other players. The restricted chat approach also proved 4 percent more effective at improving player behavior than the usual punishment method of temporarily banning toxic players. Even the smallest improvements in player behavior can make a huge difference in an online game that attracts 67 million players every month.
Hound — open source code search tool from Etsy.
Immutability Changes Everything (PDF) — This paper is simply an amuse-bouche on the repeated patterns of computing that leverage immutability.