"data" entries

Four short links: 13 April 2015

Occupation Changes, Country Data, Cultural Analytics, and Dysfunctional Software Engineering Organisations

  1. The Great Reversal in the Demand for Skill and Cognitive Tasks (PDF) — The only difference with more conventional models of skill-biased technological change is our modelling of the fruits of cognitive employment as creating a stock instead of a pure flow. This slight change causes technological change to generate a boom and bust cycle, as is common in most investment models. We also incorporated into this model a standard selection process whereby individuals sort into occupations based on their comparative advantage. The selection process is the key mechanism that explains why a reduction in the demand for cognitive tasks, which are predominantly filled by higher educated workers, can result in a loss of employment concentrated among lower educated workers. While we do not claim that our model is the only structure that can explain the observations we present, we believe it gives a very simple and intuitive explanation to the changes pre- and post-2000.
  2. provinces — state and province lists for (some) countries.
  3. Cultural Analytics: the use of computational and visualization methods for the analysis of massive cultural data sets and flows. Interesting visualisations as well as automated understandings.
  4. The Code is Just the Symptom: The engineering culture was a three-layer cake of dysfunction, where everyone down the chain had to execute what they knew to be an impossible task, at impossible speeds, perfectly. It was like the games of Simon Says and Telephone combined to bad effect. Most engineers will have flashbacks at these descriptions. Trigger warning: candid descriptions of real immature software organisations.

Four short links: 26 March 2015

GPU Graph Algorithms, Data Sharing, Build Like Google, and Distributed Systems Theory

  1. gunrock: a CUDA library for graph primitives that refactors, integrates, and generalizes best-of-class GPU implementations of breadth-first search, connected components, and betweenness centrality into a unified code base useful for future development of high-performance GPU graph primitives. (via Ben Lorica)
  2. How to Share Data with a Statistician: some instruction on the best way to share data to avoid the most common pitfalls and sources of delay in the transition from data collection to data analysis.
  3. Bazel: a build tool, i.e. a tool that will run compilers and tests to assemble your software, similar to Make, Ant, Gradle, Buck, Pants, and Maven. Google’s build tool, to be precise.
  4. You Can’t Have Exactly-Once Delivery — not about the worst post office ever. FLP and the Two Generals Problem are not design complexities, they are impossibility results.
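A common practical response to that impossibility result is to settle for at-least-once delivery and make the receiver idempotent, so that redelivered messages are harmless. The following is a minimal sketch under that assumption, not the linked article's code: the class, the function, the message IDs, and the simulated lossy channel are all hypothetical names invented for illustration.

```python
import random


class IdempotentReceiver:
    """Applies each message at most once, keyed by a sender-assigned ID (hypothetical)."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def handle(self, message_id, amount):
        # A redelivered message is acknowledged but not applied a second time.
        if message_id not in self.seen_ids:
            self.seen_ids.add(message_id)
            self.balance += amount
        return True  # acknowledgement


def send_at_least_once(receiver, message_id, amount, rng, loss_rate=0.5):
    """Retry until an acknowledgement survives the channel; return deliveries made."""
    deliveries = 0
    while True:
        if rng.random() < loss_rate:
            continue  # the message was lost in transit, so retry
        receiver.handle(message_id, amount)
        deliveries += 1
        if rng.random() < loss_rate:
            continue  # the acknowledgement was lost, so the sender retries
        return deliveries


rng = random.Random(0)  # fixed seed so the simulation is repeatable
receiver = IdempotentReceiver()
total_deliveries = sum(send_at_least_once(receiver, i, 10, rng) for i in range(5))
```

The sender cannot tell a lost message from a lost acknowledgement, so it may deliver the same message several times. The receiver's state still ends up as if each message had arrived once.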

Four short links: 23 March 2015

Agricultural Robots, Business Model Design, Simulations, and Interoperable JSON

  1. Swarmfarm Robotics: His previous weed sprayer weighed 21 tonnes, measured 36 metres across its spray unit, guzzled diesel by the bucketload and needed a paid driver who would only work limited hours. Two robots working together on Bendee effortlessly sprayed weeds in a 70ha mung-bean crop last month. Their infra-red beams picked up any small weeds among the crop rows and sent a message to the nozzle to eject a small chemical spray. Bate hopes to soon use microwave or laser technology to kill the weeds. Best of all, the robots do the work without guidance. They work 24 hours a day. They have in-built navigation and obstacle detection, making them robust and able to decide if an area of a paddock should not be traversed. Special swarming technology means the robots can detect each other and know which part of the paddock has already been assessed and sprayed.
  2. Route to Market (Matt Webb) — The route to market is not what makes the product good. […] So the way you design the product to best take it to market is not the same process to make it great for its users.
  3. Explorable Explanations — points to many sweet examples of interactive explorable simulations/explanations.
  4. I-JSON (Tim Bray) — I-JSON is just a note saying that if you construct a chunk of JSON and avoid the interop failures described in RFC 7159, you can call it an “I-JSON Message.” If any known JSON implementation creates an I-JSON message and sends it to any other known JSON implementation, the chance of software surprises is vanishingly small.
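Two of the interoperability hazards that RFC 7159 calls out are duplicate member names and integers outside the range that IEEE 754 doubles represent exactly. The sketch below rejects both with Python's standard `json` module. It is an illustrative partial check, not a full I-JSON validator, and `parse_strict`, `InteropError`, and the helper names are assumptions made up for this example.

```python
import json

# RFC 7159 notes that integers in [-(2**53)+1, (2**53)-1] are interoperable.
MAX_SAFE_INT = 2**53 - 1


class InteropError(ValueError):
    """Raised when a document parses as JSON but risks interop surprises."""


def _reject_duplicates(pairs):
    # json.loads passes each object's members as a list of (name, value) pairs.
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise InteropError(f"duplicate member name: {key!r}")
        obj[key] = value
    return obj


def _check_int(text):
    # json.loads passes the literal text of every integer it encounters.
    value = int(text)
    if abs(value) > MAX_SAFE_INT:
        raise InteropError(f"integer outside interoperable range: {text}")
    return value


def parse_strict(text):
    """Parse JSON, refusing duplicate names and out-of-range integers."""
    return json.loads(
        text,
        object_pairs_hook=_reject_duplicates,
        parse_int=_check_int,
    )
```

For example, `parse_strict('{"a": 1, "a": 2}')` raises `InteropError`, whereas plain `json.loads` would silently keep the last value.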

Four short links: 11 March 2015

Working Manager, Open Source Server Chassis, Data Context, and Coevolved Design & Users

  1. As a Working Manager (Ian Bicking) — I look forward to every new entry in Ian’s diary, and this one didn’t disappoint. But I’m a working manager. Is now the right time to investigate that odd log message I’m seeing, or to think about who I should talk to about product opportunities? There’s no metric to compare the priority of two tasks that are so far apart. If I am going to find time to do development I am a bit worried I have two options: (1) Keep doing programming after hours; (2) Start dropping some balls as a manager.
  2. Introducing Yosemite (Facebook) — a modular chassis that contains high-powered system-on-a-chip (SoC) processor cards.
  3. The Joyless World of Data-Driven Startups: There is so much invisible, fluid context wrapped around a data point that we are usually unable to fully comprehend exactly what that data represents or means. We often think we know, but we rarely do. But we really WANT it to mean something, because using data in our work is scientific. It’s not our decision that was wrong — we used the data that was available. Data is the ultimate scapegoat.
  4. History of the Urban Dashboard: the dashboard and its user had to evolve in response to one another. The increasing complexity of the flight dashboard necessitated advanced training for pilots — particularly through new flight simulators — and new research on cockpit design.

Four short links: 3 March 2015

Wearable Warning, Time Series Data, App Cards, and Secure Comms

  1. You Guys Realize the Apple Watch is Going to Flop, Right? — leaving aside the “guys” assumption of its readers, you can take this either as a list of the challenges Apple will inevitably overcome or bypass when they release their watch, or (as intended) a list of the many reasons that it’s too damn soon for watches to be useful. The Apple Watch is Jonathan Ive’s new Newton. It’s a potentially promising form that’s being built about 10 years before Apple has the technology or infrastructure to pull it off in a meaningful way. As a result, the novel interactions that could have made the Apple watch a must-have device aren’t in the company’s launch product, nor are they on the immediate horizon. And all Apple can sell the public on is a few tweets and emails on their wrists—an attempt at a fashion statement that needs to be charged once or more a day.
  2. InfluxDB, Now With Tags and More Unicorns: The combination of these new features [tagging, and the use of tags in queries] makes InfluxDB not just a time series database, but also a database for time series discovery. It’s our solution for making the problem of dealing with hundreds of thousands or millions of time series tractable.
  3. The End of Apps as We Know Them: It may be very likely that the primary interface for interacting with apps will not be the app itself. The app is primarily a publishing tool. The number one way people use your app is through this notification layer, or aggregated card stream. Not by opening the app itself. To which one grumpy O’Reilly editor replied, “cards are the new walled garden.”
  4. Signal 2.0: Signal uses your existing phone number and address book. There are no separate logins, usernames, passwords, or PINs to manage or lose. We cannot hear your conversations or see your messages, and no one else can either. Everything in Signal is always end-to-end encrypted, and painstakingly engineered in order to keep your communication safe.

Four short links: 17 February 2015

Matthew Effects, Office Dashboards, Below the API, and Robot Economies

  1. Matthew Effects in Reading (PDF) — Walberg, following Merton, has dubbed those educational sequences where early achievement spawns faster rates of subsequent achievement “Matthew effects,” after the Gospel according to Matthew: “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath” (XXV:29) (via 2015 Troubling Trends and Possibilities in K-12)
  2. Real Time Dashboard for Office Plumbing (Flowing Data) — this is awesome.
  3. Working Below the API is a Dead End (Forbes) — Drivers are opting into a dichotomous workforce: the worker bees below the software layer have no opportunity for on-the-job training that advances their career, and compassionate social connections don’t pierce the software layer either. The skills they develop in driving are not an investment in their future. Once you introduce the software layer between ‘management’ (Uber’s full-time employees building the app and computer systems) and the human workers below the software layer (Uber’s drivers, Instacart’s delivery people), there’s no obvious path upwards. In fact, there’s a massive gap and no systems in place to bridge it. (via John Robb)
  4. The Real Robot Economy and the Bus Ticket Inspector (Guardian) — None of the cinematic worries about machines that take decisions about healthcare or military action are at play here. Hidden in these everyday, mundane interactions are different moral or ethical questions about the future of AI: if a job is affected but not taken over by a robot, how and when does the new system interact with a consumer? Is it ok to turn human social intelligence – managing a difficult customer – into a commodity? Is it ok that a decision lies with a handheld device, while the human is just a mouthpiece? Where “robots” is the usual shorthand for technology that replaces manual work. (via Dan Hill)

Four short links: 30 January 2015

FAA Rules, Sports UAVs, Woodcut Data, and Concurrent Programming

  1. FAA to Regulate UAVs? (Forbes) — and the Executive Order will segment the privacy issues related to drones into two categories — public and private. For public drones (that is, drones purchased with federal dollars), the President’s order will establish a series of privacy and transparency guidelines. See also How ESPN is Shooting the X Games with Drones (Popular Mechanics)—it’s all fun and games until someone puts out their eye with a quadrocopter. The tough part will be keeping within the tight restrictions the FAA gave them. Because drones can’t be flown above a crowd, Calcinari says, “We basically had to build a 500-foot radius around them, where the public can’t go.” The drones will fly over sections of the course that are away from the crowds, where only ESPN production employees will be. That rule is part of why we haven’t seen drones at college football games.
  2. Milestones for SaaS Companies: “Getting from $0-1m is impossible. Getting from $1-10m is unlikely. And getting from $10-100m is inevitable.” —Jason Lemkin, ex-CEO of Echosign. The article proposes some significant milestones, and they ring true. Making money is generally hard. The nature of the hard changes with the amount of money you have and the amount you’re trying to make, but if it were easy, then we’d structure our society on something else.
  3. Woodcut Data Visualisation: Recently, I learned how to operate a laser cutter. It’s been a whole lot of fun, and I wanted to share my experiences creating woodcut data visualizations using just D3. I love it when data visualisations break out of the glass rectangle.
  4. Why is Concurrent Programming Hard? On the one hand there is not a single concurrency abstraction that fits all problems, and on the other hand the various different abstractions are rarely designed to be used in combination with each other. We are due for a revolution in programming, something to help us make sense of the modern systems made of more moving parts than our feeble grey matter can model and intuit about.

Four short links: 28 January 2015

Note and Vote, Gaming Behaviour, Code Search, and Immutabilate All The Things

  1. Note and Vote (Google Ventures) — nifty meeting hack to surface ideas and identify popular candidates to a decision maker.
  2. Applying Psychology to Improve Online Behaviour — online game runs massive experiments (w/researchers to validate findings) to improve the behaviour of their players. Some of Riot’s experiments are causing the game to evolve. For example, one product is a restricted chat mode that limits the number of messages abusive players can type per match. It’s a temporary punishment that has led to a noticeable improvement in player behavior afterward —on average, individuals who went through a period of restricted chat saw 20 percent fewer abuse reports filed by other players. The restricted chat approach also proved 4 percent more effective at improving player behavior than the usual punishment method of temporarily banning toxic players. Even the smallest improvements in player behavior can make a huge difference in an online game that attracts 67 million players every month.
  3. Hound — open source code search tool from Etsy.
  4. Immutability Changes Everything (PDF) — This paper is simply an amuse-bouche on the repeated patterns of computing that leverage immutability.

Four short links: 23 January 2015

Investment Themes, Python Web Mining, Code Review, and Sexist Brilliance

  1. 16 Andreessen-Horowitz Investment Areas — I’m struck by how they’re connected: there’s a cluster around cloud development, there are two maybe three on sensors …
  2. Pattern: a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, a HTML DOM parser), natural language processing (part-of-speech taggers, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, clustering, SVM), network analysis and <canvas> visualization.
  3. Code Review — FogCreek’s code review checklist.
  4. Expectations of Brilliance Underlie Gender Distributions Across Academic Disciplines (Science) — Surveys revealed that some fields are believed to require attributes such as brilliance and genius, whereas other fields are believed to require more empathy or hard work. In fields where people thought that raw talent was required, academic departments had lower percentages of women. (via WaPo)

Four short links: 7 January 2015

Program Synthesis, Data Culture, Metrics, and Information Biology

  1. Program Synthesis Explained: The promise of program synthesis is that programmers can stop telling computers how to do things, and focus instead on telling them what they want to do. Inductive program synthesis tackles this problem with fairly vague specifications and, although many of the algorithms seem intractable, in practice they work remarkably well.
  2. Creating a Data-Driven Culture — new (free!) ebook from Hilary Mason and DJ Patil. The editor of that team is the luckiest human being alive.
  3. Ev Williams on Metrics — a master-class in how to think about and measure what matters. If what you care about — or are trying to report on — is impact on the world, it all gets very slippery. You’re not measuring a rectangle, you’re measuring a multi-dimensional space. You have to accept that things are very imperfectly measured and just try to learn as much as you can from multiple metrics and anecdotes.
  4. Nature, the IT Wizard (Nautilus) — a fun walk through the connections between information theory, computation, and biology.
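To make the inductive program synthesis idea from the first link concrete, here is a toy sketch that treats a handful of input/output examples as the "fairly vague specification" and searches for the shortest program that fits them. Everything here is an assumption for illustration, including the four-operation mini-language and the `synthesize` and `run` names; real synthesizers use far smarter search than brute-force enumeration.

```python
from itertools import product

# Hypothetical mini-language: a program is a sequence of these unary operations.
OPERATIONS = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}


def run(program, value):
    """Apply each named operation in order to the starting value."""
    for name in program:
        value = OPERATIONS[name](value)
    return value


def synthesize(examples, max_length=4):
    """Return the first shortest program consistent with every example, or None."""
    for length in range(max_length + 1):
        # Enumerate every operation sequence of this length.
        for program in product(OPERATIONS, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# Specification by example: 1 -> 4, 2 -> 9, 3 -> 16.
found = synthesize([(1, 4), (2, 9), (3, 16)])
# found == ("inc", "square"), i.e. (x + 1) squared
```

The search space grows exponentially with program length, which is why the approach looks intractable on paper. Short target programs and a few pruning examples keep this toy version fast.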