"machine learning" entries

Challenges facing predictive APIs

Solutions to a number of problems must be found to unlock the value of predictive APIs (PAPIs).

In November, the first International Conference on Predictive APIs and Apps will take place in Barcelona, just ahead of Strata Barcelona. This event will bring together those who are building intelligent web services (sometimes called Machine Learning as a Service) with those who would like to use these services to build predictive apps, which, as defined by Forrester, deliver “the right functionality and content at the right time, for the right person, by continuously learning about them and predicting what they’ll need.”

This is a very exciting area. Machine learning of various sorts is revolutionizing many areas of business, and predictive services like the ones at the center of predictive APIs (PAPIs) have the potential to bring these capabilities to an even wider range of applications. I co-founded one of the first companies in this space (acquired by Salesforce in 2012), and I remain optimistic about the future of these efforts. But the field as a whole faces a number of challenges, for which the answers are neither easy nor obvious, that must be addressed before this value can be unlocked.

In the remainder of this post, I’ll enumerate what I see as the most pressing issues. I hope that the speakers and attendees at PAPIs will keep these in mind as they map out the road ahead.

Four short links: 30 September 2014

Continuous Testing, Programmable Bees, Deep Learning on GPUs, and Silk Road Numbers

  1. Continuously Testing Infrastructure — “infrastructure as code”. I can’t figure out whether what I feel are thrills or chills.
  2. Engineer Sees Big Possibilities in Micro-robots, Including Programmable Bees (National Geographic) — He and fellow researchers devised novel techniques to fabricate, assemble, and manufacture the miniature machines, each with a housefly-size thorax, three-centimeter (1.2-inch) wingspan, and weight of just 80 milligrams (.0028 ounces). The latest prototype rises on a thread-thin tether, flaps its wings 120 times a second, hovers, and flies along preprogrammed paths. (via BoingBoing)
  3. cuDNN — NVIDIA’s library of primitives for deep neural networks (on GPUs, natch). Not open source (registerware).
  4. Analysing Trends in Silk Road 2.0 — If, indeed every sale can map to a transaction, some vendors are doing huge amounts of business through mail order drugs. While the number is small, if we sum up all the product reviews x product prices, we get a huge number of USD $20,668,330.05. REMEMBER! This is on Silk Road 2.0 with a very small subset of their entire inventory. A peek into a largely invisible economy.
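
The estimate quoted in item 4 treats each product review as one sale and sums reviews x price over all listings. A minimal sketch of that arithmetic, assuming a simple list of listings; the product names, prices, and review counts below are hypothetical placeholders, not data from the analysis:

```python
from decimal import Decimal

# Hypothetical listings: (product name, price in USD, number of reviews).
# Illustrative values only; none of these come from the Silk Road 2.0 data.
listings = [
    ("item-a", Decimal("25.00"), 12),
    ("item-b", Decimal("110.50"), 3),
    ("item-c", Decimal("8.75"), 40),
]


def estimated_revenue(rows):
    """Sum price * review_count, assuming one review corresponds to one sale."""
    return sum((price * reviews for _, price, reviews in rows), Decimal("0"))


total = estimated_revenue(listings)
print(f"Estimated revenue: USD ${total:,.2f}")
```

Decimal is used here rather than float so that currency sums stay exact. The figure is only as good as the one-review-per-sale assumption, which the quoted analysis itself flags.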

Four short links: 29 September 2014

Feedback Surprises, Ownership Changes, Teaching Lessons, and 3D Retail

  1. How Community Feedback Shapes Behaviour (PDF) — Not only do authors of negatively-evaluated content contribute more, but also their future posts are of lower quality, and are perceived by the community as such. Moreover, these authors are more likely to subsequently evaluate their fellow users negatively, percolating these effects through the community. In contrast, positive feedback does not carry similar effects, and neither encourages rewarded authors to write more, nor improves the quality of their posts. Interestingly, the authors that receive no feedback are most likely to leave a community. Furthermore, a structural analysis of the voter network reveals that evaluations polarize the community the most when positive and negative votes are equally split.
  2. When Everything Works Like Your Cell Phone (The Atlantic) — our relationship to ownership is about to undergo a wild transformation.
  3. Teaching Me Softly — article of anecdotes drawing parallels between case studies in machine learning and things we know about human learning.
  4. SuperAwesome Me (3D Print) — Walmart to install 3D scanning booths and 3D printers so you can put your own head on a Hasbro action figure. Hasbro have the religion: they also paired with Shapeways for superfanart.com. (via John Battelle)

Four short links: 26 September 2014

Good Communities, AI Games, Design Process, and Web Server Library

  1. 15 Lessons from 15 Years of Blogging (Anil Dash) — If your comments are full of assholes, it’s your fault. Good communities don’t just happen by accident.
  2. Replicating DeepMind — an open source attempt to build a deep learning network that can play Atari games. (via RoboHub)
  3. ToyTalk — fantastic iterative design process for the product (see the heading “A Bit of Trickery”)
  4. h2o — an optimized HTTP server implementation that can be used either as a standalone server or a library.

Four short links: 19 September 2014

Deep Learning Bibliography, Go Playground, Tweet-a-Program, and Memory Management

  1. Deep Learning Bibliography — an annotated bibliography of recent publications (2014-) related to Deep Learning.
  2. Inside the Go Playground — on safely offering a REPL over the web to strangers.
  3. Wolfram Tweet-a-Program — clever marketing trick, and reminiscent of Perl Golf-style “how much can you fit into how little” contests.
  4. Memory Management Reference — almost all you ever wanted to know about memory management.

Four short links: 15 September 2014

Weird Machines, Libraries May Scan, Causal Effects, and Crappy Dashboards

  1. The Care and Feeding of Weird Machines Found in Executable Metadata (YouTube) — talk from 29th Chaos Communication Congress, on tricking the ELF linker/loader into arbitrary computation from the metadata supplied. Yes, there’s a brainfuck compiler that turns code into metadata which is then, through a supernatural mix of pixies, steam engines, and binary, executed. This will make your brain leak. Weird machines are everywhere.
  2. European Libraries May Digitise Books Without Permission — “The right of libraries to communicate, by dedicated terminals, the works they hold in their collections would risk being rendered largely meaningless, or indeed ineffective, if they did not have an ancillary right to digitize the works in question,” the court said. Even if the rights holder offers a library the possibility of licensing his works on appropriate terms, the library can use the exception to publish works on electronic terminals, the court ruled. “Otherwise, the library could not realize its core mission or promote the public interest in promoting research and private study,” it said.
  3. CausalImpact (GitHub) — Google’s R package for estimating the causal effect of a designed intervention on a time series. (via Google Open Source Blog)
  4. Laws of Crappy Dashboards — (caution, NSFW language … “crappy” is my paraphrase) so true. Not talking to users will result in a [crappy] dashboard. You don’t know if the dashboard is going to be useful. But you don’t talk to the users to figure it out. Or you just show it to them for a minute (with someone else’s data), never giving them a chance to figure out what the hell they could do with it if you gave it to them.

Four short links: 12 September 2014

Knowledge Graphs, Multi-Language Declarations, Monitoring, and More Monitoring

  1. Google Knowledge Vault and Topic Modeling — recap of talks by Google and Facebook staff about how they use their knowledge graphs. I found this super-interesting.
  2. djinni — A tool for generating cross-language type declarations and interface bindings.
  3. monit — a small Open Source utility for managing and monitoring Unix systems. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.
  4. perf-tooling — List of performance analysis, monitoring and optimization tools.

Four short links: 1 September 2014

Sibyl, Bitrot, Estimation, and ssh

  1. Sibyl: Google’s System for Large Scale Machine Learning (YouTube) — keynote at DSN2014 acting as an intro to Sibyl. (via KD Nuggets)
  2. Bitrot from 1997 — That’s 205 failures, an actual link rot figure of 91%, not 57%. That leaves only 21 URLs as 200 OK and containing effectively the same content.
  3. What We Do And Don’t Know About Software Effort Estimation — nice rundown of research in the field.
  4. fabric — simple yet powerful ssh library for Python.
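
The 91% link rot figure quoted in item 2 can be checked with a short calculation. This is a sketch that assumes the 205 failures and the 21 working URLs together make up the whole sample; the excerpt does not state the total, and the function name is made up for illustration:

```python
def link_rot_rate(failures: int, working: int) -> float:
    """Return the fraction of links that failed.

    Assumes failures + working covers every URL in the sample.
    """
    total = failures + working
    if total == 0:
        raise ValueError("no URLs in sample")
    return failures / total


# Counts taken from the quoted excerpt: 205 failures, 21 URLs still serving
# effectively the same content.
rate = link_rot_rate(205, 21)
print(f"Link rot: {rate:.0%}")
```

Under that assumption, 205 of 226 URLs failed, which rounds to the 91% given in the excerpt.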

Four short links: 27 August 2014

Discourse 1.0, Programmable Matter, Versioned Databases, and What Humans Learned About Machine Learning

  1. Discourse turns 1.0 — community/forum software that doesn’t suck.
  2. Programmable Matter (IEEE Spectrum) — recap of where research is going in this area.
  3. Liquibase — source control for your database. Apache 2.0 licensed.
  4. A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.

Four short links: 20 August 2014

Plant Properties, MQ Comparisons, 1915 Vis, and Mobile Web Weaknesses

  1. Machine Learning for Plant Properties — startup building database of plant genomics, properties, research, etc. for mining. The more familiar you are with your data and its meaning, the better your machine learning will be at suggesting fruitful lines of query … and the more valuable your startup will be.
  2. Dissecting Message Queues — throughput, latency, and qualitative comparison of different message queues. MQs are to modern distributed architectures what function calls were to historic unibox architectures.
  3. 1915 Data Visualization Rules — a reminder that data visualization is not new, but research into effectiveness of alternative presentation styles is.
  4. The Broken Promise of the Mobile Web — it’s not just about the UI – it’s also about integration with the mobile device.