"deep learning" entries

Bridging the divide: Business users and machine learning experts

The O'Reilly Data Show Podcast: Alice Zheng on feature representations, model evaluation, and machine learning models.

Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science.

606px-IBM_Electronic_Data_Processing_Machine_-_GPN-2000-001881As tools for advanced analytics become more accessible, data scientist’s roles will evolve. Most media stories emphasize a need for expertise in algorithms and quantitative techniques (machine learning, statistics, probability), and yet the reality is that expertise in advanced algorithms is just one aspect of industrial data science.

During the latest episode of the O’Reilly Data Show podcast, I sat down with Alice Zheng, one of Strata + Hadoop World’s most popular speakers. She has a gift for explaining complex topics to a broad audience, through presentations and in writing. We talked about her background, techniques for evaluating machine learning models, how much math data scientists need to know, and the art of interacting with business users.

Making machine learning accessible

People who work at getting analytics adopted and deployed learn early on the importance of working with domain/business experts. As excited as I am about the growing number of tools that open up analytics to business users, the interplay between data experts (data scientists, data engineers) and domain experts remains important. In fact, human-in-the-loop systems are being used in many critical data pipelines. Zheng recounts her experience working with business analysts:

It’s not enough to tell someone, “This is done by boosted decision trees, and that’s the best classification algorithm, so just trust me, it works.” As a builder of these applications, you need to understand what the algorithm is doing in order to make it better. As a user who ultimately consumes the results, it can be really frustrating to not understand how they were produced. When we worked with analysts in Windows or in Bing, we were analyzing computer system logs. That’s very difficult for a human being to understand. We definitely had to work with the experts who understood the semantics of the logs in order to make progress. They had to understand what the machine learning algorithms were doing in order to provide useful feedback. Read more…

Comments: 4

Unsupervised learning, attention, and other mysteries

How to almost necessarily succeed: An interview with Google research scientist Ilya Sutskever.

Get notified when our free report “Future of Machine Intelligence: Perspectives from Leading Practitioners” is available for download. The following interview is one of many that will be included in the report.

633px-Jan_Steen_-_A_School_for_Boys_and_Girls_-_Google_Art_ProjectIlya Sutskever is a research scientist at Google and the author of numerous publications on neural networks and related topics. Sutskever is a co-founder of DNNresearch and was named Canada’s first Google Fellow.

Key Takeaways:

  1. Since humans can solve perception problems very quickly, despite our neurons being relatively slow, moderately deep and large neural networks have enabled machines to succeed in a similar fashion.
  2. Unsupervised learning is still a mystery, but a full understanding of that domain has the potential to fundamentally transform the field of machine learning.
  3. Attention models represent a promising direction for powerful learning algorithms that require ever less data to be successful on harder problems.

David Beyer: Let’s start with your background. What was the evolution of your interest in machine learning, and how did you zero-in on your Ph.D. work?

Ilya Sutskever: I started my Ph.D. just before deep learning became a thing. I was working on a number of different projects, mostly centered around neural networks. My understanding of the field crystallized when collaborating with James Martins on the Hessian-free optimizer. At the time, greedy layer-wise training (training one layer at a time) was extremely popular. Working on the Hessian-free optimizer helped me understand that if you just train a very large and deep neural network on a lot of data, you will almost necessarily succeed. Read more…

Comment: 1

Building intelligent machines

To understand deep learning, let’s start simple.


Use code DATA50 to get 50% off of the new early release of “Fundamentals of Deep Learning: Designing Next-Generation Artificial Intelligence Algorithms.” Editor’s note: This is an excerpt of “Fundamentals of Deep Learning,” by Nikhil Buduma.

The brain is the most incredible organ in the human body. It dictates the way we perceive every sight, sound, smell, taste, and touch. It enables us to store memories, experience emotions, and even dream. Without it, we would be primitive organisms, incapable of anything other than the simplest of reflexes. The brain is, inherently, what makes us intelligent.

The infant brain only weighs a single pound, but somehow, it solves problems that even our biggest, most powerful supercomputers find impossible. Within a matter of days after birth, infants can recognize the faces of their parents, discern discrete objects from their backgrounds, and even tell apart voices. Within a year, they’ve already developed an intuition for natural physics, can track objects even when they become partially or completely blocked, and can associate sounds with specific meanings. And by early childhood, they have a sophisticated understanding of grammar and thousands of words in their vocabularies.

For decades, we’ve dreamed of building intelligent machines with brains like ours — robotic assistants to clean our homes, cars that drive themselves, microscopes that automatically detect diseases. But building these artificially intelligent machines requires us to solve some of the most complex computational problems we have ever grappled with, problems that our brains can already solve in a manner of microseconds. To tackle these problems, we’ll have to develop a radically different way of programming a computer using techniques largely developed over the past decade. This is an extremely active field of artificial computer intelligence often referred to as deep learning. Read more…

Comments: 3
Four short links: 6 July 2015

Four short links: 6 July 2015

DeepDream, In-Flight WiFi, Computer Vision in Preservation, and Testing Distributed Systems

  1. DeepDream — the software that’s been giving the Internet acid-free trips.
  2. In-Flight WiFi Business — numbers and context for why some airlines (JetBlue) have fast free in-flight wifi while others (Delta) have pricey slow in-flight wifi. Four years ago ViaSat-1 went into geostationary orbit, putting all other broadband satellites to shame with 140 Gbps of total capacity. This is the Ka-band satellite that JetBlue’s fleet connects to, and while the airline has to share that bandwidth with homes across of North America that subscribe to ViaSat’s Excede residential broadband service, it faces no shortage of capacity. That’s why JetBlue is able to deliver 10-15 Mbps speeds to its passengers.
  3. British Library Digitising Newspapers (The Guardian) — as well as photogrammetry methods used in the Great Parchment Book project, Terras and colleagues are exploring the potential of a host of techniques, including multispectral imaging (MSI). Inks, pencil marks, and paper all reflect, absorb, or emit particular wavelengths of light, ranging from the infrared end of the electromagnetic spectrum, through the visible region and into the UV. By taking photographs using different light sources and filters, it is possible to generate a suite of images. “We get back this stack of about 40 images of the [document] and then we can use image-processing to try to see what is in [some of them] and not others,” Terras explains.
  4. Testing a Distributed System (ACM) — This article discusses general strategies for testing distributed systems as well as specific strategies for testing distributed data storage systems.
Four short links: 18 July 2015

Four short links: 18 July 2015

WebAssembly, Generative Neural Nets, Automated Workplace, and Conversational UIs

  1. WebAssembly (Luke Wagner) — new standard, WebAssembly, that defines a portable, size- and load-time-efficient format and execution model specifically designed to serve as a compilation target for the Web. Being worked on by Mozilla, Google, Microsoft, and Apple.
  2. Inceptionism: Going Deeper into Neural Networks (Google Research) — stunningly gorgeous gallery of images made by using a deep image-classification neural net to make the picture “more.” (So, if the classifier says the pic is of a cat, randomly twiddle pixels until the image classifier says “wow, that matches `cat’ even better!”)
  3. The Automated Workplace (Ben Brown) — What happens if this process is automated using a “bot” in an environment like Slack? — repeat for all business processes. (via Matt Webb)
  4. Conversational UIs (Matt Webb) — a new medium needs a new grammar and conversational UIs are definitely a new medium. As someone whose wedding vows were exchanged on a TinyMUSH, conversational UIs are near and dear to my heart.
Four short links: 15 June 2015

Four short links: 15 June 2015

Streams at Scale, Molecular Programming, Formal Verification, and Deep Learning's Flaws

  1. Twitter Heron: Stream Processing at Scale (Paper a Day) — very readable summary of Apache Storm’s failings, and Heron’s improvements.
  2. Molecular Programming Projectaims to develop computer science principles for programming information-bearing molecules like DNA and RNA to create artificial biomolecular programs of similar complexity. Our long-term vision is to establish molecular programming as a subdiscipline of computer science — one that will enable a yet-to-be imagined array of applications from chemical circuitry for interacting with biological molecules to nanoscale computing and molecular robotics.
  3. The Software Analysis Workbenchprovides the ability to formally verify properties of code written in C, Java, and Cryptol. It leverages automated SAT and SMT solvers to make this process as automated as possible, and provides a scripting language, called SAW Script, to enable verification to scale up to more complex systems. “Non-commercial” license.
  4. What’s Wrong with Deep Learning? (PDF in Google Drive) — What’s missing from deep learning? 1. Theory; 2. Reasoning, structured prediction; 3. Memory, short-term/working/episodic memory; 4. Unsupervised learning that actually works. … and then ways to get those things. Caution: math ahead.
Comment: 1
Four short links: 27 May 2015

Four short links: 27 May 2015

Domo Arigato Mr Google, Distributed Graph Processing, Experiencing Ethics, and Deep Learning Robots

  1. Roboto — Google’s signature font is open sourced (Apache 2.0), including the toolchain to build it.
  2. Pregel: A System for Large Scale Graph Processing — a walk through a key 2010 paper from Google, on the distributed graph system that is the inspiration for Apache Giraph and which sits under PageRank.
  3. How to Turn a Liberal Hipster into a Global Capitalist (The Guardian) — In Zoe Svendsen’s play “World Factory at the Young Vic,” the audience becomes the cast. Sixteen teams sit around factory desks playing out a carefully constructed game that requires you to run a clothing factory in China. How to deal with a troublemaker? How to dupe the buyers from ethical retail brands? What to do about the ever-present problem of clients that do not pay? […] And because the theatre captures data on every choice by every team, for every performance, I know we were not alone. The aggregated flowchart reveals that every audience, on every night, veers toward money and away from ethics. I’m a firm believer that games can give you visceral experience, not merely intellectual knowledge, of an activity. Interesting to see it applied so effectively to business.
  4. End to End Training of Deep Visuomotor Policies (PDF) — paper on using deep learning to teach robots how to manipulate objects, by example.
Four short links: 25 May 2015

Four short links: 25 May 2015

8 (Bits) Is Enough, Second Machine Age, LLVM OpenMP, and Javascript Graphs

  1. Why Are Eight Bits Enough for Deep Neural Networks? (Pete Warden) — It turns out that neural networks are different. You can run them with eight-bit parameters and intermediate buffers, and suffer no noticeable loss in the final results. This was astonishing to me, but it’s something that’s been re-discovered over and over again.
  2. The Great Decoupling (HBR) — The Second Machine Age is playing out differently than the First Machine Age, continuing the long-term trend of material abundance but not of ever-greater labor demand.
  3. OpenMP Support in LLVMOpenMP enables Clang users to harness full power of modern multi-core processors with vector units. Pragmas from OpenMP 3.1 provide an industry standard way to employ task parallelism, while ‘#pragma omp simd’ is a simple yet flexible way to enable data parallelism (aka vectorization).
  4. JS Graphs — a visual catalogue (with search) of Javascript graphing libraries.
Four short links: 18 March 2015

Four short links: 18 March 2015

Moonshots, Decacorns, Leadership, and Deep Learning

  1. How to Make Moonshots (Astro Teller) — Expecting a person to be a reliable backup for the [self-driving car] system was a fallacy. Once people trust the system, they trust it. Our success was itself a failure. We came quickly to the conclusion that we needed to make it clear to ourselves that the human was not a reliable backup — the car had to always be able to handle the situation. And the best way to make that clear was to design a car with no steering wheel — a car that could drive itself all of the time, from point A to point B, at the push of a button.
  2. Billion-Dollar Math (Bloomberg) — There’s a new buzzword, “decacorn,” for those over $10 billion, which includes Airbnb, Dropbox, Pinterest, Snapchat, and Uber. It’s a made-up word based on a creature that doesn’t exist. “If you wake up in a room full of unicorns, you are dreaming,” Todd Dagres, a founding partner at Spark Capital, recently told Bloomberg News. Not just cute seeing our industry explained to the unwashed, but it’s the first time I’d seen decacorn. (The weather’s just dandy in my cave, thanks for asking).
  3. What Impactful Engineering Leadership Looks Like — aside from the ugliness of “impactful,” notable for good advice. “When engineering management is done right, you’re focusing on three big things,” she says. “You’re directly supporting the people on your team; you’re managing execution and coordination across teams; and you’re stepping back to observe and evolve the broader organization and its processes as it grows.”
  4. cxxnet“a fast, concise, distributed deep learning framework” that scales beyond a single GPU.
Four short links: 12 March 2015

Four short links: 12 March 2015

Billion Node Graphs, Asynchronous Systems, Deep Learning Hardware, and Vision Resources

  1. Mining Billion Node Graphs: Patterns and Scalable Algorithms (PDF) — slides from a CMU academic’s talk at C-BIG 2012.
  2. There Is No NowOne of the most important results in the theory of distributed systems is an impossibility result, showing one of the limits of the ability to build systems that work in a world where things can fail. This is generally referred to as the FLP result, named for its authors, Fischer, Lynch, and Paterson. Their work, which won the 2001 Dijkstra Prize for the most influential paper in distributed computing, showed conclusively that some computational problems that are achievable in a “synchronous” model in which hosts have identical or shared clocks are impossible under a weaker, asynchronous system model.
  3. Deep Learning Hardware GuideOne of the worst things you can do when building a deep learning system is to waste money on hardware that is unnecessary. Here I will guide you step by step through the hardware you will need for a cheap high performance system.
  4. Awesome Computer Vision — curated list of computer vision resources.