- Inside Data Brokers — very readable explanation of data brokers and how their information is used to track advertising effectiveness.
- Elon, I Want My Data! — Tesla don’t give you access to the data that your car collects. Bodes poorly for the Internet of Sealed Boxes. (via BoingBoing)
- Pattern Classification (Github) — collection of tutorials and examples for solving and understanding machine learning and pattern classification tasks.
- HOGWILD! (PDF) — the algorithm that Microsoft credit with the success of their Adam deep learning system.
ENTRIES TAGGED "deep learning"
Step-by-step instruction on training your own neural network.
When I first became interested in using deep learning for computer vision I found it hard to get started. There were only a couple of open source projects available, they had little documentation, were very experimental, and relied on a lot of tricky-to-install dependencies. A lot of new projects have appeared since, but they’re still aimed at vision researchers, so you’ll still hit a lot of the same obstacles if you’re approaching them from outside the field.
In this article — and the accompanying webcast — I’m going to show you how to run a pre-built network, and then take you through the steps of training your own. I’ve listed the steps I followed to set up everything toward the end of the article, but because the process is so involved, I recommend you download a Vagrant virtual machine that I’ve pre-loaded with everything you need. This VM lets us skip over all the installation headaches and focus on building and running the neural networks.
Data Brokers, Car Data, Pattern Classification, and Hogwild Deep Learning
Announcing a new series delving into deep learning and the inner workings of neural networks.
Editor’s note: this post is part of our Intelligence Matters investigation.
When I first ran across the results in the Kaggle image-recognition competitions, I didn’t believe them. I’ve spent years working with machine vision, and the reported accuracy on tricky tasks like distinguishing dogs from cats was beyond anything I’d seen, or imagined I’d see anytime soon. To understand more, I reached out to one of the competitors, Daniel Nouri, and he demonstrated how he used the Decaf open-source project to do so well. Even better, he showed me how he was quickly able to apply it to a whole bunch of other image-recognition problems we had at Jetpac, and produce much better results than my conventional methods.
I’ve never encountered such a big improvement from a technique that was largely unheard of just a couple of years before, so I became obsessed with understanding more. To be able to use it commercially across hundreds of millions of photos, I built my own specialized library to efficiently run prediction on clusters of low-end machines and embedded devices, and I also spent months learning the dark arts of training neural networks. Now I’m keen to share some of what I’ve found, so if you’re curious about what on earth deep learning is, and how it might help you, I’ll be covering the basics in a series of blog posts here on Radar, and in a short upcoming ebook.
Scanner Malware, Cognitive Biases, Deep Learning, and Community Metrics
- Handheld Scanners Attack — shipping and logistics operations compromised by handheld scanners running malware-infested Windows XP.
- Adventures in Cognitive Biases (MIT) — web adventure to build your cognitive defences against biases.
- Quoc Le’s Lectures on Deep Learning — Machine Learning Summer School videos (4k!) of the deep learning lectures by Google Brain team member Quoc Le.
- FLOSS Community Metrics Talks — upcoming event at Puppet Labs in Portland. I hope they publish slides and video!
Trusting Code, Deep Pi, Docker DevOps, and Secure Database
- Trusting Browser Code (Tim Bray) — on the fundamental weakness of the ‘net as manifest in the browser.
- Deep Learning in the Raspberry Pi (Pete Warden) — $30 now gets you a computer you can run deep learning algorithms on. Awesome.
- Announcing Docker Hub and Official Repositories — as Docker reaches 1.0 and people rave about how they use it, along comes this. They’re thinking hard about “integrating into the build ship run loop”, which aligns well with DevOps-enabling tool use.
- Apple’s Secure Database for Users (Ian Waring) — excellent breakdown of how Apple have gone out of their way to make their cloud database product safe and robust. They may be slow to “the cloud” but they have decades of experience having users as customers instead of products.
Machine Learning, Deep Learning, Sewing Machines & 3D Printers, and Smart Spoons
- Basics of Machine Learning Course Notes — slides and audio from university course. Watch along on YouTube.
- A Primer on Deep Learning — a very quick catch-up on WTF this is all about.
- 3D Printers Have a Lot to Learn from Sewing Machines — Sewing does not create more waste but, potentially, less, and the process of sewing is filled with opportunities for increasing one’s skills and doing it over as well as doing it yourself. What are quilts, after all, but a clever way to use every last scrap of precious fabric? (via Jenn Webb)
- Liftware — Parkinson’s-correcting spoons.
Internet of Listeners, Mobile Deep Belief, Crowdsourced Spectrum Data, and Quantum Minecraft
- Jasper Project — an open source platform for developing always-on, voice-controlled applications. Shouting is the new swiping—I eagerly await Gartner touting the Internet-of-things-that-misunderstand-you.
- DeepBeliefSDK — deep neural network library for iOS. (via Pete Warden)
- Microsoft Spectrum Observatory — crowdsourcing spectrum utilisation information. Just open sourced their code.
- qcraft — beginner’s guide to quantum physics in Minecraft. (via Nelson Minar)
Google Flu, Embeddable JS, Data Analysis, and Belief in the Browser
- The Parable of Google Flu (PDF) — We explore two issues that contributed to [Google Flu Trends]’s mistakes—big data hubris and algorithm dynamics—and offer lessons for moving forward in the big data age. Overtrained and underfed?
- Principles of Good Data Analysis (Greg Reda) — Once you’ve settled on your approach and data sources, you need to make sure you understand how the data was generated or captured, especially if you are using your own company’s data. Treble so if you are using data you snaffled off the net, riddled with collection bias and untold omissions. (via Stijn Debrouwere)
Flexible Data, Google's Bottery, GPU Assist Deep Learning, and Open Sourcing
- Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
- Hebel (Github) — GPU-Accelerated Deep Learning Library in Python.
- What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.