"machine learning" entries

Four short links: 13 November 2015

CEO Optimism, Fibbing Networking, GPU TensorFlow, and GUI Font Design

  1. CEO Optimism: CEOs always act on leading indicators of good news, but only act on lagging indicators of bad news. (Andy Grove)
  2. Fibbing — lie to your router table to get the most from your network. Clever!
  3. TensorFlow for GPUs — Amazon image of TensorFlow ready to run on their GPU compute cloud.
  4. metaflop — UI for metafont that makes it super-easy to design your own sweet-looking font. (via BoingBoing)
Four short links: 10 November 2015

TensorFlow Released, TensorFlow Described, Neural Networks Optimized, Cybersecurity as RealPolitik

  1. TensorFlow — Google released, as open source, their distributed machine learning system. The DataFlow programming framework is sweet, and the documentation is gorgeous. AMAZINGLY high-quality, sets the bar for any project. This may be 2015’s most important software release.
  2. TensorFlow White Paper (PDF) — Compared to DistBelief [G’s first scalable distributed inference and training system], TensorFlow’s programming model is more flexible, its performance is significantly better, and it supports training and using a broader range of models on a wider variety of heterogeneous hardware platforms.
  3. Neural Networks With Few Multiplications — paper with a method to eliminate most of the time-consuming floating point multiplications needed to update the intermediate virtual neurons as they learn. Speed has been one of the bugbears of deep neural networks.
  4. Cybersecurity as RealPolitik — Dan Geer’s excellent talk from 2014 BlackHat. When younger people ask my advice on what they should do or study to make a career in cyber security, I can only advise specialization. Those of us who were in the game early enough and who have managed to retain an over-arching generalist knowledge can’t be replaced very easily because while absorbing most new information most of the time may have been possible when we began practice, no person starting from scratch can do that now. Serial specialization is now all that can be done in any practical way. Just looking at the Black Hat program will confirm that being really good at any one of the many topics presented here all but requires shutting out the demands of being good at any others.
Four short links: 9 November 2015

Smart Sensors, Learning Autopilot, Higher Education, and 3D Soccer

  1. Low-Power Deep Learning — it’s a media release for proprietary tech, but interesting that people are working on low-power deep learning neural nets. As Pete Warden noted, this kind of research will be at the center of smart sensors. (via Pete Warden)
  2. Tesla’s Self-Improving Autopilot — it learns when you “rescue” (aka take control back from autopilot), so it’s getting better day by day. Musk said that Model S owners could add ~1 million miles of new data every day, which is helping the company create “high-precision maps.” Navteq, Google Maps, Waze … new map data is still valuable.
  3. The Digital Revolution in Higher Education Has Already Happened (Clay Shirky) — and no-one noticed. I read half of this before going “holy crap this is good, who wrote it?” I’m a Shirky junkie (I bet his laundry lists cite Habermas and the Peace of Westphalia). At the current rate of growth, half the country’s undergraduates will have at least one online class on their transcripts by the end of the decade. This is the new normal. But, As long as we discuss online education as a pedagogic revolution rather than an organizational one, we aren’t even having the right kind of conversation. The dramatic adoption of online education is not mainly a change in the content of classes. It’s a change in the institutional form of college, a demand for more flexibility by students who have to manage the increasingly complicated triangle of work, family, and school.
  4. System Automatically Converts 2-D to 3-D (MIT) — hilarious strategy! They constrained their domain: broadcast soccer games. The MIT and QCRI researchers essentially ran this process in reverse. They set the very realistic Microsoft soccer game “FIFA13” to play over and over again, and used Microsoft’s video-game analysis tool PIX to continuously store screen shots of the action. For each screen shot, they also extracted the corresponding 3-D map. […] For every frame of 2-D video of an actual soccer game, the system looks for the 10 or so screen shots in the database that best correspond to it. Then it decomposes all those images, looking for the best matches between smaller regions of the video feed and smaller regions of the screen shots. Once it’s found those matches, it superimposes the depth information from the screen shots on the corresponding sections of the video feed. Finally, it stitches the pieces back together. Brute-forcing soccer. Ok, perhaps “hilarious” for a certain type of person. I am that person.
Four short links: 4 November 2015

Data Dashboard, Feature Flags, Email Replies, and Invisible Bias

  1. re:dash — open source query editor, visualisations, dashboard for data from all sorts of databases (SQL, ElasticSearch, etc.)
  2. Feature-Flag-Driven Development — one of the key pieces of modern development systems.
  3. Gmail Suggesting Replies: In developing Smart Reply, we adhered to the same rigorous user privacy standards we’ve always held — in other words, no humans reading your email. This means researchers have to get machine learning to work on a data set that they themselves cannot read, which is a little like trying to solve a puzzle while blindfolded — but a challenge makes it more interesting!
  4. The Selective Laziness of Reasoning: Among those participants who accepted the manipulation and thus thought they were evaluating someone else’s argument, more than half (56% and 58%) rejected the arguments that were in fact their own. Moreover, participants were more likely to reject their own arguments for invalid than for valid answers. This demonstrates that people are more critical of other people’s arguments than of their own, without being overly critical: They are better able to tell valid from invalid arguments when the arguments are someone else’s rather than their own.
Four short links: 2 November 2015

Anti-Caching, Tyranny of Ratings, Distributed Deep Learning, and Sorting Rated Things

  1. Anti-Caching (PDF) — paper outlining a clever reframing of the database strategy of keeping frequently accessed things in-memory, namely pushing to disk the things that won’t be accessed … aka, “anti-caching.”
  2. The Rating Game (Verge) — Until companies release ratings data, we can’t know for certain whether this is true, but a study of Airbnb users found that black hosts get less money for similar listings than white hosts, and another study found that white taxi drivers get higher tips than black ones. There’s no reason such biases wouldn’t carry over to ratings.
  3. Singa — Apache distributed deep learning platform turns 1.0.
  4. Scoring Items That Were Voted On or Rated — a Bayesian system to turn a set of ratings or up/down votes into a single score, such that you can sort a list from “best” to “worst.”
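
The scoring idea in item 4 can be illustrated with a minimal sketch. This is not necessarily the linked article's method; it assumes one common Bayesian approach, the posterior mean of a Beta distribution over the upvote probability. The function names, the sample items, and the uniform prior of one pseudo-upvote and one pseudo-downvote are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def bayesian_score(ups: int, downs: int,
                   prior_ups: float = 1.0, prior_downs: float = 1.0) -> float:
    """Posterior mean of the upvote probability under a Beta prior.

    The prior pseudo-counts (an assumption, not taken from the linked
    article) pull items with few votes toward the prior mean.
    """
    if ups < 0 or downs < 0:
        raise ValueError("vote counts must be non-negative")
    return (ups + prior_ups) / (ups + downs + prior_ups + prior_downs)


def rank_items(items: List[Tuple[str, int, int]]) -> List[Tuple[str, int, int]]:
    """Sort (name, ups, downs) tuples from highest to lowest score."""
    return sorted(items,
                  key=lambda item: bayesian_score(item[1], item[2]),
                  reverse=True)


# Hypothetical sample data: (name, upvotes, downvotes).
items = [
    ("one lucky vote", 1, 0),
    ("well tested", 90, 10),
    ("unrated", 0, 0),
]
ranked = [name for name, _, _ in rank_items(items)]
print(ranked)
```

With a raw up/total ratio, the item with a single upvote would sort first at 100%. Under the prior it scores 2/3, so the item with 90 upvotes out of 100 votes ranks above it, and the unrated item falls back to the prior mean of 0.5.
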
Four short links: 27 October 2015

Learning Neural Nets, Medium's Stack, Bacterial Materials, and Drone Data

  1. What a Deep Neural Net Thinks of Your Selfie — really easy-to-understand explanation of convolutional neural nets (the tech behind image recognition). No CS required.
  2. Medium’s Stack — interesting use of Protocol Buffers: We help our people work with data by treating the schemas as the spec, rigorously documenting messages and fields and publishing generated documentation from the .proto files.
  3. Bacterial Materials (Wired UK) — Showing a prototype worn by dancers, Yao demonstrated how bacteria-powered clothing can respond to the body’s needs. She has, in effect, created living clothes, ones that react in real time to heat and sweat mapping with tiny vents that would curl open or flatten closed as exertion levels demanded.
  4. Robots to the Rescue (NSF) — one 20-minute drone flight generated upwards of 800 photographs, each of which took at least one minute to inspect. This article is five lessons learned in the field of disaster robotics, and they’re all doozies.

Get started with cloud-based data science

Learn how to deploy machine learning solutions using Azure ML.

Download the free, updated report “Data Science in the Cloud with Microsoft Azure Machine Learning and R: 2015 Update.”

Cloud-based machine learning platforms, like Microsoft’s Azure Machine Learning (Azure ML), provide a simplified path to create and deploy analytic solutions. Azure ML is a fully managed and secure machine learning platform that resides within the Microsoft Cortana Analytics Suite.

Azure ML workflows (known as “experiments”) are constructed using a combination of drag-and-drop modules, SQL, R, and Python scripts. The wide range of built-in modules supports the typical steps in a machine learning workflow, from data ingestion and data munging to model construction and cross validation.

Once your Azure ML experiment is ready, there are several options to deploy it. Azure ML experiments can access large-scale data stored in Azure Blob storage, Azure SQL and Hive, to name a few options. Similarly, your experiment can write results back to multiple scalable Azure storage options.

Four short links: 16 October 2015

Tesla Update, Final Feltron, Mined Medicine, and Dodgy Drone Program

  1. Tesla’s Cars Drive Themselves, Kinda (Wired) — over-the-air software update just made existing cars massively more awesome. Sometimes knowing how they did it doesn’t make it feel any less like magic.
  2. Felton’s Last Report — ten years of quantified self. See Fast Company for more.
  3. Spinal Cord Injury Breakthrough by Software: This wasn’t the result of a new, long-term study, but a meta-analysis of $60 million worth of basic research written off as useless 20 years ago by a team of neuroscientists and statisticians led by the University of California San Francisco and partnering with the software firm Ayasdi, using mathematical and machine learning techniques that hadn’t been invented yet when the trials took place.
  4. The Assassination Complex (The Intercept) — America’s drone program’s weaknesses highlighted in new document dump: Taken together, the secret documents lead to the conclusion that Washington’s 14-year high-value targeting campaign suffers from an overreliance on signals intelligence, an apparently incalculable civilian toll, and — due to a preference for assassination rather than capture — an inability to extract potentially valuable intelligence from terror suspects.
Four short links: 12 October 2015

Unattended Robots, Replicable Economics, Deep Learning Learnings, and TPP Problems

  1. Acquiring Object Experiences at Scale — software to let a robot examine a pile of objects, unattended overnight.
  2. Economics Apparently Not Replicable (PDF) — We successfully replicate the key qualitative result of 22 of 67 papers (33%) without contacting the authors. Excluding the six papers that use confidential data and the two papers that use software we do not possess, we replicate 29 of 59 papers (49%) with assistance from the authors. Because we are able to replicate less than half of the papers in our sample even with help from the authors, we assert that economics research is usually not replicable.
  3. 26 Things I Learned in the Deep Learning Summer School: 20. When Frederick Jelinek and his team at IBM submitted one of the first papers on statistical machine translation to COLING in 1988, they got the following anonymous review: The validity of a statistical (information theoretic) approach to MT has indeed been recognized, as the authors mention, by Weaver as early as 1949. And was universally recognized as mistaken by 1950 (cf. Hutchins, MT – Past, Present, Future, Ellis Horwood, 1986, p. 30ff and references therein). The crude force of computers is not science. The paper is simply beyond the scope of COLING.
  4. The Final Leaked TPP Text is All That We Feared (EFF) — If you dig deeper, you’ll notice that all of the provisions that recognize the rights of the public are non-binding, whereas almost everything that benefits rightsholders is binding.

Movement data is going to transform everything

The O'Reilly Radar Podcast: Rajiv Maheswaran on the science of moving dots, and Claudia Perlich on big data in advertising.

Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.

In this week’s Radar Podcast episode, O’Reilly’s Mac Slocum chats with Rajiv Maheswaran, CEO of Second Spectrum. Maheswaran talks about machine learning applications in sports, the importance of context in measuring stats, and the future of real-time, in-game analytics.

Here are some highlights from their chat:

There’s a lot of parts of the game of basketball — pick and rolls, dribble hand-offs — that coaches really care about, about analyzing how it works on offense, how to guard them. Before big data and machine learning, people basically watched the games and marked them. It turns out that people are pretty bad at marking them accurately, and they also miss a ton of stuff. Right now, machine learning tells coaches, ‘This is how many pick and rolls these two players have had over the course of the season, how often they do all the different variations, what they’re good at, what they’re bad at.’ Coaches can really find tendencies that can help them play offense, play defense, far more efficiently, based off of machine learning.

What we’re doing is having the machine match human intuition. If I’m watching a game, I know that the shot is harder if I’m farther away, if I have multiple defenders, if they’re close, if they’re closing in on me, if I’m dribbling, the type of shot I’m taking. As a human, I watch this and I have an intuition about it. Now, by giving all that data to the machine, it can make a predictor that actually matches our intuition, and goes beyond it because it can put a number onto what our intuition tells us.
