"machine learning" entries

Four short links: 22 April 2016

Unicorn Hazards Ahead, Brainprinting for Identity, Generating News Headlines, and Anthropic Capitalism

1. Why The Unicorn Financing Market Just Became Dangerous to Everyone — read with Fortune’s take on the Tech IPO Market. “They profess to take a long-term view, but the data shows post-IPO stocks are very volatile in the case of tech IPOs, and that is not a problem the underwriters try to address.” Damning breakdown of the current state. As Bryce said, Single-horned, majestic, Weapons of Mass Extraction.
2. Brainprints (Kurzweil) — 50 subjects, 500 images, EEG headset, 100% accuracy identifying person from their brain’s response to the images. We’ll need much larger studies, but this is promising.
3. Generating News Headlines with Recurrent Neural Networks — We find that the model is quite effective at concisely paraphrasing news articles.
4. Anthropic Capitalism And The New Gimmick Economy — market capitalism struggles with “public goods” (those which are inexhaustible and non-excludable, like infinitely copyable bits that any number of people can have copies of at once), yet much of the world is being recast as an activity where software manipulates information, thus becoming a public good. Capitalism and Communism, which briefly resembled victor and vanquished, increasingly look more like Thelma and Louise; a tragic couple sent over the edge by forces beyond their control. What comes next is anyone’s guess and the world hangs in the balance.

Four short links: 20 April 2016

Explaining Classifier Predictions, Formatting Currency, Questioning Magic Leap, and Curing Slack Addiction

1. Why Should I Trust You?: Explaining the Predictions of Any Classifier (PDF) — LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. Torkington’s Second Law: there’s no problem with machine learning that more machine learning can’t fix.
2. How Etsy Formats Currency — I’m saving this one because it chafes every time I do it, and I do it wrong every time.
3. Magic Leap in Wired — massive story by Kevin Kelly on the glories of Magic Leap, which The Verge noted still left a lot of open questions, such as “what the hell IS Magic Leap’s technology” and “why does everyone who works for Magic Leap sound like they’re on acid when they talk about the technology?” Everyone who wants their pixel-free glorious VR to be true is crossing fingers hoping it’s not another Theranos. The bit that stuck from the Wired piece was People remember VR experiences not as a memory of something they saw but as something that happened to them.
4. Curing Our Slack Addiction — an interesting counterpoint to the “in the future everyone will be on 15,000 Slacks” Slack-maximalist view. For AgileBits, it distracted, facilitated, and rewarded distracting behaviour, ultimately becoming a drain rather than an accelerant.

Four short links: 13 April 2016

Gesture Learner, Valuing Maintainers, Google's CS Education, and AI Threats

1. focusmotion.io — the world’s first machine learning SDK to track, learn, and analyze human motion on any sensor, on any OS, on any platform. You (or your users) train it on what combination of sensor patterns to label as a particular gesture or movement, and then it’ll throw those labels whenever.
2. How Maintainers, not Innovators, Make the World Turn (City Lab) — cf Deb Chachra’s Why I Am Not a Maker and everything Warren Buffett ever wrote about investing in boring businesses. It’s nice to realize that we’ve gone from “you’d be crazy to throw your career away and join a startup” to “hey, established industry isn’t bad, either, you know.”
3. Google CS Education — all their tools and resources for CS education in one spot.
4. Will The Proliferation of Affordable AI Decimate the Middle Class? (Alex Tabarrok) — I hadn’t heard this done before, but he steps away from the A in AI to ask whether greater natural intelligence would threaten the middle class in the same way—e.g., from rising India and China.

Four short links: 8 April 2016

Data Security, Bezos Letter, Working Remote, and Deep Learning Book

1. LangSec — The complexity of our computing systems (both software and hardware) has reached such a degree that data must be treated as formally as code.
2. Bezos’s Letter to Shareholders — as eloquent about success in high-risk tech as Warren Buffett is about success in value investing.
3. Good Bad and Ugly of Working Remote After 5 Years — good advice, and some realities for homeworkers to deal with.
4. Deep Learning Book — text finished, prepping print production via MIT Press. Why are you using HTML format for the drafts? This format is a sort of weak DRM required by our contract with MIT Press. It’s intended to discourage unauthorized copying/editing of the book.

Four short links: 7 April 2016

Fairness in Machine Learning, Ethical Decision-Making, State of Hardware, and Against Web DRM

1. Fairness in Machine Learning — read this fabulous presentation. Most ML objective functions create models accurate for the majority class at the expense of the protected class. One way to encode “fairness” might be to require similar/equal error rates for protected classes as for the majority population.
2. Safety Constraints and Ethical Principles in Collective Decision-Making Systems (PDF) — self-driving cars are an example of collective decision-making between intelligent agents and, possibly, humans. Damn it’s hard to find non-paywalled research in this area. This is horrifying for the list of things you can’t ensure in collective decision-making systems.
3. State of Hardware Report (Nate Evans) — rundown of the stats related to hardware startups. (via Renee DiResta)
4. A Recent Discussion about DRM (Joi Ito) — strong arguments against including Digital Rights Management in W3C’s web standards (I can’t believe we’re still debating this; it’s such a self-evidently terrible idea to bake disempowerment into web standards).
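
The equal-error-rate idea from the fairness item above is easy to check in a few lines. This is only a minimal sketch, not anything taken from the presentation: the function names, the group labels, and the toy data are all hypothetical, and a real audit would look at false positives and false negatives separately.

```python
from collections import defaultdict


def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: fraction of misclassified examples in that group}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        if truth != pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def max_error_gap(rates):
    """Largest difference in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: the model is perfect on the majority group
# and wrong on two of the three protected-group examples.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["majority"] * 5 + ["protected"] * 3

rates = error_rates_by_group(y_true, y_pred, groups)
gap = max_error_gap(rates)
```

A fairness constraint of the kind described would then ask for `gap` to stay below some chosen tolerance, even if that costs a little overall accuracy.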

Four short links: 6 April 2016

1. U.S. Textile Industry Turns to Tech as Gateway to Revival — Warwick Mills is joining the Defense Department, universities including the Massachusetts Institute of Technology, and nearly 50 other companies in an ambitious $320 million project to push the American textile industry into the digital age. Key to the plan is a technical ingredient: embedding a variety of tiny semiconductors and sensors into fabrics that can see, hear, communicate, store energy, warm or cool a person, or monitor the wearer’s health.
2. 2D to 3D With Deep CNNs (PDF) — source code on github.
3. Squeezing AI into Mobile Systems (IEEE Spectrum) — Sze, working with Joel Emer, also an MIT computer science professor and senior distinguished research scientist at Nvidia, developed Eyeriss, the first custom chip designed to run a state-of-the-art convolutional neural network. They showed they could run AlexNet, a particularly demanding algorithm, using less than one-tenth the energy of a typical mobile GPU: instead of consuming 5 to 10 watts, Eyeriss used 0.3 W.
4. The 8-Bit Game That Makes Statistics Addictive (The Atlantic) — that game is Guess The Correlation. “As a researcher, you read papers and a lot of the time, you eyeball the figures without even reading the text,” he says. “You see a plot—it could even be your own plot—and make a judgment based on it. Contrary to what people believe, they’re not very good at this. And I have the data to prove that.”

Four short links: 25 March 2016

Intro to Statistics, Automatic Lip Reading, Outdoor Range Finding for $10, and Wrongful Takedowns

1. Intro Statistics with Randomization and Simulation — free PDF download as well as book for purchase. (via Flowing Data)
2. Automated Lip Reading Invented — press release, but interesting topic. The research will be presented at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in Shanghai.
3. A Smartphone-based Laser Distance Sensor for Outdoor Environments (PDF) — We present a low-cost, smartphone-based planar laser distance sensor design for outdoor use with 6 cm accuracy at 5 meters, 30 Hz scan rate, and 0.1 degree resolution over the field of view. The cost of the hardware additions to the off-the-shelf smartphone used in our prototype is under $50.
4. Internet Archive Seeks to Defend Against Wrongful Takedowns — In its submission, the Archive goes to some lengths to highlight differences between those engaging in commercial piracy and those who seek to preserve and share cultural heritage. As a result, the context in which a user posts content online should be considered before attempting to determine whether an infringement has taken place. This, the organization says, poses problems for the “staydown” demands gaining momentum with copyright holders.

Four short links: 10 March 2016

Cognitivist and Behaviourist AI, Math and Social Computing, A/B Testing Stats, and Rat Cyborgs are Smarter

1. Crossword-Solving Neural Networks — Hill describes recent progress in learning-based AI systems in terms of behaviourism and cognitivism: two movements in psychology that affect how one views learning and education. Behaviourism, as the name implies, looks at behaviour without looking at what the brain and neurons are doing, while cognitivism looks at the mental processes that underlie behaviour. Deep learning systems like the one built by Hill and his colleagues reflect a cognitivist approach, but for a system to have something approaching human intelligence, it would have to have a little of both. “Our system can’t go too far beyond the dictionary data on which it was trained, but the ways in which it can are interesting, and make it a surprisingly robust question and answer system – and quite good at solving crossword puzzles,” said Hill. While it was not built with the purpose of solving crossword puzzles, the researchers found that it actually performed better than commercially-available products that are specifically engineered for the task.
2. Mathematical Foundations for Social Computing (PDF) — collection of pointers to existing research in social computing and some open challenges for work to be done. Consider situations where a highly structured decision must be made. Some examples are making budgets, assigning water resources, and setting tax rates. […] One promising candidate is “Knapsack Voting.” […] This captures most budgeting processes — the set of chosen budget items must fit under a spending limit, while maximizing societal value. Goel et al. prove that asking users to compare projects in terms of “value for money” or asking them to choose an entire budget results in provably better properties than using the more traditional approaches of approval or rank-choice voting.
3. Power, Minimal Detectable Effect, and Bucket Size Estimation in A/B Tests (Twitter) — This post describes how Twitter’s A/B testing framework, DDG, addresses one of the most common questions we hear from experimenters, product managers, and engineers: how many users do we need to sample in order to run an informative experiment?
4. Intelligence-Augmented Rat Cyborgs in Maze Solving (PLoS) — We compare the performance of maze solving by computer, by individual rats, and by computer-aided rats (i.e. rat cyborgs). They were asked to find their way from a constant entrance to a constant exit in 14 diverse mazes. Performance of maze solving was measured by steps, coverage rates, and time spent. The experimental results with six rats and their intelligence-augmented rat cyborgs show that rat cyborgs have the best performance in escaping from mazes. These results provide a proof-of-principle demonstration for cyborg intelligence. In addition, our novel cyborg intelligent system (rat cyborg) has great potential in various applications, such as search and rescue in complex terrains.
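
For a feel of the "how many users do we need" question in the A/B testing item above, here is a rough sketch using the textbook normal-approximation formula for comparing two conversion rates. It is not Twitter's DDG calculation; the function name, the default significance level of 0.05, and the default power of 0.8 are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def users_per_bucket(baseline_rate, min_detectable_effect, alpha=0.05, power=0.8):
    """Approximate users needed in each bucket for a two-sided test.

    baseline_rate: conversion rate in the control bucket (e.g. 0.10)
    min_detectable_effect: absolute lift we want to be able to detect (e.g. 0.01)
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two Bernoulli variances
    n = (z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical experiment: 10% baseline, looking for a one-point lift.
needed = users_per_bucket(0.10, 0.01)
```

The useful intuition is in the denominator: halving the effect you want to detect roughly quadruples the number of users you need.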

Four short links: 8 March 2016

Neural Nets on Encrypted Data, IoT VR Prototype, Group Chat Considered Harmful, and Haptic Hardware

1. Neural Nets on Encrypted Data (Paper a Day) — By using a technique known as homomorphic encryption, it’s possible to perform operations on encrypted data, producing an encrypted result, and then decrypt the result to give back the desired answer. By combining homomorphic encryption with a specially designed neural network that can operate within the constraints of the operations supported, the authors of CryptoNet are able to build an end-to-end system whereby a client can encrypt their data, send it to a cloud service that makes a prediction based on that data – all the while having no idea what the data means, or what the output prediction means – and return an encrypted prediction to the client, which can then decrypt it to recover the prediction. As well as making this possible, another significant challenge the authors had to overcome was making it practical, as homomorphic encryption can be expensive.
2. VR for IoT Prototype (YouTube) — a VR prototype created for displaying sensor data and video streaming in real time from IoT sensors/camera devices designed for rail or the transportation industry.
3. Is Group Chat Making You Sweat? (Jason Fried) — all excellent points. Our attention and focus are the scarce and precious resources of the 21st century.
4. How Devices Provide Haptic Feedback — good intro to what’s happening in your hardware.
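
To make "operations on encrypted data" in the first item concrete, here is a toy sketch of the Paillier cryptosystem, which supports addition on ciphertexts. It is a stand-in chosen for brevity, not the scheme the CryptoNet authors use, and it only does addition. The primes are tiny hypothetical values, so this is for illustration only and offers no security.

```python
import math
import random

# Toy key material: tiny hypothetical primes, far too small for real use.
P, Q = 17, 19
N = P * Q
N_SQUARED = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # modular inverse of LAMBDA mod N (Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < N, using generator g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:  # the random blinding factor must be invertible mod N
            break
    return (pow(N + 1, message, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


def decrypt(ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    x = pow(ciphertext, LAMBDA, N_SQUARED)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQUARED


# The "server" adds two numbers it never sees in the clear.
encrypted_sum = add_encrypted(encrypt(20), encrypt(22))
recovered = decrypt(encrypted_sum)
```

Only the holder of `LAMBDA` and `MU` can read `recovered`; whoever computes `add_encrypted` works purely on ciphertexts.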

Four short links: 4 March 2016

Snapchat's Business, Tracking Voters, Testing for Discriminatory Associations, and Assessing Impact

1. How Snapchat Built a Business by Confusing Olds (Bloomberg) — Advertisers don’t have a lot of good options to reach under-30s. The audiences of CBS, NBC, and ABC are, on average, in their 50s. Cable networks such as CNN and Fox News have it worse, with median viewerships near or past Social Security age. MTV’s median viewers are in their early 20s, but ratings have dropped in recent years. Marketers are understandably anxious, and Spiegel and his deputies have capitalized on those anxieties brilliantly by charging hundreds of thousands of dollars when Snapchat introduces an ad product.
2. Tracking Voters — On the night of the Iowa caucus, Dstillery flagged all the [ad network-mediated ad] auctions that took place on phones in latitudes and longitudes near caucus locations. It wound up spotting 16,000 devices on caucus night, as those people had granted location privileges to the apps or devices that served them ads. It captured those mobile ID’s and then looked up the characteristics associated with those IDs in order to make observations about the kind of people that went to Republican caucus locations (young parents) versus Democrat caucus locations. It drilled down further (e.g., ‘people who like NASCAR voted for Trump and Clinton’) by looking at which candidate won at a particular caucus location.
3. Discovering Unwarranted Associations in Data-Driven Applications with the FairTest Testing Toolkit (arXiv) — We describe FairTest, a testing toolkit that detects unwarranted associations between an algorithm’s outputs (e.g., prices or labels) and user subpopulations, including sensitive groups (e.g., defined by race or gender). FairTest reports statistically significant associations to programmers as association bugs, ranked by their strength and likelihood of being unintentional, rather than necessary effects. See also slides from PrivacyCon. Source code not yet released.
4. Inferring Causal Impact Using Bayesian Structural Time-Series Models (Adrian Colyer) — understanding the impact of an intervention by building a predictive model of what would have happened without the intervention, then diffing reality to that model.
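
The "predict what would have happened, then diff reality against it" idea in the last item can be sketched without any Bayesian machinery. The version below swaps the paper's structural time-series model for a plain least-squares line fitted against a control series, so treat it as an illustration of the shape of the method only; the function names and the toy numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_impact(control, treated, intervention_index):
    """Per-period gap between what happened and the predicted counterfactual."""
    # Learn the pre-intervention relationship between the two series.
    slope, intercept = fit_line(control[:intervention_index],
                                treated[:intervention_index])
    # Project that relationship forward to get the "no intervention" prediction.
    counterfactual = [slope * x + intercept for x in control[intervention_index:]]
    actual = treated[intervention_index:]
    return [a - c for a, c in zip(actual, counterfactual)]


# Hypothetical data: treated tracks 2 * control + 1 until period 5,
# after which the intervention adds a lift of 5.
control = [10, 12, 11, 13, 12, 14, 13, 15]
treated = [2 * c + 1 for c in control]
treated = treated[:5] + [t + 5 for t in treated[5:]]

impact = estimated_impact(control, treated, intervention_index=5)
```

The approach depends on the control series being unaffected by the intervention; if it is not, the counterfactual is contaminated and the estimated impact means little.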