"computer vision" entries

Four short links: 16 December 2015

Face Matching, Engineering Rewrites, Public Domain Illustrations, and Robotic Wrapup

by Nat Torkington | @gnat | +Nat Torkington | December 16, 2015

Face Director — Disney software to match faces between takes. We demonstrate that our method can synthesize visually believable performances with applications in emotion transition, performance correction, and timing control.
Move Fast and Fix Things — blow by blow of an engineering rewrite of some key functionality at GitHub, interesting from an “oh so that’s how they do it” point of view (if blow-by-blow engineering rewrites qualify as “interesting” to you).
Old Book Illustrations — public domain book illustrations, tagged and searchable. Yes, like Font Awesome of engraving.
The State of Robotics for 2015 (TechCrunch) — nice summary/wrapup of what’s out there now.

Four short links: 4 December 2015

Bacterial Research, Open Source Swift, Deep Forger, and Prudent Crypto Engineering

by Nat Torkington | @gnat | +Nat Torkington | December 4, 2015

New Antibiotics Research Direction — most people don’t know that we can’t cultivate and isolate most of the microbes we know about.
Swift now Open Source — Apache v2-licensed. An Apple exec is talking about it and its roadmap.
Deep Forger User Guide — clever Twitter bot converting your photos into paintings in the style of famous artists, using deep learning tech.
Prudent Engineering Practice for Cryptographic Protocols (PDF) — paper from the ’90s that is still useful today. Those principles are good for API design too. (via Adrian Colyer)

Four short links: 3 December 2015

Touchable Holograms, Cloud Vision API, State of Computer Security, and Product Prioritization

by Nat Torkington | @gnat | +Nat Torkington | December 3, 2015

Japanese Scientists Create Touchable Holograms (Reuters) — Using femtosecond laser technology, the researchers developed ‘Fairy Lights,’ a system that can fire high-frequency laser pulses that last one millionth of one billionth of a second. The pulses respond to human touch, so that – when interrupted – the hologram’s pixels can be manipulated in mid-air.
Google Cloud Vision API — classifies images into thousands of categories (e.g., “boat,” “lion,” “Eiffel Tower”), detects faces with associated emotions, and recognizes printed words in many languages.
Not Even Close: The State of Computer Security (Vimeo) — hilarious James Mickens talk with the best description ever.
20 Product Prioritization Techniques: A Map and Guided Tour — excellent collection of techniques for ordering possible product work.

Four short links: 27 November 2015

Android Insecurity, Clear Photos, Speech to Emotion, and Microexpressions from Video

by Nat Torkington | @gnat | +Nat Torkington | November 27, 2015

87% of Android Devices Insecure — researchers find they’re vulnerable to malicious apps because manufacturers have not provided regular security updates. (via Bruce Schneier)
A Computational Approach for Obstruction-Free Photography (Google Research) — take multiple photos from different angles through occlusions like a window with raindrops or reflections, and their software will assemble an unoccluded image. (via Greg Linden)
Algorithms for Affective Sensing — Results show that the system achieves a six-emotion decision-level correct classification rate of 80% for an acted dataset with clean speech. This PhD thesis is research into algorithms for determining emotion from speech samples, which do so more accurately than humans in a controlled test. (via New Scientist)
Software Learns to Recognise Microexpressions (MIT Technology Review) — Li and co’s machine matched human ability to spot and recognize microexpressions and significantly outperformed humans at the recognition task alone.

Four short links: 20 November 2015

Table Mining, Visual Microphones, Platformed Government, and NP-Hard Video Games

by Nat Torkington | @gnat | +Nat Torkington | November 20, 2015

DeepDive — Stanford project to create structured data (SQL tables) from unstructured information (text documents) and integrate such data with an existing structured database. DeepDive is used to extract sophisticated relationships between entities and make inferences about facts involving those entities. Code is open source (Apache v2 license). (via Infoworld)
Visual Microphone (MIT) — turn everyday objects — a glass of water, a potted plant, a box of tissues, or a bag of chips — into visual microphones using high-speed photography to detect the small vibrations caused by sound. (via Infoworld)
10 Rules for Distributed/Networked/Platformed Government (Richard Pope) — Be as vigilant against creating concentrations of power as you are in creating efficiency or bad user experiences. (via Paul Downey)
Classic Nintendo Games are (Computationally) Hard — We prove NP-hardness results for five of Nintendo’s largest video game franchises: Mario, Donkey Kong, Legend of Zelda, Metroid, and Pokemon.

Four short links: 16 November 2015

Hospital Hacking, Security Data Science, JavaScript Face-Substitution, and Multi-Agent Systems Textbook

by Nat Torkington | @gnat | +Nat Torkington | November 16, 2015

Hospital Hacking (Bloomberg) — interesting for both lax regulation (“The FDA seems to literally be waiting for someone to be killed before they can say, ‘OK, yeah, this is something we need to worry about,’ ” Rios says.) and the extent of the problem (Last fall, analysts with TrapX Security, a firm based in San Mateo, Calif., began installing software in more than 60 hospitals to trace medical device hacks. […] After six months, TrapX concluded that all of the hospitals contained medical devices that had been infected by malware.). It may take a Vice President’s defibrillator being hacked for things to change. Or would anybody notice?
Cybersecurity and Data Science — pointers to papers in different aspects of using machine learning and statistics to identify misuse and anomalies.
Real-time Face Substitution in JavaScript — this is awesome. Moore’s Law is amazing.
Multi-Agent Systems — undergraduate textbook covering distributed systems, game theory, auctions, and more. Electronic version as well as printed book.

Four short links: 22 October 2015

Predicting activity, systems replacement fail, Khan React style, and an interoperability system for the Web

by Nat Torkington | @gnat | +Nat Torkington | October 22, 2015

Predicting Daily Activities from Egocentric Images Using Deep Learning — Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity [from images taken by a camera worn all day by a person] across the 19 activity classes.
Trying to Replace Multiple Systems with One Can Lead to None (IEEE) — check out that final graph, it’s a doozy. It’s a graph of x against time, from various “this project is great, it will replace x systems with 1” claims about a single project. Software projects should come with giant warning labels: “most fail, you are about to set your money on fire. Are you sure? [Y/N/Abort/Restart]”
Khan React Style Guide — in case you’re dipping your toes into the cool kids’ pool.
ballista — An interoperability system for the modern Web. Like intents.

Four short links: 20 October 2015

HyperCam, half-arsed software development, perceptions of productivity, John McCarthy's conditional expressions

by Nat Torkington | @gnat | +Nat Torkington | October 20, 2015

HyperCam (PDF) — paper from Ubicomp 2015 on a low-cost implementation of a multispectral camera and a software approach that automatically analyzes the scene and provides a user with an optimal set of images that try to capture the salient information of the scene. Can see ripeness of fruit, and veins in hands.
Manifesto for Half-Arsed Software Development — Responding to change over following a plan … provided a detailed plan is in place to respond to the change, and it is followed precisely.
Software Developers’ Perceptions of Productivity — In both studies, we found that developers perceive their days as productive when they complete many or big tasks without significant interruptions or context switches. Yet, the observational data we collected shows our participants performed significant task and activity switching while still feeling productive. (via Never Work in Theory)
The Language of Choice — In the ’50s John McCarthy invented conditional expressions. Utility computing, AI, Lisp, and now what I know as C’s ?: syntax. His legend lives on.
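
A quick illustration for The Language of Choice: a conditional expression is a choice that evaluates to a value, rather than a statement that only steers control flow. The sketch below uses Python, where `a if cond else b` plays the role of C's `cond ? a : b`. The function names are made up for illustration and do not come from the linked article.

```python
def absolute_statement(x):
    # Statement form: each branch assigns, and the value is returned afterwards.
    if x < 0:
        result = -x
    else:
        result = x
    return result


def absolute_expression(x):
    # Expression form: the conditional itself evaluates to the result.
    return -x if x < 0 else x


def sign(x):
    # Conditional expressions nest, giving a multi-way choice in one expression.
    return -1 if x < 0 else (0 if x == 0 else 1)
```

Both `absolute_*` functions compute the same thing; the expression form just lets the choice sit anywhere a value is allowed.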

Four short links: 1 September 2015

People Detection, Ratings Patterns, Inspection Bias, and Cloud Filesystem

by Nat Torkington | @gnat | +Nat Torkington | September 1, 2015

End-to-End People Detection in Crowded Scenes — research paper and code. When parsing the title, bind “end-to-end” to “scenes” not “people”.
Statistical Patterns in Movie Ratings (PLOSone) — We find that the distribution of votes presents scale-free behavior over several orders of magnitude, with an exponent very close to 3/2, with exponential cutoff. It is remarkable that this pattern emerges independently of movie attributes such as average rating, age and genre, with the exception of a few genres and of high-budget films.
The Inspection Bias is Everywhere — In 1991, Scott Feld presented the “friendship paradox”: the observation that most people have fewer friends than their friends have. He studied real-life friends, but the same effect appears in online networks: if you choose a random Facebook user, and then choose one of their friends at random, the chance is about 80% that the friend has more friends. The friendship paradox is a form of the inspection paradox. When you choose a random user, every user is equally likely. But when you choose one of their friends, you are more likely to choose someone with a lot of friends. Specifically, someone with x friends is overrepresented by a factor of x.
s3ql — a file system that stores all its data online using storage services like Google Storage, Amazon S3, or OpenStack. S3QL effectively provides a hard disk of dynamic, infinite capacity that can be accessed from any computer with internet access running Linux, FreeBSD or OS-X. (GPLv3)
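
The "overrepresented by a factor of x" claim in The Inspection Bias is Everywhere is easy to check on a toy network. The following is a minimal sketch, assuming friendships are given as undirected pairs; the function names and the star-shaped example are illustrative assumptions, not taken from the linked post.

```python
from collections import defaultdict


def build_graph(edges):
    """Return {user: set of friends} from a list of undirected friendships."""
    friends = defaultdict(set)
    for a, b in edges:
        friends[a].add(b)
        friends[b].add(a)
    return dict(friends)


def mean_friend_count(friends):
    """Average friend count when every user is equally likely to be picked."""
    return sum(len(f) for f in friends.values()) / len(friends)


def size_biased_mean_friend_count(friends):
    """Average friend count when users are reached by following a friendship.

    A user with x friends sits at the end of x friendships, so they are
    weighted x times: this is the factor-of-x overrepresentation.
    """
    total_ends = sum(len(f) for f in friends.values())
    weighted = sum(len(f) * len(f) for f in friends.values())
    return weighted / total_ends


def chance_friend_has_more(friends):
    """Exact chance that a random friend of a random user has more friends."""
    total = 0.0
    for fs in friends.values():
        more = sum(1 for f in fs if len(friends[f]) > len(fs))
        total += more / len(fs)
    return total / len(friends)


# Toy network: user 0 is friends with users 1 to 5, who know nobody else.
star = build_graph([(0, leaf) for leaf in range(1, 6)])
```

On this star network the plain average is 10/6 friends, the size-biased average is 3, and the chance that a random friend of a random user has more friends is 5/6. Real networks are less extreme, but the weighting works the same way.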

Four short links: 13 August 2015

Learning Style, Artisanal Cash, Docs at Scale, and Homophily Research

by Nat Torkington | @gnat | +Nat Torkington | August 13, 2015

Elements of Style: Learning Perceptual Shape Style Similarity — code and data for research that helps perceive stylistic similarity between objects that transcends structure and function. For example, we can see a common style such as “Danish modern” in both a table and chair, though they have different structures. Until now, machines have found it difficult to do the same. (That quote cribbed from the phys.org writeup) Our new AI overlords may be cruel and heartless, but they’ll be able to tell Danish Modern from Shaker.
The Advent of Artisanal Cash (NY Times) — details the rise of local physical currency around the world. Nonetheless, the use of traditional paper money is clearly on the wane. Perhaps these smaller, more attractive artisanal paper notes are merely last bursts of glory before it disappears entirely. Though as Mr. Deller, the artist behind the latest Brixton pound, said, “As long as there are drug deals and criminality, there’ll be a need for cash.”
Documentation at Scale — 1. Acknowledge that brute force doesn’t work; 2. Make documentation a first class citizen; 3. Make documentation executable; 4. Track the intent.
Exposure to Ideologically Diverse Information on Facebook (Facebook Research) — Friends shared substantially less cross-cutting news from sources aligned with an opposing ideology. People encountered roughly 15% less cross-cutting content in news feeds due to algorithmic ranking and clicked through to 70% less of this cross-cutting content. Within the domain of political news encountered in social media, selective exposure appears to drive attention.
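
Point 3 of Documentation at Scale, "make documentation executable," can be sketched with Python's standard `doctest` module, which runs the examples written in a docstring and reports any that no longer match. This is one possible approach, not necessarily the one the linked piece has in mind; `slugify` and `failed_examples` are hypothetical names invented for the example.

```python
import doctest


def slugify(title):
    """Turn a page title into a URL slug.

    The examples below are documentation and tests at once:

    >>> slugify("Documentation at Scale")
    'documentation-at-scale'
    >>> slugify("  Four   short links ")
    'four-short-links'
    """
    return "-".join(title.lower().split())


def failed_examples(func):
    """Run the examples in func's docstring and return how many failed."""
    parser = doctest.DocTestParser()
    # Globals for the examples: only the function under test is visible.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test).failed
```

If someone changes `slugify` without updating its docstring, `failed_examples(slugify)` stops returning 0, so stale documentation shows up as a failing check instead of going unnoticed.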