ENTRIES TAGGED "research"
Data Pipeline, Data Driven Education, Crowdsourced Proofreading, and 3D Printed Shoes
- Suro (Github) — Netflix data pipeline service for large volumes of event data. (via Ben Lorica)
- NIPS Workshop on Data Driven Education — lots of research papers around machine learning, MOOC data, etc.
- Proofist — crowdsourced proofreading game.
- 3D-Printed Shoes (YouTube) — LeWeb talk from the founder of the company, Continuum Fashion. (via Brady Forrest)
- SAMOA — Yahoo!’s distributed streaming machine learning (ML) framework that contains a programming abstraction for distributed streaming ML algorithms. (via Introducing SAMOA)
- madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.
- Data Portraits: Connecting People of Opposing Views — Yahoo! Labs research to break the filter bubble. Connect people who disagree on issue X (e.g., abortion) but who agree on issue Y (e.g., Latin American interventionism), and present the differences and similarities visually (they used wordclouds). Our results suggest that organic visualisation may revert the negative effects of providing potentially sensitive content. (via MIT Technology Review)
- Disguise Detection — using Raspberry Pi, Arduino, and Python.
ISS Malware, Computational Creativity, Happy Birthday Go, Built Environment for Surveillance
- ISS Enjoys Malware — Kaspersky reveals ISS had XP malware infestation before they shifted to Linux. The Gravity movie would have had more registry editing sessions if the producers had cared about FACTUAL ACCURACY.
- Big Data Approach to Computational Creativity (Arxiv) — although the “results” are a little weak (methodology for assessing creativity not described, and this sadly subjective line “professional chefs at various hotels, restaurants, and culinary schools have indicated that the system helps them explore new vistas in food”), the process and mechanism are fantastic. Bayesian surprise, crowdsourced tagged recipes, dictionaries of volatile compounds, and more. (via MIT Technology Review)
- Go at 4 — recapping four years of Go language growth.
- Las Vegas Street Lights to Record Conversations (Daily Mail) — The wireless, LED lighting, computer-operated lights are not only capable of illuminating streets, they can also play music, interact with pedestrians and are equipped with video screens, which can display police alerts, weather alerts and traffic information. The high tech lights can also stream live video of activity in the surrounding area. Technology vendor is Intellistreets. LV says, “Right now our intention is not to have any cameras or recording devices.” Love that “right now”. Can’t wait for malware to infest it.
Coding for Unreliability, AirBnB JS Style, Category Theory, and Text Processing
- Quantitative Reliability of Programs That Execute on Unreliable Hardware (MIT) — As MIT’s press release put it: Rely simply steps through the intermediate representation, folding the probability that each instruction will yield the right answer into an estimation of the overall variability of the program’s output. (via Pete Warden)
- Category Theory for Scientists (MIT Courseware) — Scooby snacks for rationalists.
- Textblob — Python open source text processing library with sentiment analysis, PoS tagging, term extraction, and more.
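The "folding" described in the Rely item above can be illustrated with a toy calculation. This is only a sketch under strong assumptions: the instruction names and per-instruction probabilities are hypothetical, failures are treated as independent, and the program is straight-line code. It is not Rely's actual analysis, which works on a compiler's intermediate representation.

```python
def program_reliability(instruction_reliabilities):
    """Fold per-instruction success probabilities into one estimate.

    Assumes each instruction fails independently, so the chance that
    every instruction yields the right answer is the product of the
    individual probabilities.
    """
    overall = 1.0
    for p in instruction_reliabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
        overall *= p
    return overall


# Hypothetical straight-line program: (instruction, probability it is correct).
program = [
    ("load", 1.0),    # assumed to run on reliable hardware
    ("add", 0.999),   # assumed to run on an unreliable ALU
    ("mul", 0.999),
    ("store", 1.0),
]

overall = program_reliability(p for _, p in program)
print(f"Estimated probability of a correct result: {overall:.6f}")
```

Even at 99.9% per instruction the estimate decays quickly as the instruction count grows, which is why a tool that tracks it automatically is useful.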
Android Crypto, Behaviour Trees, Complexity Cheatsheet, and Open Source Game Theory
- An Empirical Study of Cryptographic Misuse in Android Applications (PDF) We develop program analysis techniques to automatically check programs on the Google Play marketplace, and find that 10,327 out of 11,748 applications that use cryptographic APIs (88% overall) make at least one mistake.
- Introduction to Behaviour Trees — DAGs with codey nodes. Behavior trees replace the often intangible growing mess of state transitions of finite state machines (FSMs) with a more restrictive but also more structured traversal defining approach.
- P vs NP Cheat Sheet — the space and time Big-O complexities of common algorithms used in Computer Science.
- Game Theory and Network Effects in Open Source — the delicate balance of incentives that goes into the decision for companies to open source or close source their software, in the midst of discussions of Nash Equilibria. Enjoy.
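For the behaviour trees item above, here is a minimal sketch of the "structured traversal" idea. The class names, the two status values, and the guard scenario are all assumptions made for illustration; real behaviour tree libraries usually add a RUNNING status and more node types, which are left out here.

```python
SUCCESS, FAILURE = "success", "failure"


class Action:
    """Leaf node: runs a function against shared state and reports a status."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, state):
        return SUCCESS if self.fn(state) else FAILURE


class Sequence:
    """Succeeds only if every child succeeds, ticking them in order."""

    def __init__(self, *children):
        self.children = children

    def tick(self, state):
        for child in self.children:
            if child.tick(state) == FAILURE:
                return FAILURE
        return SUCCESS


class Selector:
    """Tries children in order and succeeds on the first one that succeeds."""

    def __init__(self, *children):
        self.children = children

    def tick(self, state):
        for child in self.children:
            if child.tick(state) == SUCCESS:
                return SUCCESS
        return FAILURE


def sees_enemy(state):
    return state["enemy_visible"]


def attack(state):
    state["log"].append("attack")
    return True


def patrol(state):
    state["log"].append("patrol")
    return True


# Hypothetical guard: attack if an enemy is visible, otherwise patrol.
tree = Selector(
    Sequence(Action("sees enemy?", sees_enemy), Action("attack", attack)),
    Action("patrol", patrol),
)


def run(enemy_visible):
    state = {"enemy_visible": enemy_visible, "log": []}
    status = tree.tick(state)
    return status, state["log"]


print(run(True))
print(run(False))
```

The behaviour lives in the shape of the tree, not in explicit state transitions, so adding a new fallback means adding a child to the Selector and leaves the existing nodes untouched.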
Visual Arduino Coding, Hardware Iteration, Segmenting Images, and Client-Side Adjustable Data View
- Visually Programming Arduino — good for little minds.
- Rapid Hardware Iteration at Scale (Forbes) — It’s part of the unique way that Xiaomi operates, closely analyzing the user feedback it gets on its smartphones and following the suggestions it likes for the next batch of 100,000 phones. It releases them every Tuesday at noon Beijing time.
- Machine Learning of Hierarchical Clustering to Segment 2D and 3D Images (PLoS One) — We propose an active learning approach for performing hierarchical agglomerative segmentation from superpixels. Our method combines multiple features at all scales of the agglomerative process, works for data with an arbitrary number of dimensions, and scales to very large datasets.
- Kratu — an Open Source client-side analysis framework to create simple yet powerful renditions of data. It allows you to dynamically adjust your view of the data to highlight issues, opportunities and correlations in the data.
Publishing Bad Research, Reproducing Research, DIY Police Scanner, and Inventing the Future
- Science Not as Self-Correcting As It Thinks (Economist) — REALLY good discussion of the shortcomings in statistical practice by scientists, peer-review failures, and the complexities of experimental procedure and fuzziness of what reproducibility might actually mean.
- Reproducibility Initiative Receives Grant to Validate Landmark Cancer Studies — The key experimental findings from each cancer study will be replicated by experts from the Science Exchange network according to best practices for replication established by the Center for Open Science through the Center’s Open Science Framework, and the impact of the replications will be tracked on Mendeley’s research analytics platform. All of the ultimate publications and data will be freely available online, providing the first publicly available complete dataset of replicated biomedical research and representing a major advancement in the study of reproducibility of research.
- $20 SDR Police Scanner — using software-defined radio to listen to the police band.
- Reimagine the Chemistry Set — $50k prize in contest to design a “chemistry set” type kit that will engage kids as young as 8 and inspire people who are 88. We’re looking for ideas that encourage kids to explore, create, build and question. We’re looking for ideas that honor kids’ curiosity about how things work. Backed by the Moore Foundation and Society for Science and the Public.
Recognising Hand Gestures, Drone Conference, Stubbornly Open Codes, and Remote Mobile Display
- An Interactive Machine Learning System for Recognizing Hand Gestures (Greg Borenstein) — a mixed-initiative interactive machine learning system for recognizing hand gestures. It attempts to give the user visibility into the classifier’s prediction confidence and control of the conditions under which the system actively requests labeled gestures when its predictions are uncertain. (an exercise for his MIT class)
- First Drone Conference Takes Off (Makezine) — forgive them the puns, Lord, for they know not what they do … uble entendre. Write-up fascinating beyond the headline. Dr. Vijay Kumar of the University of Pennsylvania School of Engineering spoke about socially positive uses for aerial robotics, such as emergency first responders. Dr. Kumar’s work focuses on micro aerial vehicles. He explains that, “size does matter.” As robots get smaller, mass and inertia are reduced. If you halve the mass, the acceleration doubles and the angular acceleration quadruples. This makes for a robot that is fast and responsive, ideal for operating indoors or out, and perfect for search and rescue missions in collapsed buildings or around other hazards.
- Standing Up to Mississippi (Carl Malamud) — yesterday we received a Certified Letter from the Attorney General’s Special Assistant Attorney General demanding that we remove these materials from the Internet and all other electronic or non-electronic media. There was no email address, so I proceeded to prepare a 67-page return reply with Exhibits A-L. I thought folks might be interested in the 7 steps of the production process. Give to his Kickstarter project, folks!
- Open Project (PDF) — A lightweight framework for remote sharing of mobile applications. Sounds like malware but is a Google Research project.
Neuromancer Game, Ray Ozzie, Sentiment Analysis, and Open Science Prizes
- Case and Molly, a Game Inspired by Neuromancer (Greg Borenstein) — On reading Neuromancer today, this dynamic feels all too familiar. We constantly navigate the tension between the physical and the digital in a state of continuous partial attention. We try to walk down the street while sending text messages or looking up GPS directions. We mix focused work with a stream of instant message and social media conversations. We dive into the sudden and remote intimacy of seeing a family member’s face appear on FaceTime or Google Hangout. “Case and Molly” uses the mechanics and aesthetics of Neuromancer’s account of cyberspace/meatspace coordination to explore this dynamic.
- Rethinking Ray Ozzie — an inescapable conclusion: Ray Ozzie was right. And Microsoft’s senior leadership did not listen, certainly not at the time, and perhaps not until it was too late. Hear, hear!
- Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (PDF) — apparently it nails sentiment analysis, and will be “open sourced”. At least, according to this GigaOm piece, which also explains how it works.
- PLoS ASAP Award Finalists Announced — with pointers to interviews with the finalists, doing open access good work like disambiguating species names and doing open source drug discovery.