"machine learning" entries

Four short links: 6 February 2015

Active Learning, Tongue Sensors, Cybernetic Management, and HTML5 Game Publishing

  1. Real World Active Learning — the point at which algorithms fail is precisely where there’s an opportunity to insert human judgment to actively improve the algorithm’s performance. An O’Reilly report with CrowdFlower.
  2. Hearing With Your Tongue (BoingBoing) — The tongue contains thousands of nerves, and the region of the brain that interprets touch sensations from the tongue is capable of decoding complicated information. “What we are trying to do is another form of sensory substitution,” Williams said.
  3. The Art of Management — cybernetics and management.
  4. kiwi.js — a mobile & desktop browser based HTML5 game framework. It uses CocoonJS for publishing to the AppStore.

Human-in-the-loop machine learning

Practical machine-learning applications and strategies from experts in active learning.

What do you call a practice that most data scientists have heard of, few have tried, and even fewer know how to do well? It turns out, no one is quite certain what to call it. In our latest free report Real-World Active Learning: Applications and Strategies for Human-in-the-Loop Machine Learning, we examine the relatively new field of “active learning” — also referred to as “human computation,” “human-machine hybrid systems,” and “human-in-the-loop machine learning.” Whatever you call it, the field is exploding with practical applications that are proving the efficiency of combining human and machine intelligence.

Learn from the experts

Through in-depth interviews with experts in the field of active learning and crowdsource management, industry analyst Ted Cuzzillo reveals top tips and strategies for using short-term human intervention to actively improve machine models. As you’ll discover, the point at which a machine model fails is precisely where there’s an opportunity to insert — and benefit from — human judgment.
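
One common way to find "the point at which a machine model fails" is uncertainty sampling: send the items the model is least sure about to human labelers. The sketch below is only an illustration of that idea, not a method taken from the report. The function name, the toy scores, and the labeling budget are all hypothetical, and it assumes a binary classifier that returns a probability between 0 and 1.

```python
from typing import Callable, Sequence


def select_for_human_review(
    items: Sequence[str],
    predict_proba: Callable[[str], float],
    budget: int,
) -> list[str]:
    """Return up to `budget` items whose predicted probability is closest to 0.5.

    Hypothetical helper: a probability near 0.5 means the binary model is
    least certain, so a human label is assumed to be most valuable there.
    """
    # Rank by distance from 0.5; a smaller distance means a less certain model.
    ranked = sorted(items, key=lambda item: abs(predict_proba(item) - 0.5))
    return ranked[:budget]


# Made-up model scores for four unlabeled items (illustrative only).
toy_scores = {
    "listing A": 0.97,
    "listing B": 0.52,
    "listing C": 0.08,
    "listing D": 0.41,
}

# Ask humans to label the two items the model is least sure about.
queue = select_for_human_review(list(toy_scores), toy_scores.__getitem__, budget=2)
print(queue)
```

In a full loop, the human labels for the queued items would be added to the training set and the model retrained before the next round of selection.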

Find out:

  • When active learning works best
  • How to manage crowdsource contributors (including expert-level contributors)
  • Basic principles of labeling data
  • Best practice methods for assessing labels
  • When to skip the crowd and mine your own data

Explore real-world examples

This report gives you a behind-the-scenes look at how human-in-the-loop machine learning has helped improve the accuracy of Google Maps, match business listings at GoDaddy, rank top search results at Yahoo!, refer relevant job postings to people on LinkedIn, identify expert-level contributors using the Quizz recruitment method, and recommend women’s clothing based on customer and product data at Stitch Fix.

Four short links: 2 February 2015

Weather Forecasting, Better Topic Modelling, Cyberdefense, and Facebook Warriors

  1. Global Forecast System — National Weather Service open sources its weather forecasting software. Hope you have a supercomputer and all the data to make use of it …
  2. High-reproducibility and high-accuracy method for automated topic classification — Latent Dirichlet allocation (LDA) is the state of the art in topic modeling. Here, we perform a systematic theoretical and numerical analysis that demonstrates that current optimization techniques for LDA often yield results that are not accurate in inferring the most suitable model parameters. Adapting approaches from community detection in networks, we propose a new algorithm that displays high reproducibility and high accuracy and also has high computational efficiency. We apply it to a large set of documents in the English Wikipedia and reveal its hierarchical structure.
  3. Army Open Sources Cyberdefense Code — git push is the new “for immediate release”.
  4. British Army Creates Team of Facebook Warriors (The Guardian) — no matter how much I know the arguments for it, it still feels vile.
Four short links: 19 January 2015

Going Offline, AI Ethics, Human Risks, and Deep Learning

  1. Reset (Rowan Simpson) — It was a bit chilling to go back over a whole years worth of tweets and discover how many of them were just junk. Visiting the water cooler is fine, but somebody who spends all day there has no right to talk of being full.
  2. Google’s AI Brain — on the subject of Google’s AI ethics committee … Q: Will you eventually release the names? A: Potentially. That’s something also to be discussed. Q: Transparency is important in this too. A: Sure, sure. Such reassuring.
  3. AVA is now Open Source (Laura Bell) — Assessment, Visualization and Analysis of human organisational information security risk. AVA maps the realities of your organisation, its structures and behaviors. This map of people and interconnected entities can then be tested using a unique suite of customisable, on-demand, and scheduled information security awareness tests.
  4. Deep Learning for Torch (Facebook) — Facebook AI Research open sources faster deep learning modules for Torch, a scientific computing framework with wide support for machine learning algorithms.
Four short links: 12 January 2015

Designed-In Outrage, Continuous Data Processing, Lisp Processors, and Anomaly Detection

  1. The Toxoplasma of Rage — It’s in activists’ interests to destroy their own causes by focusing on the most controversial cases and principles, the ones that muddy the waters and make people oppose them out of spite. And it’s in the media’s interest to help them and egg them on.
  2. Samza: LinkedIn’s Stream-Processing Engine — Samza’s goal is to provide a lightweight framework for continuous data processing. Unlike batch processing systems such as Hadoop, which typically has high-latency responses (sometimes hours), Samza continuously computes results as data arrives, which makes sub-second response times possible.
  3. Design of LISP-Based Processors (PDF) — 1979 MIT AI Lab memo on design of hardware specifically for Lisp. Legendary subtitle! LAMBDA: The Ultimate Opcode.
  4. rAnomalyDetection — Twitter’s R package for detecting anomalies in time-series data. (via Twitter Engineering blog)
Four short links: 9 January 2015

Complex Addresses, AI Applications, Scaling Diversity, Audiovisual Coding

  1. Falsehoods Programmers Believe About Addresses — 0 Egmont Road, Middlesbrough. lolwut?
  2. Future of the AI-Powered Application (Matt Turck) — we’re about to witness the emergence of a number of deeply focused AI-powered applications that will achieve commercial success by solving in a definitive manner very specific issues. (via Matt Webb)
  3. Three Things a City In Charge of its Destiny Ought to Know About Software (Matt Edgar) — Instead of asking “will it scale”, ask a better question: “Does it gracefully handle massive diversity?” […] The diversity question accommodates scaling; the scaling question tramples all over diversity. (via Tom Armitage)
  4. gibber — a creative coding environment for audiovisual performance and composition. It contains features for audio synthesis and musical sequencing, 2d drawing, 3d scene construction and manipulation, and live-coding shaders. If you’re looking for more ways to interest teens in code …
Four short links: 6 January 2015

IoT Protocols, Predictive Limits, Machine Learning and Security, and 3D-Printing Electronics

  1. Exploring the Protocols of the Internet of Things (Sparkfun) — Arduino and Arduino-like IoT “things” especially, with their limited flash and SRAM, can benefit from specially crafted IoT protocols.
  2. Complexity Salon: Ebola (willowbl00) — These notes were taken at the 2014.Dec.18 New England Complex Systems Institute Salon focused on Ebola. […] Why don’t we engage in risks in a more serious way? Everyone thinks their prior experience indicates what will happen in the future. Look at past Ebola! It died down before going far, surely it won’t be bad in the future.
  3. Machine Learning Methods for Computer Security (PDF) — papers on topics such as adversarial machine learning, attacking pattern recognition systems, data privacy and machine learning, machine learning in forensics, and deceiving authorship detection.
  4. voxel8 — Using Voxel8’s 3D printer, you can co-print matrix materials such as thermoplastics and highly conductive silver inks enabling customized electronic devices like quadcopters, electromagnets and fully functional 3D electromechanical assemblies.
Four short links: 25 December 2014

Smart Cities, Blockchain Innovation, Brain Interfaces, and Knowledge Graphs

  1. Smartest Cities Rely on Citizen Cunning and Unglamorous Technology (The Guardian) — vendors like Microsoft, IBM, Siemens, Cisco and Hitachi construct the resident of the smart city as someone without agency; merely a passive consumer of municipal services – at best, perhaps, a generator of data that can later be aggregated, mined for relevant inference, and acted upon. Should he or she attempt to practise democracy in any form that spills on to the public way, the smart city has no way of accounting for this activity other than interpreting it as an untoward disruption to the orderly flow of circulation.
  2. Second Wave of Blockchain Innovation — the economic challenges of innovating on the blockchain.
  3. Introduction to the Modern Brain-Computer Interface Design (UCSD) — The lectures were first given by Christian Kothe (SCCN/UCSD) in 2012 at University of Osnabrueck within the Cognitive Science curriculum and have now been recorded in the form of an open online course. The course includes basics of EEG, BCI, signal processing, machine learning, and also contains tutorials on using BCILAB and the lab streaming layer software.
  4. Machine Learning with Knowledge Graphs (video) — see also extra readings.
Four short links: 24 December 2014

DRMed Objects, Eventual Consistency, Complex Systems, and Machine Learning Papers

  1. DRMed Cat Litter Box — the future is when you don’t own what you buy, and it’s illegal to make it work better. (via BoingBoing)
  2. Are We Consistent Yet? — the eventuality of consistency on different cloud platforms.
  3. How Complex Systems Fail (YouTube) — Richard Cook’s Velocity 2012 keynote.
  4. Interesting papers from NIPS 2014 — machine learning holiday reading.

Cheap sensors, fast networks, and distributed computing

The history of computing has been a constant pendulum — that pendulum is now swinging back toward distribution.

Editor’s note: this is an excerpt from our new report Data: Emerging Trends and Technologies, by Alistair Croll. You can download the free report here.

The trifecta of cheap sensors, fast networks, and distributed computing is changing how we work with data. But making sense of all that data takes help, which is arriving in the form of machine learning. Here’s one view of how that might play out.

Clouds, edges, fog, and the pendulum of distributed computing

The history of computing has been a constant pendulum, swinging between centralization and distribution.

The first computers filled rooms, and operators were physically within them, switching toggles and turning wheels. Then came mainframes, which were centralized, with dumb terminals.

As the cost of computing dropped and the applications became more democratized, user interfaces mattered more. The smarter clients at the edge became the first personal computers; many broke free of the network entirely. The client got the glory; the server merely handled queries.

Once the web arrived, we centralized again. The LAMP stack (Linux, Apache, MySQL, PHP) was buried deep inside data centers, with the computer at the other end of the connection relegated to little more than a smart terminal rendering HTML. Load-balancers sprayed traffic across thousands of cheap machines. Eventually, the web turned from static sites to complex software as a service (SaaS) applications.

Then the pendulum swung back to the edge, and the clients got smart again. First with AJAX, Java, and Flash; then in the form of mobile apps, where the smartphone or tablet did most of the hard work and the back end was a communications channel for reporting the results of local action.