ENTRIES TAGGED "machine learning"

Four short links: 1 September 2014

Four short links: 1 September 2014

Sibyl, Bitrot, Estimation, and ssh

  1. Sibyl: Google’s System for Large Scale Machine Learning (YouTube) — keynote at DSN2014 acting as an intro to Sibyl. (via KD Nuggets)
  2. Bitrot from 1997That’s 205 failures, an actual link rot figure of 91%, not 57%. That leaves only 21 URLs as 200 OK and containing effectively the same content.
  3. What We Do And Don’t Know About Software Effort Estimation — nice rundown of research in the field.
  4. fabric — simple yet powerful ssh library for Python.
Comment
Four short links: 27 August 2014

Four short links: 27 August 2014

Discourse 1.0, Programmable Matter, Versioned Databases, and What Humans Learned About Machine Learning

  1. Discourse turns 1.0 — community/forum software that doesn’t suck.
  2. Programmable Matter (IEEE Spectrum) — recap of where research is going in this area.
  3. Liquibasesource control for your database. Apache 2.0 licensed.
  4. A Few Useful Things to Know About Machine Learning (PDF) — This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions. My fave: First-timers are often surprised by how little time in a machine learning project is spent actually doing machine learning. But it makes sense if you consider how time-consuming it is to gather data, integrate it, clean it and pre-process it, and how much trial and error can go into feature design.
Comments: 2
Four short links: 20 August 2014

Four short links: 20 August 2014

Plant Properties, MQ Comparisons, 1915 Vis, and Mobile Web Weaknesses

  1. Machine Learning for Plant Properties — startup building database of plant genomics, properties, research, etc. for mining. The more familiar you are with your data and its meaning, the better your machine learning will be at suggesting fruitful lines of query … and the more valuable your startup will be.
  2. Dissecting Message Queues — throughput, latency, and qualitative comparison of different message queues. MQs are to modern distributed architectures what function calls were to historic unibox architectures.
  3. 1915 Data Visualization Rules — a reminder that data visualization is not new, but research into effectiveness of alternative presentation styles is.
  4. The Broken Promise of the Mobile Webit’s not just about the UI – it’s also about integration with the mobile device.
Comment
Four short links: 7 August 2014

Four short links: 7 August 2014

Material Design, Stewart's Slack, Sketching in Javascript, and Neural Networks and Deep Learning

  1. Material Design in the Google I/O App (Medium) — steps through design thinking as they put Google’s new design metaphor in place. I’ve been chewing on material design. It brings an internal consistency and logic to the Android world that Apple’s iOS and OS X visual worlds have been losing over the years. How long until web users expect this consistency too?
  2. Stewart and Slack (Wired) — profile of Foo Stewart Butterfield and his shiny Slack startup.
  3. p5js — a new Processing-inspired code-as-sketching in Javascript. Using the original metaphor of a software sketchbook, p5.js has a full set of drawing functionality. However, you’re not limited to your drawing canvas, you can think of your whole browser page as your sketch!
  4. Neural Networks and Deep Learning — a free online book to teach you … well, neural networks and deep learning.
Comment
Four short links: 24 July 2014

Four short links: 24 July 2014

Neglected ML, Crowdfunded Recognition, Debating Watson, and Versioned p2p File System

  1. Neglected Machine Learning IdeasPerhaps my list is a “send me review articles and book suggestions” cry for help, but perhaps it is useful to others as an overview of neat things.
  2. First Crowdfunded Book on Booker Shortlist — Booker excludes self-published works, but “The Wake” was through Unbound, a Threadless-style “if we hit this limit, the book is printed and you have bought a copy” site.
  3. Watson Can Debate Its Opponents (io9) — Speaking in nearly perfect English, Watson/The Debater replied: “Scanned approximately 4 million Wikipedia articles, returning ten most relevant articles. Scanned all 3,000 sentences in top ten articles. Detected sentences which contain candidate claims. Identified borders of candidate claims. Assessed pro and con polarity of candidate claims. Constructed demo speech with top claim predictions. Ready to deliver.”
  4. ipfsa global, versioned, peer-to-peer file system. It combines good ideas from Git, BitTorrent, Kademlia, and SFS. You can think of it like a single BitTorrent swarm, exchanging Git objects, making up the web. IPFS provides an interface much simpler than HTTP, but has permanence built in.. (via Sourcegraph)
Comment
Four short links: 21 July 2014

Four short links: 21 July 2014

Numenta Code, Soccer Robotics, Security Data Science, Open Wireless Router

  1. nupic (github) -GPL v3-licensed ode from Numenta, at last. See their patent position.
  2. Robocup — soccer robotics contest, condition of entry is that all codes are open sourced after the contest. (via The Economist)
  3. Security Data Science Paper Collection — machine learning, big data, analysis, reports, all around security issues.
  4. Building an Open Wireless Router — EFF call for coders to help build a wireless router that’s more secure and more supportive of open sharing than current devices.

Comment
Four short links: 15 July 2014

Four short links: 15 July 2014

Data Brokers, Car Data, Pattern Classification, and Hogwild Deep Learning

  1. Inside Data Brokers — very readable explanation of the data brokers and how their information is used to track advertising effectiveness.
  2. Elon, I Want My Data! — Telsa don’t give you access to the data that your cars collects. Bodes poorly for the Internet of Sealed Boxes. (via BoingBoing)
  3. Pattern Classification (Github) — collection of tutorials and examples for solving and understanding machine learning and pattern classification tasks.
  4. HOGWILD! (PDF) — the algorithm that Microsoft credit with the success of their Adam deep learning system.
Comment
Four short links: 14 July 2014

Four short links: 14 July 2014

Scanner Malware, Cognitive Biases, Deep Learning, and Community Metrics

  1. Handheld Scanners Attack — shipping and logistics operations compromised by handheld scanners running malware-infested Windows XP.
  2. Adventures in Cognitive Biases (MIT) — web adventure to build your cognitive defences against biases.
  3. Quoc Le’s Lectures on Deep Learning — Machine Learning Summer School videos (4k!) of the deep learning lectures by Google Brain team member Quoc Le.
  4. FLOSS Community Metrics Talks — upcoming event at Puppet Labs in Portland. I hope they publish slides and video!
Comment: 1
Four short links: 1 July 2014

Four short links: 1 July 2014

Efficient Representation, Page Rendering, Graph Database, Warning Effectiveness

  1. word2vecThis tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research. From Google Research paper Efficient Estimation of Word Representations in Vector Space.
  2. What Every Frontend Developer Should Know about Page RenderingRendering has to be optimized from the very beginning, when the page layout is being defined, as styles and scripts play the crucial role in page rendering. Professionals have to know certain tricks to avoid performance problems. This arcticle does not study the inner browser mechanics in detail, but rather offers some common principles.
  3. Cayleyan open-source graph inspired by the graph database behind Freebase and Google’s Knowledge Graph.
  4. Alice in Warningland (PDF) — We performed a field study with Google Chrome and Mozilla Firefox’s telemetry platforms, allowing us to collect data on 25,405,944 warning impressions. We find that browser security warnings can be successful: users clicked through fewer than a quarter of both browser’s malware and phishing warnings and third of Mozilla Firefox’s SSL warnings. We also find clickthrough rates as high as 70.2% for Google Chrome SSL warnings, indicating that the user experience of a warning can have tremendous impact on user behaviour.
Comments: 7
Four short links: 16 June 2014

Four short links: 16 June 2014

Decision Trees, Decision Modifications, Mobile Patents, Web Client

  1. Quick DT — open source (Java) decision tree learner.
  2. Revealing Hidden Changes to Supreme Court OpinionsWHEREAS, It is now well-documented that the Supreme Court of the United States makes changes to its opinions after the opinion is published; and WHEREAS, Only “Four legal publishers are granted access to “change pages” that show all revisions. Those documents are not made public, and the court refused to provide copies to The New York Times”; and WHEREAS, git makes it easy to identify when changes have been made; RESOLVED, I shall apply a cron job to at least identify when the actual PDF has changed so everyone can see which documents have changed.
  3. Microsoft’s “Killer” Android Patents Revealed (Ars Technica) — Chinese Government required them disclosed as part of MSFT-Nokia merger. The patent lists are strategically significant, because Microsoft has managed to build a huge patent-licensing business by taxing Android phones without revealing what kind of legal leverage they really have over those phones.
  4. HTTPiea command line HTTP client, a user-friendly HTTP client.
Comment