AI’s dueling definitions

Why my understanding of AI is different from yours.


SoftBank’s Pepper, a humanoid robot that takes its surroundings into consideration.

Editor’s note: this post is part of our Intelligence Matters investigation.

Let me start with a secret: I feel self-conscious when I use the terms “AI” and “artificial intelligence.” Sometimes, I’m downright embarrassed by them.

Before I get into why, though, answer this question: what pops into your head when you hear the phrase artificial intelligence?

For the layperson, AI might still conjure HAL’s unblinking red eye, and all the misfortune that ensued when he became so tragically confused. Others jump to the replicants of Blade Runner or more recent movie robots. Those who have been around the field for some time, though, might instead remember the “old days” of AI — whether with nostalgia or a shudder — when intelligence was thought to primarily involve logical reasoning, and truly intelligent machines seemed just a summer’s work away. And for those steeped in today’s big-data-obsessed tech industry, “AI” can seem like nothing more than a high-falutin’ synonym for the machine-learning and predictive-analytics algorithms that are already hard at work optimizing and personalizing the ads we see and the offers we get — it’s the term that gets trotted out when we want to put a high sheen on things. Read more…

Four short links: 16 June 2014

Decision Trees, Decision Modifications, Mobile Patents, Web Client

  1. Quick DT — open source (Java) decision tree learner.
  2. Revealing Hidden Changes to Supreme Court Opinions — WHEREAS, It is now well-documented that the Supreme Court of the United States makes changes to its opinions after the opinion is published; and WHEREAS, Only “Four legal publishers are granted access to “change pages” that show all revisions. Those documents are not made public, and the court refused to provide copies to The New York Times”; and WHEREAS, git makes it easy to identify when changes have been made; RESOLVED, I shall apply a cron job to at least identify when the actual PDF has changed so everyone can see which documents have changed.
  3. Microsoft’s “Killer” Android Patents Revealed (Ars Technica) — Chinese Government required them disclosed as part of MSFT-Nokia merger. The patent lists are strategically significant, because Microsoft has managed to build a huge patent-licensing business by taxing Android phones without revealing what kind of legal leverage they really have over those phones.
  4. HTTPie — a command line HTTP client, a user-friendly HTTP client.

Streamlining feature engineering

Researchers and startups are building tools that enable feature discovery.

Why do data scientists spend so much time on data wrangling and data preparation? In many cases it’s because they want access to the best variables with which to build their models. These variables are known as features in machine-learning parlance. For many data applications, feature engineering and feature selection are just as important as (if not more important than) the choice of algorithm:

Good features allow a simple model to beat a complex model.
(to paraphrase Alon Halevy, Peter Norvig, and Fernando Pereira)

The terminology can be a bit confusing, but to put things in context one can simplify the data science pipeline to highlight the importance of features:

Feature engineering and discovery pipeline

Feature Engineering or the Creation of New Features
A simple example to keep in mind is text mining. One starts with raw text (documents), and the extracted features could be individual words or phrases. In this setting, a feature could indicate the frequency of a specific word or phrase. Features are then used to classify and cluster documents, or to extract topics associated with the raw text. The process usually involves the creation of new features (feature engineering) and identifying the most essential ones (feature selection).
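
To make the text-mining example concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular tool: the function names, the toy documents, and the selection rule (keep words that appear in at least two documents) are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(documents):
    """Feature engineering: map each raw document to word -> count."""
    return [Counter(tokenize(doc)) for doc in documents]


def select_features(frequencies, min_docs=2):
    """Feature selection (assumed rule): keep words found in at least
    min_docs documents, returned in alphabetical order."""
    doc_counts = Counter()
    for freq in frequencies:
        doc_counts.update(freq.keys())  # count each word once per document
    return sorted(word for word, n in doc_counts.items() if n >= min_docs)


def to_vectors(frequencies, vocabulary):
    """Turn each document into a fixed-length list of word counts."""
    return [[freq[word] for word in vocabulary] for freq in frequencies]


# Toy documents, invented for illustration only.
documents = [
    "Good features beat complex models",
    "Simple models with good features win",
    "Data wrangling takes time",
]

frequencies = term_frequencies(documents)
vocabulary = select_features(frequencies)
vectors = to_vectors(frequencies, vocabulary)

print(vocabulary)
print(vectors)
```

The resulting vectors are the kind of input a classifier or clustering algorithm would consume. Real pipelines typically add steps such as stop-word removal, phrase extraction, or weighting schemes like TF-IDF.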

Read more…

Four short links: 13 June 2014

Decentralized Web, Reproducibility Talk, Javascript Microcontroller, and Docker Maturity

  1. Mapping the Decentralized Movement (Jon Udell) — the pendulum is about to swing back toward a more distributed Web.
  2. John Ioannidis: Reproducible Research, True or False? (YouTube) — his talk at Google. (via Paul Kedrosky)
  3. Tessel — a microcontroller that runs Javascript. For those who can’t handle C.</troll>
  4. Docker Misconceptions — This is not impossible and can all be done – several large companies are already using Docker in production, but it’s definitely non-trivial. This will change as the ecosystem around Docker matures (via Flynn, Docker container hosting, etc), but currently if you’re going to attempt using Docker seriously in production, you need to be pretty skilled at systems management and orchestration.
Four short links: 12 June 2014

Our New Robot Overlords, Open Neuro, Anti-Surveillance Software, and LG's TV Made of Evil and Tears

  1. Norbert Wiener (The Atlantic) — His fears for the future stemmed from two fundamental convictions: We humans can’t resist selfishly misusing the powers our machines give us, to the detriment of our fellow humans and the planet; and there’s a good chance we couldn’t control our machines even if we wanted to, because they already move too fast and because increasingly we’re building them to make decisions on their own. To believe otherwise, Wiener repeatedly warned, represents a dangerous, potentially fatal, lack of humility.
  2. Open Ephys — open source/open hardware tools for neuro research. (via IEEE)
  3. Startups Selling Resistance to Surveillance (Inc) — growing breed of tools working on securing their customers’ communications from interception by competitors and states.
  4. Not-So-Smart TV (TechDirt) — LG’s privacy policy basically says “let us share your viewing habits, browsing, etc. with third parties, or we will turn off the ‘smart’ features in your smart TV.” The promise of smart devices should be that they get better for customers over time, not better for the vendor at the expense of the customer. See Wiener above.

From the network interface to the database

All systems are distributed systems, and we’re starting to see how they fit into Velocity's themes.


From the beginning, the Velocity Conference has focused on web performance and operations — specifically, web operations. This focus has been fairly narrow: browser performance dominated the discussion of “web performance,” and interactions between developers and IT staff dominated operations.

These limits weren’t bad. Perceived performance really is dominated by the browser — how fast you can get resources (HTML, images, CSS files, JavaScript libraries) over the network to the browser, and how fast the browser can execute those resources. How long before a user stops waiting for your page to load and clicks away? How do you make a page useable as quickly as possible, even before all the resources have loaded? Those discussions were groundbreaking and surprising: users are incredibly sensitive to page speed.

That’s not to say that Velocity hasn’t looked at the rest of the application stack; there’s been an occasional glance in the direction of the database and an even more occasional glance at the middleware. But the database and middleware have, at least historically, played a bit part. And while the focus of Velocity has been front-end tuning, speakers like Baron Schwartz haven’t let us ignore the database entirely. Read more…
