- Pin: A Dynamic Binary Instrumentation Tool — a dynamic binary instrumentation framework for the IA-32 and x86-64 instruction-set architectures that enables the creation of dynamic program analysis tools. Some tools built with Pin are Intel Parallel Inspector, Intel Parallel Amplifier and Intel Parallel Advisor. The tools created using Pin, called Pintools, can be used to perform program analysis on user space applications in Linux and Windows. Because Pin is a dynamic binary instrumentation tool, it instruments the compiled binary files at run time. Thus, it requires no recompiling of source code and can support instrumenting programs that dynamically generate code.
- Lasers Bringing Down Drones (Wired) — I’ve sat on this for a while, but it is still hypnotic. Autonomous attack, autonomous defence. Pessimist: we’ll be slaves of the better machine learning algorithm. Optimist: we can make love while the AIs make war.
- Advice on Rewriting It From Scratch — every word is true. Over my career, I’ve come to place a really strong value on figuring out how to break big changes into small, safe, value-generating pieces. It’s a sort of meta-design — designing the process of gradual, safe change.
- Creating Gmail Inbox Statistics Reports — shows how to set up Gmail to send you an email at the beginning of each month showing statistics for the previous month, such as the number of emails you received, the top 5 to whom you sent email, the top 5 from whom you received email, and charts of your daily usage.
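For a feel of what goes into a report like the inbox statistics in the last item, here is a minimal Python sketch. It is not taken from the linked post and does not talk to Gmail at all: it assumes you already have a month of messages as (sender, recipient) address pairs, and the function name `inbox_stats` and its parameters are hypothetical.

```python
from collections import Counter


def inbox_stats(messages, me, top_n=5):
    """Summarize one month of mail.

    messages: iterable of (sender, recipient) address pairs (assumed format).
    me: the mailbox owner's address.
    """
    received_from = Counter()  # who wrote to me, and how often
    sent_to = Counter()        # whom I wrote to, and how often
    for sender, recipient in messages:
        if recipient == me:
            received_from[sender] += 1
        elif sender == me:
            sent_to[recipient] += 1
    return {
        "received": sum(received_from.values()),
        "sent": sum(sent_to.values()),
        "top_senders": received_from.most_common(top_n),
        "top_recipients": sent_to.most_common(top_n),
    }
```

Calling `inbox_stats(messages, "me@example.org")` returns the message counts plus the top five correspondents in each direction; the daily usage charts would need message timestamps, which this sketch leaves out.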
Binary Instrumentation, Drone-Laser Warfare, Rocking the Rewrite, and Quantified Inbox
Nigel Shadbolt on AI, ODI, and how personal, open data could empower consumers in the 21st century.
After years of steady growth, open data is now entering into public discourse, particularly in the public sector. If President Barack Obama decides to put the White House’s long-awaited new open data mandate before the nation this spring, it will finally enter the mainstream.
As more governments, businesses, media organizations and institutions adopt open data initiatives, interest in the evidence behind release and the outcomes from it is similarly increasing. High hopes abound in many sectors, from development to energy to health to safety to transportation.
“Today, the digital revolution fueled by open data is starting to do for the modern world of agriculture what the industrial revolution did for agricultural productivity over the past century,” said Secretary of Agriculture Tom Vilsack, speaking at the G-8 Open Data for Agriculture Conference.
As other countries consider releasing their public sector information on the Internet as data in machine-readable formats, they’ll need to study and learn from years of effort at data.gov.uk, data.gov in the United States, and Kenya in Africa.
One of the crucial sources of analysis for the success or failure of open data efforts will necessarily be research institutions and academics. That’s precisely why research from the Open Data Institute and Professor Nigel Shadbolt (@Nigel_Shadbolt) will matter in the months and years ahead.
In the following interview, Professor Shadbolt and I discuss what lies ahead. His responses were lightly edited for content and clarity.
In a conversation with Q Ethan McCallum (who should be credited as co-author), we wondered how to evaluate data science groups. If you’re looking at an organization’s data science group from the outside, possibly as a potential employee, what can you use to evaluate it? It’s not a simple problem under the best of conditions: you’re not an insider, so you don’t know the full story of how many projects it has tried, whether they have succeeded or failed, relations between the data group, management, and other departments, and all the other stuff you’d like to know but will never be told.
Our starting point was remote: Q told me about Tyler Brulé’s travel writing for the Financial Times (behind a paywall, unfortunately), in which he says that a club sandwich is a good proxy for hotel quality: you go into the restaurant and order a club sandwich. A club sandwich isn’t hard to make: there’s no secret recipe or technique that’s going to make Hotel A’s sandwich significantly better than B’s. But it’s easy to cut corners on ingredients and preparation. And if a hotel is cutting corners on its club sandwiches, it’s probably cutting corners in other places.
This reminded me of when my daughter was in first grade, and we looked (briefly) at private schools. All the schools talked the same talk. But if you looked at classes, it was pretty clear that the quality of the music program was a proxy for the quality of the school. After all, it’s easy to shortchange music, and both hard and expensive to do it right. Oddly enough, using the music program as a proxy for evaluating school quality has continued to work through middle school and (public) high school. It’s the first thing to cut when the budget gets tight; and if a school has a good music program with excellent teachers, they’re probably not shortchanging the kids elsewhere.
How does this connect to data science? What are the proxies that allow you to evaluate a data science program from the “outside,” on the information that you might be able to cull from company blogs, a job interview, or even a job posting? We came up with a few ideas:
- Are the data scientists simply human search engines, or do they have real projects that allow them to explore and be curious? If they have management support for learning what can be learned from the organization’s data, and if management listens to what they discover, they’re accomplishing something significant. If they’re just playing Q&A with the company data, finding answers to specific questions without providing any insight, they’re not really a data science group.
- Do the data scientists live in a silo, or are they connected with the rest of the company? In Building Data Science Teams, DJ Patil wrote about the value of seating data scientists with designers, marketers, and the entire product group, so that they don’t do their work in isolation and can bring their insights to bear on all aspects of the company.
- When the data scientists do a study, is the outcome predetermined by management? Is it OK to say “we don’t have an answer” or to come up with a solution that management doesn’t like? Granted, you aren’t likely to be able to answer this question without insider information.
- What do job postings look like? Does the company have a mission and know what it’s looking for, or are they asking for someone with a huge collection of skills, hoping that they will come in useful? That’s a sign of data science cargo culting.
- Does management know what their tools are for, or have they just installed Hadoop because it’s what the management magazines tell them to do? Can managers talk intelligently to data scientists?
- What sort of documentation does the group produce for its projects? Like a club sandwich, it’s easy to shortchange documentation.
- Is the business built around the data? Or is the data science team an add-on to an existing company? A data science group can be integrated into an older company, but you have to ask a lot more questions; you have to worry a lot more about silos and management relations than you do in a company that is built around data from the start.
Coming up with these questions was an interesting thought experiment; we don’t know whether it holds water, but we suspect it does. Any ideas and opinions?
China Threat, China Opportunity, Open Source Sustainability, and SQL for Cohort Analysis
- China = 41% of World’s Internet Attack Traffic (Bloomberg) — numbers are from Akamai’s research. Verizon Communications said in a separate report that China accounted for 96 percent of all global espionage cases it investigated. One interpretation is that China is a rogue Internet state, but another is that we need to harden up our systems. (via ZD Net)
- Open Source Cannot Live on Donations Alone — excellent summary of some of the sustainability questions facing open source projects.
- China Startups: The Gold Rush (Steve Blank) — dense fact- and insight-filled piece. Not only is the Chinese ecosystem completely different but also the consumer demographics and user expectations are equally unique. 70% of Chinese Internet users are under 30. Instead of email, they’ve grown up with QQ instant messages. They’re used to using the web and increasingly the mobile web for everything, commerce, communication, games, etc. (They also probably haven’t seen a phone that isn’t mobile.) By the end of 2012, there were 85 million iOS and 160 million Android devices in China. And they were increasing at an aggregate 33 million iOS and Android activations per month.
- Calculating Rolling Cohort Retention with SQL — just what it says. (via Max Lynch)
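The linked post on cohort retention works in SQL; as a rough illustration of the underlying idea, here is a minimal Python sketch. It is an assumption-laden stand-in, not the post's query: periods are plain integers (say, month indexes), the function name `cohort_retention` and its input shapes are hypothetical, and it computes ordinary per-cohort retention without trying to reproduce the post's exact definition of "rolling."

```python
from collections import defaultdict


def cohort_retention(signups, activity):
    """Fraction of each signup cohort seen N periods after signing up.

    signups: {user: signup_period} (assumed format, integer periods).
    activity: iterable of (user, period) events.
    Returns {cohort_period: {offset: fraction_of_cohort_active}}.
    """
    # Group users by the period in which they signed up.
    cohort_users = defaultdict(set)
    for user, start in signups.items():
        cohort_users[start].add(user)

    # Collect distinct active users per (cohort, periods-since-signup).
    active = defaultdict(set)
    for user, period in activity:
        start = signups.get(user)
        if start is None or period < start:
            continue  # unknown user, or activity before signup
        active[(start, period - start)].add(user)

    table = defaultdict(dict)
    for (start, offset), users in active.items():
        table[start][offset] = len(users) / len(cohort_users[start])
    return {c: dict(sorted(row.items())) for c, row in sorted(table.items())}
```

Each row of the result is one cohort, and each column is the number of periods since signup, which is the usual shape of a retention table.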
If we want kids to aspire to become scientists and technologists, celebrate academic achievement like athletics and celebrity.
There are few ways to better judge a nation’s character than to look at how its children are educated. What values do their parents, teachers and mentors demonstrate? What accomplishments are celebrated? In a world where championship sports teams are idolized and superstar athletes are feted by the media, it was gratifying to see science, students and teachers get their moment in the sun at the White House last week.
“…one of the things that I’m concerned about is that, as a culture, we’re great consumers of technology, but we’re not always properly respecting the people who are in the labs and behind the scenes creating the stuff that we now take for granted,” said President Barack Obama, “and we’ve got to give the millions of Americans who work in science and technology not only the kind of respect they deserve but also new ways to engage young people.”
An increasingly fierce global competition for talent and natural resources has put a premium on developing scientists and engineers in the nation’s schools. On that count, last week the President announced a plan to promote careers in the sciences and expand federal and private-sector initiatives to encourage students to study STEM.
“America has always been about discovery, and invention, and engineering, and science and evidence,” said the President, last week. “That’s who we are. That’s in our DNA. That’s how this country became the greatest economic power in the history of the world. That’s how we’re able to provide so many contributions to people all around the world with our scientific and medical and technological discoveries.”
I just read a Forbes article about Glass, talking about the split between those who are “sure that it is the future of technology, and others who think society will push back against the technology.”
I don’t see this as a dichotomy (and, to be fair, I’m not sure that the author does either). I expect to see both, and I’d like to think a bit more about what these two apparently opposing sides mean.
Push back is inevitable. I hope there’s a significant push back, and that it has some results. Not because I’m a Glass naysayer, but because we, as technology users, are abused so often, and push back so weakly, that it’s not funny. Facebook does something outrageous; a few technorati whine; they add option 1023 to their current highly intertwined 1022 privacy options that have been designed so they can’t be understood or used effectively; and sooner or later, it all dies down. A hundred fifty users have left Facebook, and half a million more have joined. When Apple puts another brick in their walled garden, a few dozen users (myself included) bitch and moan, but does anyone leave? Personally, I’m tired of getting warnings whenever I install software that doesn’t come from the Mac App Store (I’ve used the Store exactly twice), and I absolutely expect that a not-too-distant version of OS X won’t allow me to install software from “untrusted” sources, including software I’ve written. Will there be push back? Probably. Will it be effective? I don’t know; if things go as they are now, I doubt it.
There will be push back against Glass; and that’s a good thing. I think Google, of all the companies out there, is most likely to listen and respond positively. I say that partly because of efforts like the Data Liberation Front, and partly because Eric Schmidt has acknowledged that he finds many aspects of Glass creepy. But going beyond Glass: As a community of users, we need to empower ourselves to push back. We need to be able to push back effectively against Google, but more so against Apple, Facebook, and many other abusers of our data, rather than passively accept the latest intrusion as an inevitability. If Glass does nothing more than teach users that they can push back, and teach large corporations how to respond constructively, it will have accomplished much.
Is Glass the future? Yes; at least, something like Glass is part of the future. As a species, we’re not very good at putting our inventions back into the box. About three years ago, there was a big uptick in interest in augmented reality. You probably remember: Wikitude, Layar, and the rest. You installed those apps on your phone. They’re still there. You never use them (at least, I don’t). The problem with consumer-grade AR up until now has been that it was sort of awkward walking around looking at things through your phone’s screen. (Commercial AR, meaning heads-up displays and the like, is a completely different ball game.) Glass is the first attempt at a broadly useful platform for consumer AR; it’s a game changer.
Could Glass fail? Sure; I know more failed startups than I can count where the engineers did something really cool, and when they released it, the public said “what is that, and why do you think we’d want it?” Google certainly isn’t immune from that disease, which is endemic to an engineering-driven culture; just think back to Wave. I won’t deny that Google might shelve Glass if they consider it unproductive, as they’ve shelved many popular applications. But I believe that Google is playing long-ball here, and thinking far beyond 2014 or 2015. In a conversation about Bitcoin last week, I said that I doubt it will be around in 20 years. But I’m certain we will have some kind of distributed digital currency, and that currency will probably look a lot like Bitcoin. Glass is the same. I have no doubt that something like Glass is part of our future. It’s a first, tentative, and very necessary step into a new generation of user interfaces, a new way of interacting with computing systems and integrating them into our world. We probably won’t wear devices around on our glasses; they may well be surgically implanted. But the future doesn’t happen if you only talk about hypothetical possibilities. Building the future requires concrete innovation, building inconvenient and “creepy” devices that nevertheless point to the next step. And it requires people pushing back against that innovation, to help developers figure out what they really need to build.
Glass will be part of our future, though probably not in its current form. And push back from users will play an essential role in defining the form it will eventually take.