- pineapple.io — attempt to crowdsource rankings for tutorials for important products, so you’re not picking your way through Google search results littered with tutorials written by incompetent illiterates for past versions of the software.
- BBC Forum — American social psychologist Aleks Krotoski has been looking at how the internet affects the way we talk to ourselves. Podcast (available for next 30 days) from BBC. (via Vaughan Bell)
- Why Can’t My Computer Understand Me (New Yorker) — using anaphora as the basis of an intelligence test, as example of what AI should be striving for. It’s not just that contemporary A.I. hasn’t solved these kinds of problems yet; it’s that contemporary A.I. has largely forgotten about them. In Levesque’s view, the field of artificial intelligence has fallen into a trap of “serial silver bulletism,” always looking to the next big thing, whether it’s expert systems or Big Data, but never painstakingly analyzing all of the subtle and deep knowledge that ordinary human beings possess. That’s a gargantuan task— “more like scaling a mountain than shoveling a driveway,” as Levesque writes. But it’s what the field needs to do.
- 507 Mechanical Movements — an old basic engineering textbook, animated. Me gusta.
As companies continue to use crowdsourcing, demand for people who know how to manage projects remains steady
A little over four years ago, I attended the first Crowdsourcing meetup at the offices of Crowdflower (then called Dolores Labs). The crowdsourcing community has grown explosively since that initial gathering, and there are now conference tracks and conferences devoted to this important industry. At the recent CrowdConf, I found a community of professionals who specialize in managing a wide array of crowdsourcing projects.
Data scientists were early users of crowdsourcing services. I personally am most familiar with a common use case – the use of crowdsourcing to create labeled data sets for training machine-learning models. But as straightforward as it sounds, using crowdsourcing to generate training sets can be tricky – fortunately there are excellent papers and talks on this topic. At the most basic level, before embarking on a crowdsourcing project you should go through a simple checklist (among other things, make sure you have enough scale to justify engaging with a provider).
Beyond building training sets for machine learning, crowdsourcing has more recently been used to enhance the results of machine-learning models: in active learning, humans take care of the uncertain cases while models handle the routine ones. The use of ReCAPTCHA to digitize books is an example of this approach. On the flip side, analytics are being used to predict the outcome of crowd-based initiatives: researchers developed models to predict the success of Kickstarter campaigns 4 hours after their launch.
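The split between routine and uncertain cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `route_by_confidence`, `toy_predict`, and the 0.8 threshold are hypothetical names and values, not taken from any crowdsourcing provider or from the projects mentioned above. A real system would plug in a trained model and send the uncertain items to crowd workers, then feed their answers back into training.

```python
from typing import Callable, Iterable, List, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])


def route_by_confidence(
    items: Iterable[str],
    predict: Callable[[str], Prediction],
    threshold: float = 0.8,  # assumed cutoff, tune per project
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split items into model-labeled results and a queue for human review."""
    auto_labeled: List[Tuple[str, str]] = []
    needs_human: List[str] = []
    for item in items:
        label, confidence = predict(item)
        if confidence >= threshold:
            # Routine case: accept the model's label.
            auto_labeled.append((item, label))
        else:
            # Uncertain case: hand the item to the crowd.
            needs_human.append(item)
    return auto_labeled, needs_human


def toy_predict(text: str) -> Prediction:
    """Hypothetical stand-in for a trained classifier."""
    lowered = text.lower()
    if "free money" in lowered:
        return ("spam", 0.95)
    if "meeting" in lowered:
        return ("ham", 0.90)
    return ("ham", 0.55)  # low confidence: the model is unsure
```

Calling `route_by_confidence(messages, toy_predict)` returns the pairs the model labeled on its own and the list of items still waiting for a person. Raising the threshold sends more work to humans, and lowering it trusts the model more.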
Better Tutorials, Self-Talk, Better AI, and Visualised Mechanics
Good Dev, User-Hostile Patterns, Patent Victories, and Drone History
- What to Look For in Software Dev (Pamela Fox) — It’s important to find a job where you get to work on a product you love or problems that challenge you, but it’s also important to find a job where you will be happy inside their codebase – where you won’t be afraid to make changes and where there’s a clear process for those changes.
- The Slippery Slope to Dark Patterns — demonstrates and deconstructs determinedly user-hostile pieces of software which deliberately break Nielsen’s usability heuristics to make users agree to things they rationally wouldn’t.
- Victory Lap for Ask Patents (Joel Spolsky) — story of how a StackExchange board on patents helped bust a bogus patent. It’s crowdsourcing the prior art, and Joel shows how easy it is.
- The World as Fire-Free Zone (MIT Technology Review) — data analysis to identify “signature” of terrorist behaviour, civilian deaths from strikes in territories the US has not declared war on, empty restrictions on use. Again, it’s a test that, by design, cannot be failed. Good history of UAVs in warfare and the blowback from their lax use. Quoting retired General Stanley McChrystal: The resentment caused by American use of unmanned strikes … is much greater than the average American appreciates. They are hated on a visceral level, even by people who’ve never seen one or seen the effects of one.
In-Browser p2p, Thinking About The Future, Disruptive Tech, and Crowdsourcing Transcription
- ShareFest — peer-to-peer file sharing in the browser. Source on GitHub. (via Andy Baio)
- Media for Thinking the Unthinkable (Bret Victor) — “Right now, today, we can’t see the thing, at all, that’s going to be the most important 100 years from now.” We cannot see the thing. At all. But whatever that thing is — people will have to think it. And we can, right now, today, prepare powerful ways of thinking for these people. We can build the tools that make it possible to think that thing. (via Matt Jones)
- McKinsey Report on Disruptive Technologies (McKinsey) — the list: Mobile Internet; Automation of knowledge work; Internet of Things; Cloud technology; Advanced Robotics; Autonomous and near-autonomous vehicles; Next-generation genomics; Energy storage; 3D Printing; Advanced Materials; Advanced Oil and Gas exploration and recovery; Renewable energy.
- The Only Public Transcript of the Bradley Manning Trial Will be Produced on a Crowd-Funded Typewriter — [t]he fact that a volunteer stenographer is providing the only comprehensive source of information about such a monumental event is pretty absurd.
Quality and security drive adoption, but community is rising fast
I recently talked to two managers of Black Duck, the first company formed to help organizations deal with the licensing issues involved in adopting open source software. With Tim Yeaton, President and CEO, and Peter Vescuso, Executive Vice President of Marketing and Business Development, I discussed the seventh Future of Open Source survey, from which I’ll post a few interesting insights later. But you can look at the slides for yourself, so this article will focus instead on some of the topics we talked about in our interview. While I cite some ideas from Yeaton and Vescuso, many of the observations below are purely my own.
The spur to collaboration
One theme in the slides is the formation of consortia that develop software for entire industries. One recent example everybody knows about is OpenStack, but many industries have their own impressive collaboration projects, such as GENIVI in the auto industry.
What brings competitors together to collaborate? In the case of GENIVI, it’s the impossibility of any single company meeting consumer demand through its own efforts. Car companies typically take five years to put a design out to market, but customers are used to product releases more like those of cell phones, where you can find something enticingly new every six months. In addition, the range of useful technologies (Bluetooth, etc.) is so big that a company has to become expert at everything at once. Meanwhile, according to Vescuso, the average high-end car contains more than 100 million lines of code. So the pace and complexity of progress are driving the auto industry to work together.
All too often, the main force uniting competitors is the fear of another vendor and the realization that they can never beat a dominant vendor on its own turf. Open source becomes a way of changing the rules out from under the dominant player. OpenStack, for instance, took on VMware in the virtualization space and Amazon.com in the IaaS space. Android attracted phone manufacturers and telephone companies as a reaction to the iPhone.
A valuable lesson can be learned from the history of the Open Software Foundation, which was formed in reaction to an agreement between Sun and AT&T. In the late 1980s, Sun had become the dominant vendor of Unix, which was still being maintained by AT&T. Their combination panicked vendors such as Digital Equipment Corporation and Apollo Computer (you can already get a sense of how much good OSF did them), who promised to create a single, unified standard that would give customers increased functionality and more competition.
The name Open Software Foundation was deceptive, because it was never open. Instead, it was a shared repository into which various companies dumped bad code so they could cynically claim to be interoperable while continuing to compete against each other in the usual way. It soon ceased to exist in its planned form, but did survive in a fashion by merging with X/Open to become the Open Group, an organization of some significance because it maintains the X Window System. Various flavors of BSD failed to dislodge the proprietary Unix vendors, probably because each BSD team did its work in a fairly traditional, closed fashion. It remained up to Linux, a truly open project, to unify the Unix community and ultimately replace the closed Sun/AT&T partnership.
Collaboration can be driven by many things, therefore, but it usually takes place in one of two fashions. In the first, somebody throws out into the field some open source code that everybody likes, as Rackspace and NASA did to launch OpenStack, or IBM did to launch Eclipse. Less common is the GENIVI model, in which companies realize they need to collaborate to compete and then start a project.
A bigger pie for all
The first thing on most companies’ minds when they adopt open source is to improve interoperability and defend themselves against lock-in by vendors. The Future of Open Source survey indicates that the top reasons for choosing open source are its quality (slide 13) and security (slide 15). This is excellent news because it shows that the misconceptions about open source are shattering, and the arguments by proprietary vendors that they can ensure better quality and security will increasingly be seen as hollow.
Know Your HTTP, Digital Exploitation, Insecure Webcams, and CS Courses
- Know Your HTTP Posters (GitHub) — A0-posters about the HTTP protocol.
- Crowdserfing — when a large corp uses crowd-sourced volunteering for its own financial gain, without giving back. It offends my sense of reciprocity as well, but nobody is coerced into using Google Maps or contributing data to it. How do we decide what is “right”?
- Exposed Webcam Viewer — hotels in Russia, lobbies in California, and blinking lights in the darkness from all around the world. (via Hacker News)
- Beauty and Joy of Computing — an introductory computer science curriculum developed at the University of California, Berkeley, intended for non-CS majors at the high school junior through undergraduate freshman level. Uses Snap, a web-based implementation of Scratch.
Cite Spam, Astro Science Labs, Citizen Science, and Accelerating Research
- Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting (PDF) — scholarly paper on how to citespam your paper up Google Scholar’s results list. Fortunately calling your paper “AAAAAA In-vitro Qualia of …” isn’t one of the winning techniques.
- Seamless Astronomy — brings together astronomers, computer scientists, information scientists, librarians and visualization experts involved in the development of tools and systems to study and enable the next generation of online astronomical research.
- Eye Wire — a citizen science game where you map the 3D structure of neurons.
- Open Science is a Research Accelerator (Nature Chemistry) — challenge was: get rid of this bad-tasting compound from malaria medicine, without raising cost. Did it with open notebooks and collaboration, including LinkedIn groups. Lots of good reflection on advertising, engaging, and speed.
Comms 101, RoboTurking, Geek Tourism, and Implementing Papers
- How to Redesign Your App Without Pissing Everybody Off (Anil Dash) — the basic straightforward stuff that gets your users on-side. Anil’s making a career out of being an adult.
- Clockwork Raven (Twitter) — open source project to send data analysis tasks to Mechanical Turkers.
- Updates from the Tour in China (Bunnie Huang) — my dream geek tourism trip: going around Chinese factories and bazaars with MIT geeks.
- How to Implement an Algorithm from a Scientific Paper — I have implemented many complex algorithms from books and scientific publications, and this article sums up what I have learned while searching, reading, coding and debugging. (via Siah)
MOOCs get the attention, but DIY and peer-to-peer exchange are more fertile grounds for development
Somehow, recently, a lot of people have taken an interest in the broadcast of canned educational materials, and this practice — under a term that proponents and detractors have settled on, massive open online course (MOOC) — is getting a publicity surge. I know that the series of online classes offered by Stanford proved to be extraordinarily popular, leading to the foundation of Udacity and a number of other companies. But I wish people would stop getting so excited over this transitional technology. The attention drowns out two truly significant trends in progressive education: do-it-yourself labs and peer-to-peer exchanges.
In the current opinion torrent, Clay Shirky treats MOOCs in a recent article, and Joseph E. Aoun, president of Northeastern University, writes (in a Boston Globe subscription-only article) that traditional colleges will have to deal with the MOOC challenge. Jon Bruner points out on Radar that non-elite American institutions could use a good scare (although I know a lot of people whose lives were dramatically improved by attending such colleges). The December issue of Communications of the ACM offers Professor Richard A. DeMillo from the Georgia Institute of Technology assessing the possible role of MOOCs in changing education, along with an editorial by editor-in-chief Moshe Y. Vardi culminating with, “If I had my wish, I would wave a wand and make MOOCs disappear.”
There’s a popular metaphor for this early stage of innovation: we look back to the time when film-makers made the first moving pictures with professional performers by setting up cameras before stages in theaters. This era didn’t last long before visionaries such as Georges Méliès, D. W. Griffith, Sergei Eisenstein, and Luis Buñuel uncovered what the new medium could do for itself. How soon will colleges get tired of putting lectures online and offer courses that take advantage of new media?
3D Printing Booth, Crowdsourcing Nanoscience, Mobile Numbers, and Web Techniques
- 3D Printing Photobooth Opening in Japan (io9) — A technician at the lab will scan your body (much like with early photography, you’ll need to be able to hold a certain pose for 15 minutes) and print out an impressively realistic 3D photo that captures not only your features, but also the basic textures of your clothing and hair. (via Julie Starr)
- Feynman Flowers — crowdsourcing analysis of STM imagery for nanoscale physics research. (via OKFN)
- Mobile Trends — Android on exponential growth vs iOS’s linear growth, and many more data-driven observations. Apple has a mobile product at every $50 price point between $0 and $850.
- The Definitive Guide to Forms-Based Website Authentication (Stack Overflow) — exactly what the title says.