Mike Loukides

Mike Loukides is Vice President of Content Strategy for O'Reilly Media, Inc. He's edited many highly regarded books on technical subjects that don't involve Windows programming. He's particularly interested in programming languages, Unix and what passes for Unix these days, and system and network administration. Mike is the author of "System Performance Tuning" and a coauthor of "Unix Power Tools." Most recently, he's been fooling around with data and data analysis, languages like R, Mathematica, and Octave, and thinking about how to make books social.

Mining the social web, again

If you want to engage with the data that's surrounding you, Mining the Social Web is the best place to start.

When we first published Mining the Social Web, I thought it was one of the most important books I worked on that year. Now that we’re publishing a second edition (which I didn’t work on), I find that I agree with myself. With this new edition, Mining the Social Web is more important than ever.

While we’re seeing more and more cynicism about the value of data, and particularly “big data,” that cynicism isn’t shared by most people who actually work with data. Data has undoubtedly been overhyped and oversold, but the best way to arm yourself against the hype machine is to start working with data yourself, to find out what you can and can’t learn. And there’s no shortage of data around. Everything we do leaves a cloud of data behind it: Twitter, Facebook, Google+ — to say nothing of the thousands of other social sites out there, such as Pinterest, Yelp, Foursquare, you name it. Google is doing a great job of mining your data for value. Why shouldn’t you?

There are few better ways to learn about mining social data than by starting with Twitter; Twitter is really a ready-made laboratory for the new data scientist. And this book is without a doubt the best and most thorough approach to mining Twitter data out there. Read more…

Announcing BioCoder

An O'Reilly newsletter covering the biology revolution and connecting the many people working in DIY bio.

We’re pleased to announce BioCoder, a newsletter on the rapidly expanding field of biology. We’re focusing on DIY bio and synthetic biology, but we’re open to anything that’s interesting.

Why biology? Why now? Biology is currently going through a revolution as radical as the personal computer revolution. Up until the mid-70s, computing was dominated by large, extremely expensive machines that were installed in special rooms and operated by people wearing white lab coats. Programming was the domain of professionals. That changed radically with the advent of microprocessors, the homebrew computer club, and the first generation of personal computers. I put the beginning of the shift in 1975, when a friend of mine built a computer in his dorm room. But whenever it started, the phase transition was thorough and radical. We’ve built a new economy around computing: we’ve seen several startups become gigantic enterprises, and we’ve seen several giants collapse because they couldn’t compete with the more nimble startups.

We’re seeing the same patterns in biology today. You can build homebrew lab equipment for a fraction of the price of commercial equipment; we’re seeing amateurs do meaningful research and experimentation; and we’re seeing new tools that radically drop the cost of experimentation. We’re also seeing new startups that have the potential for changing the economy as radically as the advent of inexpensive computing.

BioCoder is the newsletter of the biology revolution. Read more…

Genetically modified foods: asking the right questions

Problems with GM foods lie not in genetics, but in the structure of industrial farming.

Monarch butterfly, photo by Mike Loukides

A while ago, I read an article in Mother Jones: GM Crops Are Killing Monarch Butterflies, After All. Given the current concerns about genetically modified foods, it was predictable — and wrong, in a way that’s important. If you read the article rather than the headline, you’ll find out what was really going on. Farmers planted Monsanto’s Roundup Ready corn and soybeans. These plants have been genetically modified so that they’re not damaged by the weed killer Roundup. Then the farmers doused their fields with heavy applications of Roundup, killing the milkweed on which Monarch caterpillars live. As a result: fewer butterflies.

But that’s really not what the headline said. The GM crops didn’t kill the butterflies — abuse of a herbicide did. It’s very important to distinguish between first order and second order effects. The milkweed would be just as dead if the farmers applied the Roundup directly to the milkweed. And, assuming that the farmers are trying to kill weeds other than milkweed (which only grows at the edges of the field), the caterpillars would survive if farmers applied Roundup more precisely, just to the crops they were trying to protect. Is it safe to eat corn that’s been genetically modified so that it’s Roundup resistant? I have no problem with the genetics; but you might think twice about eating corn that has been doused with a potent herbicide. Do you wash your food carefully? Good.

Read more…

What is an enterprise, anyway?

However one defines "enterprise," what really matters is an organization's culture.

This post was co-authored by Mike Loukides and Bill Higgins.

Bill Higgins of IBM and I have been working on an article about DevOps in the enterprise. DevOps is most closely associated with Internet giants and web startups, but increasingly we are observing companies we lump under the banner of “enterprises” trying, and often struggling, to adopt the sorts of DevOps culture and practices we see at places like Etsy. As we tried to catalog the success and failure patterns of DevOps adoption in the enterprise, we ran into an interesting problem: we couldn’t precisely define what makes a company an enterprise. Without a well-understood context, it was hard to diagnose inhibitors or to prescribe any particular advice.

So, we decided to pause our article and turn our minds to the question “What is an enterprise, anyway?” We first tried to define an enterprise based on its attributes, but as you’ll see, these are problematic:

More than N employees
Definitions like this don’t interest us. What changes magically when you cross the line between 999 and 1,000 employees? Or 9,999 and 10,000? Wherever you put the line, it’s arbitrary. I’ll grant that 30-person companies work differently from 10,000-person companies, and that 100-person companies have often adopted the overhead and bureaucracy of 10,000-person companies (not a pretty sight). But drawing an arbitrary line in the sand isn’t helpful.

Read more…

Shakespeare and the myth of publishing

Reinventing publishing: what can we do now that we're no longer tied to the myth of stable literary objects?

Note: this post started as a Foo Camp 2013 session.

A few weeks ago, Tim O’Reilly sent around a link to Who Edited Shakespeare?, which discussed the editor for the First Folio edition of Shakespeare’s plays. It included a lot of evidence that someone had done a lot of work regularizing spelling and doing other tasks that we’d now assign to a copyeditor or a proofreader, presumably more work than the Folio’s nominal editors, Heminges and Condell, were inclined to do or capable of doing.

It’s an interesting argument that prompted some thoughts about the nature of publishing. The process of editing creates the impression, the mythology, that a carefully crafted, consistent, and stable text exists for these plays, that the plays are static literary objects. We like to think that there is a “good” Shakespeare text, if only we had it: what Shakespeare actually wrote, and what was actually performed on stage. We have a mess of good quarto editions, bad quartos, the First Folio, apocryphal works, and more. Some versions of the plays are significantly longer than others; some scholars believe that we’re missing significant parts of Macbeth (Shakespeare’s shortest tragedy, for which the First Folio is the only source). Perhaps the worst case is Christopher Marlowe’s Doctor Faustus, which is known entirely through two early print editions, one roughly 50% longer than the other.

I’m skeptical about whether the search for a hypothetical authoritative version of Shakespeare’s text is meaningful. Shakespeare’s plays were, first and foremost, plays: they were performances staged before a live audience. If you’ve had any involvement with theater, you can imagine how that goes: “Act III, Scene iv dragged; let’s cut it next time. Act V, Scene i was great, but too short; let’s fill it out some.” The plays, as staged events, were infinitely flexible. In the years after Shakespeare, poor editors have certainly done a lot to mangle them, but I’m sure that Shakespeare himself, as a theater professional and partner in a theater company, was constantly messing around with the text.

Read more…

Data Science for Business

What business leaders need to know about data and data analysis to drive their businesses forward.

A couple of years ago, Claudia Perlich introduced me to Foster Provost, her PhD adviser. Foster showed me the book he was writing with Tom Fawcett, and using in his teaching at NYU.

Foster and Tom have a long history of applying data to practical business problems. Their book, which evolved into Data Science for Business, was different from all the other data science books I’ve seen. It wasn’t about tools: Hadoop and R are scarcely mentioned, if at all. It wasn’t about coding: business students don’t need to learn how to implement machine learning algorithms in Python. It is about business: specifically, it’s about the data analytic thinking that business people need to work with data effectively.

Data analytic thinking means knowing what questions to ask, how to ask those questions, and whether the answers you get make sense. Business leaders don’t (and shouldn’t) do the data analysis themselves. But in this data-driven age, it’s critically important for business leaders to understand how to work with the data scientists on their teams. Read more…

The web performance I want

Cruftifying web pages is not what Velocity is about.

There’s been a lot said and written about web performance since the Velocity conference, and there have been steps both forward and back: is the web getting faster? Are developers using increased performance to add more useless gunk to their pages, taking back performance gains almost as quickly as they’re achieved?

I don’t want to leap into that argument; Arvind Jain did a good job of discussing the issues at Velocity Santa Clara and in a blog post on Google’s analytics site. But I do want to discuss (all right, flame about) one issue that bugs me.

I see a lot of pages that appear to load quickly. I click on a site, and within a second, I have an apparently readable page.

“Apparently,” however, is a loaded word because a second later, some new component of the page loads, causing the browser to re-layout the page, so everything jumps around. Then comes the pop-over screen, asking if I want to subscribe or take a survey. (Most online renditions of print magazines: THIS MEANS YOU!). Then another resize, as another component appears. If I want to scroll down past the lead picture, which is usually uninteresting, I often find that I can’t because the browser is still laying out bits and pieces of the page. It’s almost as if the developers don’t want me to read the page. That’s certainly the effect they achieve.

Read more…

On Batteries and Innovation

Despite reports of breakthroughs in battery technology, the hard problems of battery innovation remain hard.

Lately there’s been a spate of articles about breakthroughs in battery technology. Better batteries are important, for any of a number of reasons: electric cars, smoothing out variations in the power grid, cell phones, and laptops that don’t need to be recharged daily.

All of these nascent technologies are important, but some of them leave me cold, and in a way that seems important. It’s relatively easy to invent new technology, but a lot harder to bring it to market. I’m starting to understand why. The problem isn’t just commercializing a new technology — it’s everything that surrounds that new technology.

Take an article like Battery Breakthrough Offers 30 Times More Power, Charges 1,000 Times Faster. For the purposes of argument, let’s assume that the technology works; I’m not an expert on the chemistry of batteries, so I have no reason to believe that it doesn’t. But then let’s take a step back and think about what a battery does. When you discharge a battery, you’re using a chemical reaction to create electrical current (which is moving electrical charge). When you charge a battery, you’re reversing that reaction: you’re essentially taking the current and putting that back in the battery.

So, if a battery is going to store 30 times as much power and charge 1,000 times faster, that means that the wires that connect to it need to carry 30,000 times more current. (Let’s ignore questions like “faster than what?,” but most batteries I’ve seen take between two and eight hours to charge.) It’s reasonable to assume that a new battery technology might be able to store electrical charge more efficiently, but the charging process is already surprisingly efficient: on the order of 50% to 80%, but possibly much higher for a lithium battery. So improved charging efficiency isn’t going to help much — if charging a battery is already 50% efficient, making it 100% efficient only improves things by a factor of two. How big are the wires for an automobile battery charger? Can you imagine wires big enough to handle thousands of times as much current? I don’t think Apple is going to make any thin, sexy laptops if the charging cable is made from 0000 gauge wire (roughly 1/2 inch thick, capacity of 195 amps at 60 degrees C). And I certainly don’t think, as the article claims, that I’ll be able to jump-start my car with the battery in my cell phone — I don’t have any idea how I’d connect a wire with the current-handling capacity of a jumper cable to any cell phone I’d be willing to carry, nor do I want a phone that turns into an incendiary firebrick when it’s charged, even if I only need to charge it once a year.
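The back-of-the-envelope arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a model of any real battery: it assumes charging current scales with charge stored divided by charging time, the 2-amp baseline is a made-up figure, and the function names are hypothetical. The 30x, 1,000x, 50%, and 195-amp numbers come from the text.

```python
# Back-of-the-envelope sketch of the arithmetic in the text.
# Assumption: charging current scales as (charge stored) / (time to charge).

def current_scale(capacity_factor: float, speed_factor: float) -> float:
    """Factor by which charging current grows when a battery stores
    `capacity_factor` times as much and charges `speed_factor` times faster."""
    return capacity_factor * speed_factor


def efficiency_gain(current_efficiency: float, ideal_efficiency: float = 1.0) -> float:
    """Best-case improvement from raising charging efficiency to the ideal."""
    return ideal_efficiency / current_efficiency


scale = current_scale(30, 1000)   # figures from the article's headline
gain = efficiency_gain(0.5)       # 50% efficient today, 100% at best

BASELINE_AMPS = 2.0        # hypothetical charger current, for illustration only
WIRE_CAPACITY_AMPS = 195   # 0000 gauge wire at 60 degrees C, as quoted in the text

# Even with perfect efficiency, the wires still carry scale / gain times the baseline.
required_amps = BASELINE_AMPS * scale / gain
wires_needed = required_amps / WIRE_CAPACITY_AMPS

print(f"Current must grow by a factor of {scale:,.0f}")
print(f"Perfect efficiency buys back only a factor of {gain:.0f}")
print(f"Required current: {required_amps:,.0f} A, "
      f"or about {wires_needed:.0f} wires of 0000 gauge")
```

Whatever baseline you pick, the ratio is what matters: a factor of two from better efficiency barely dents a factor of 30,000.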

Read more…

Networked Things?

The magic starts when household devices can communicate over a network.

Well over a decade ago, Bill Joy was mocked for talking about a future that included network-enabled refrigerators. That was both unfair and unproductive, and since then, I’ve been interested in a related game: take the most unlikely household product you can and figure out what you could do if it were network-enabled. That might have been a futuristic exercise in 1998, but the future is here. Now. And there are few reasons we couldn’t have had that future back then, if we’d had the vision.

So, what are some of the devices that could be Internet-enabled, and what would that mean? We’re already familiar with the Nest; who would have thought even five years ago that we’d have Internet-enabled thermostats?

Read more…

Phishing in Facebook’s Pond

Facebook scraping could lead to machine-generated spam so good that it's indistinguishable from legitimate messages.

A recent blog post inquired about the incidence of Facebook-based spear phishing: the author suddenly started receiving email that appeared to be from friends (though it wasn’t posted from their usual email addresses), making the usual kinds of offers and asking him to click on the usual links. He wondered whether this was a phenomenon and how it happened — how does a phisherman get access to your Facebook friends?

The answers are “yes, it happens” and “I don’t know, but it’s going to get worse.” Seriously, my wife’s name has been used in Facebook phishing. A while ago, several of her Facebook friends said that her email account had been hacked. I was suspicious; she only uses Gmail, and hacking Google isn’t easy, particularly with two-factor authentication. So, I asked her friends to send me the offending messages. It was obvious that they hadn’t come from my wife’s account; they were Yahoo accounts with her name but an unrecognizable email address, exactly what this blogger had seen.

How does this happen? How can a phisher discover your name and your Facebook friends? I don’t know, but Facebook is such a morass of weird and conflicting security settings that it’s impossible to know just how private or how public you are. If you’ve ever friended people you don’t know (a practice that remains entirely too common), and if you’ve ever enabled visibility to friends of friends, you have no idea who has access to your conversations.

Read more…
