Over the last six months, I’ve had a number of conversations about lab practice. In one, Tim Gardner of Riffyn told me about a gene transformation experiment he did in grad school. As he was new to the lab, he asked two more experienced scientists for their protocol: one said it must be done exactly at 42°C for 45 seconds, the other said exactly 37°C for 90 seconds. When he ran the experiment, Tim discovered that the temperature actually didn’t matter much. A broad range of temperatures and times would work.
In an unrelated conversation, DJ Kleinbaum of Emerald Cloud Lab told me about students who would only use their “lucky machine” in their work. Why, given a choice of lab equipment, did one of two apparently identical machines give “good” results for some experiment, while the other one didn’t? Nobody knew. Perhaps it was the tubing that connects the machine to the rest of the experiment; perhaps it was some valve somewhere; perhaps it was some quirk of the machine’s calibration.
The more people I talked to, the more stories I heard: labs where the experimental protocols weren’t written down, but were handed down from mentor to student. Labs where there was a shared common knowledge of how to do things, but where that shared culture never made it outside, not even to the lab down the hall. There’s no need to write down or publish stuff that’s “obvious” or that “everyone knows.” To someone who is more familiar with literature than with biology labs, this behavior is immediately recognizable: we’re in the land of mythology, not science. Each lab has its own ritualized behavior that “works.” Whether it’s protocols, lucky machines, or common knowledge that’s picked up by every student in the lab (but which might not be the same from lab to lab), the process of doing science is an odd mixture of rigor and folklore. Everybody knows that you use 42°C for 45 seconds, but nobody really knows why. It’s just what you do.
Despite all of this, we’ve gotten fairly good at doing science. But to get even better, we have to go beyond mythology and folklore. And getting beyond folklore requires change: changes in how we record data, changes in how we describe experiments, and perhaps most importantly, changes in how we publish results.

There are many variables in a science experiment, many more than can be collected in a lab notebook. That’s the problem, but it’s also the solution. In the 1950s, scientists were limited in the data they could record. They had that notebook, and little more, and their analytical tools were limited to what could be done on a slide rule or a mechanical desktop calculator. (Indeed, the historical origin of statistics has nothing to do with “big data.” Statistics developed as a way of understanding variations in results when data was scarce, where a few dozen data points was a lot.) In the 70s, computers started to make their way into laboratories, and we started to build interfaces to collect data from experiments. We still couldn’t collect much data, by modern standards: a computer with 256K of RAM and 2 MB of disk space was still a big deal. But it was still more than you could scribble down in a notebook.
30-odd years later, we have experiments that throw off terabytes and even petabytes of data per day, and we have the tools to analyze that data. But we have problems with reproducibility, arguably more problems than we had years ago. I don’t believe the problem is that we don’t sacrifice enough chickens. Rather, we haven’t been radical enough about what data to collect. We’ve automated some of the data collection, but we still don’t collect all (or even most) of the data that’s potentially available: intermediate results from each step, calibration data from each piece of equipment, detailed descriptions of the process, and the configuration of every piece of equipment. Kleinbaum told me that some experiments are sensitive to whether you use glass or plastic test tubes. That makes sense: it’s easy to scratch plastics, and microbes can hide in scratches that are invisible to the human eye. Plastics can also release trace amounts of gases long after they’re manufactured, or absorb some of the compounds you want to measure; for some experiments, that matters, for others, it doesn’t. Few scientists would consider the test tubes used, the pipettes, and so on, as part of the experimental data. That must change if we’re going to solve our problems with reproducibility.
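One way to make equipment and consumables part of the experimental record is to capture them in a structured document that travels with each measurement. Here is a minimal sketch; all field names and values are invented for illustration, not any particular lab system’s schema:

```python
import json

# A hypothetical record that captures not just the measurement, but the
# full context it was produced in: equipment identity, calibration date,
# and the consumables used (glass vs. plastic tubes can matter).
record = {
    "measurement": {"od600": 0.42, "timestamp": "2014-09-01T14:03:00Z"},
    "equipment": {
        "spectrophotometer": {"id": "spec-07", "last_calibrated": "2014-08-15"},
    },
    "consumables": {
        "tube_material": "glass",   # part of the data, not an afterthought
        "pipette_lot": "P-2291",
    },
    "protocol_version": "v1.3",
}

# Serializing to JSON makes the record archivable and shareable as-is.
serialized = json.dumps(record, indent=2)
print(serialized)
```

Because the record is plain structured data, it can be archived, diffed between runs, and searched later, which is exactly what the “lucky machine” mystery would have needed.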
In addition to the data, we also have to record exactly how experiments are performed, in detail. Everybody I talked to had stories about protocols that were part of their labs’ oral culture: you did things a certain way because that’s how you did it. It worked, no need to belabor the point. A recent article asks, in frustration, “Never mind the data, where are the protocols?” Having the data means little if you don’t have the methods by which the data was generated. The best way to record the protocols isn’t by scribbling in lab notebooks (or their virtual equivalents), but by implementing the experiment in a high-level programming language. The program is then a complete description of how the experiment was performed, including setup. The importance of this step isn’t that the experiment can be run on lab robots, though that is important in itself; programming forces you to describe the process precisely and completely, in a standardized language that is meaningful in different contexts, different labs. Thinking of an experiment as a program also allows the use of design tools that make it easier to think through the entire process, and to incorporate standard protocols (you might think of them as subassemblies or subroutines) from other experiments, both in your lab and in others.
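The protocol-as-program idea can be sketched in a few lines of Python. The `Protocol` class, the step names, and the parameter values below are all hypothetical, not any real lab language, but they show how a program makes every parameter explicit, including the 42°C-for-45-seconds heat shock from the transformation story:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    """One step of a protocol, with every parameter recorded explicitly."""
    action: str
    params: dict

@dataclass
class Protocol:
    name: str
    steps: list = field(default_factory=list)

    def add(self, action, **params):
        self.steps.append(Step(action, params))
        return self

def heat_shock_transformation(temp_c=42.0, shock_seconds=45):
    """A transformation protocol as a reusable subroutine: the exact
    temperature and duration become part of the experimental record."""
    p = Protocol("heat-shock transformation")
    p.add("thaw_cells", minutes=10, on="ice")
    p.add("add_dna", volume_ul=5)
    p.add("incubate", minutes=30, on="ice")
    p.add("heat_shock", temp_c=temp_c, seconds=shock_seconds)
    p.add("recover", temp_c=37.0, minutes=60, medium="SOC")
    return p

protocol = heat_shock_transformation()
for step in protocol.steps:
    print(step.action, step.params)
```

Because `heat_shock_transformation` is just a function, another lab can call it with different parameters, compose it into a larger experiment, or diff two versions of it, none of which is possible with an oral tradition.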
We’re still missing an important component. Science has always been about sharing, about the flow of ideas. For the first few centuries of scientific research, publishing meant paper journals, and those are (by nature) scarce commodities. You can’t publish a terabyte of data in a journal, nor can you publish a long, detailed, and extremely precise description of an experiment. You can’t publish the software you used to analyze the data. When you’re limited to paper, about all that makes sense is to publish a rough description of what you did, some graphs of the data, and the result. As our experiments and analyses get more complex, that’s no longer enough. In addition to collecting much more data and describing the experiments as detailed programs, we need ways to share the data, the experimental processes, and the tools to analyze that data. That sharing goes well beyond what traditional scientific journals provide, though some publications (notably F1000Research and GigaScience) are taking steps in this direction.
To understand what we need to share, we need to look at why we’re sharing. We’re not sharing just because sharing is a good in itself, and it’s what our nursery school teachers encouraged us to do. Sharing is central to the scientific enterprise. How can anyone reproduce a result if they don’t know what you’ve done? How can they check your data analysis without your data or your software? Even more importantly, how can they look at your experimental procedures and improve them? And without sharing, how can you incorporate protocols developed by other researchers? All science builds on other science. For 400 or so years, that building process was based on sharing results: the limitations of print journals and face-to-face meetings made it difficult to do more.
Fortunately, we’re no longer bound by paper-based publishing. Amazon Web Services makes it possible to build huge, publicly accessible data archives at relatively low cost. Many large datasets are becoming available: for example, the Protein Data Bank, and the raw data from the Large Hadron Collider. Figshare is a cloud-based service for managing and publishing scientific datasets. Sharing protocols and the software used for data analysis is a different problem, but it’s also been solved. GitHub is widely used by software engineers, and provides an excellent model for sharing software in ways that allow others to modify it and use it for their own work. When we have languages for describing experiments precisely (Antha is a very promising start), using tools like GitHub to share protocols will become natural.
Once the data and protocols are available, they have to be searchable. We tend to view scientific results as wholes, but any experiment is made up of many small steps. It’s possible that the most useful part of any experiment isn’t the “result” itself, but some intermediate step along the way. How did someone run a particular reaction, and how does that relate to my experiment? I might have no interest in the overall result, but information about some particular part of the experiment and its intermediate results might be all I need to solve an entirely different problem. That’s how we’re going to get beyond lab folklore: by looking into other labs and seeing how they’ve solved their problems. With many scientists running the same reaction in different contexts, collecting and publishing the data from all their intermediate steps, it should be possible to determine why a specific step failed or succeeded. It should be possible to investigate what’s different about my experiment: what changes (intentional or not) have I made to the protocol?
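Once per-step records are pooled and searchable, that kind of comparison becomes a simple query. A toy sketch, assuming a hypothetical shared corpus of step records contributed by many labs (the data and the `query` helper are invented for illustration):

```python
# A hypothetical shared corpus: one record per executed step, from many
# labs, with the step's parameters and whether the step succeeded.
runs = [
    {"step": "heat_shock", "temp_c": 42, "seconds": 45, "ok": True},
    {"step": "heat_shock", "temp_c": 37, "seconds": 90, "ok": True},
    {"step": "heat_shock", "temp_c": 55, "seconds": 45, "ok": False},
    {"step": "ligation", "temp_c": 16, "seconds": 3600, "ok": True},
]

def query(step, **filters):
    """Return every recorded run of a step matching the given filters."""
    return [r for r in runs
            if r["step"] == step
            and all(r.get(k) == v for k, v in filters.items())]

# Across labs, which heat-shock temperatures actually worked?
successes = query("heat_shock", ok=True)
temps = sorted(r["temp_c"] for r in successes)
print(temps)
```

Even this toy version answers the question from the opening anecdote: both 42°C and 37°C succeed, so the step tolerates a range, and the failures point at what actually matters.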
Titus Brown’s work on data-driven discovery proposes to do a lot of what I’m talking about, and more; it’s a glimpse at the future of science. Titus is building a distributed database for scientific data, where data can be stored, tagged, and searched, even prior to publication. Brown notes: “We need to incentivise pre-publication sharing by making it useful to share your data. We can do individual analyses now, but we’re missing the part that links these analyses to other data sets more broadly.” Helping scientists to analyze their own data is important, but the goals are much bigger: analysis across data sets, large-scale data mining, “permissionless innovation.” This is what the future of science will look like if we’re bold enough to look beyond centuries-old models.
The way we do science is changing. Experiments are getting much more complex; the data they produce is growing by orders of magnitude; and the methods that worked 50 or 60 years ago, when we barely knew what DNA was, are far from adequate now that we’re learning how to write the genetic code. Fortunately, we know how to get beyond mythology and folklore. Many of the tools we need exist, and the ones that don’t yet exist are being built. We’re creating a new generation of scientists who can collect all the data, share everything, and build the tools needed to facilitate that sharing. And when we’ve done that, we will have achieved a new scientific revolution — or, more precisely, fulfilled the promise of the first one.