
Everything you needed to know about human-created life forms but were afraid to ask

One of the great pleasures of being involved with O’Reilly Media is learning from the many fascinating people who get involved with the company on one level or another. They’re Friends of O’Reilly, or Foos. We have occasional get-togethers with Foos, our own Nat Torkington has taken the concept to New Zealand, and we have one — on the social graph (see coverage in the next Release 2.0) — coming up this very weekend. While the Foo events are quite off-the-record, the work Foos do is very much public. So we’d like to share with you some of what we’re learning. Over the next few days, we’re going to use this blog to introduce you to one Foo in particular, synthetic biology pioneer Drew Endy. This multi-part profile of Drew and his work is, appropriately, written by another Foo, Quinn Norton, who will be talking about body hacking at ETech in March.–Jimmy Guterman

HelloWorld.jpg

Dr. Drew Endy tends to fidget. He motions frantically when he’s trying to get something across. “It’s hard because we’ve never made it simple,” he explains with exasperation. Endy, a professor at MIT until the end of the school year (he’s headed to Stanford), engineers new life forms. He’s spent his life doing the hard work of bending the complexity of DNA to his will.

And he’s determined to make it simple for you.

Drew Endy is a leading star in a field that’s emerging as the biggest thing since Walter Brooke suggested to Dustin Hoffman he should think about plastics. He’s a synthetic biologist, one of a group of scientists and engineers who take microbes with familiar names like E. coli and yeast and make them do previously unimagined things.

Also: Dr. Endy explains what synthetic biology is. (mp3, 5.7MB)

Synthetic biology is next-generation biotech. Over the past 30 years, genetic engineering has laid the groundwork for what synthetic biology is and will be. All the important innovations of genetic engineering are put to work with newer techniques. What makes synthetic biology more than its predecessor is the ability to write DNA cheaply and easily. After designing a sequence, the genetic engineer can mail it to a vendor that will build the base pairs and overnight it back to them.

Sequencing, or reading out DNA, was once the purview of, at the very least, grad students. Now it can be accomplished by minimally trained, unskilled labor. Will DNA writing go the same way DNA reading has gone? Probably not to the same degree, but the price of synthesizing a base pair has dropped 16-fold in the last five years, according to Endy.
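
To put the 16-fold figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the decline was a steady exponential over the five years, which the article does not claim; the function name `cost_trend` is made up for illustration.

```python
import math

def cost_trend(fold_drop, years):
    """Return (annual_factor, halving_time_years), assuming a steady exponential decline."""
    # Factor by which the price shrinks each year.
    annual_factor = fold_drop ** (1.0 / years)
    # Years needed for the price to fall by half.
    halving_time = years * math.log(2) / math.log(fold_drop)
    return annual_factor, halving_time

# Figures quoted in the article: a 16-fold drop over five years.
annual_factor, halving_time = cost_trend(16, 5)
print(f"price shrinks by a factor of about {annual_factor:.2f} per year")
print(f"price halves about every {halving_time:.2f} years")
```

Under that assumption, the price per base pair would halve a little more often than once every year and a half.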

Synthetic biology doesn’t change the goals of biotech: medical applications, environmental remediation, biology-based manufacturing, etc. But it brings them closer, and adds more possibilities to the pile.

So what does a future of human-built biology look like? The obvious ideas are the ones researched now institutionally. It doesn’t take much imagination to see that a great mover in this field will be pharmaceuticals, and the medical concerns that drive the healthcare industry. We will likely see progress towards biological agents for pollution remediation, drug manufacture, and nanomaterials. Many of these are not only in the works, but on the verge of entering the market. After that, a little imagination goes a long way.

The holy grail right now is alternative fuel production. Petroleum’s supply and environmental problems might not dog an organism custom designed to get from the sun’s energy to a liquid we can stick in our vehicles. If geneticists can produce a viable replacement for petroleum, there’s a mint to be made even if the organism goes off patent in 20 years. J. Craig Venter’s institute and his company, Synthetic Genomics, are particularly geared to this goal. The institute receives its research money from the U.S. Department of Energy as well as Synthetic Genomics. It’s still a ways off. The institute has yet to complete its first fully synthetic organism of any kind, much less one that makes gas. With genes already modified to produce drugs, and the attention paid to fuel, a wide array of other products are candidates to grow instead of fabricate. Synthetic biology is very serious business.

Tremendous minds and piles of money are pouring into the potential organisms, and almost any one of them could easily pay back that investment if successful. Drew Endy, despite his in-demand talents, isn’t part of any of that.

What makes Drew Endy’s work unique in his field is what he wants to do with it, not the research itself. He wants to modularize DNA into something like a programming language. Then he wants to give it away.

Tomorrow: How can you make anyone a genetic engineer?

Wondering what that “Hello world” image is doing at the top of the post? It’s synthetic biology in action: Students at the University of Texas re-engineered E. coli to be photosensitive, like photographic paper. Their first message? The programmer’s traditional. It’s published here courtesy of Jeff Tabor and Randy Rettberg.

  • http://dasht-exp-1a.com Thomas Lord

    My comment here is more position statement than argument for a position.

    Drew’s analogies to programming languages are misleading, at best. Rather poor science abounds in genomics and synthetic biology. Oh, the chemistry and physics are fine and a few of the more complicated results are worked out nicely. But these elements are abused in the bulk of even the most reputable contemporary favorites in synthbio and genomics. This does not hold back the Industrialization, however, because the cooks are having alchemical successes. These are cooks, not philosophers. They can surf a “feels like a good idea” wave just fine — it just isn’t very good science.

    That said:

    The invisible hand is having a spasm. Fundamentally irrational trade, though optimal in some narrow perspective, is driving the exponential economic growth in these fields. That growth is why, of course, you can mail order custom DNA (or, if you are a real lab, just buy or build the machines that make it for yourself).

    I say that this market condition is a “spasm” because it is an investment bubble. Nearly none of all this spending has shown any signs at all of paying for itself. At the same time, nearly all of it has raised the tide of risk on *all* investments.

    I’ve met with and worked for two of the top labs in these fields. Mutually dissatisfying experiences in both cases, I think. Blame me, blame them, whatever you like. The point is, I got to poke around and talk to people, and ask what they’re working on, and why, and observe how they work, and so forth. My position:

    They have way too much money. They don’t know what they are doing. They are obsessed with private funding. They organize all of their operational details around entirely fictional assumptions about the potential for valuable “intellectual property”. They are in a bad way because their main competition right now is to raise more private money than the next guy and spend it faster; the science is falling by the wayside, and the business bets are, almost across the board, sucker bets.

    If I may use the dreaded “C” word, the capitalist class are idiots and dupes in this area.

    Someone said “synthbio is the ultimate nanotech” or words to that effect. That is exactly true — in ways we don’t quite understand (yet). There is hard, important, science to be done — it isn’t getting done because these labs have too much money.

    I propose a new focus. There is precisely one project in the synthbio and genomics fields that should be the focus of everyone engaged in the project: sustainable, environmentally conservative energy production. Set up some labs in various desolate deserts. Equip them with ubiquitous 24-hour web-cams. And work on energy production. Nothing else matters. It is a better investment for the investment class to pick their money up off the synthbio and genomics “IP” table and burn it than to keep going as they are. It is a better bet still for them to invest 50/50 in private energy delivery and public research into production.

    -t

  • http://openwetware.org/wiki/Endy_Lab Drew Endy

    Hi Thomas,

    You started your comment with, “Drew’s analogies to programming languages are misleading, at best.” If you could provide an example of what you mean here, I would be grateful and could also attempt to respond. Otherwise, I don’t know what to make of your comment.

    Later on, you wrote, “They have way too much money. They don’t know what they are doing. They are obsessed with private funding.” We’ve not met. My lab’s annual research budget is well under one million dollars per year, and the not-for-profit I’ve been boot-strapping for the last three years (www.biobricks.org) has an operating budget of under thirty thousand dollars per year. All our support comes from private donations or public research funds. We could smartly spend more money if we had it. I appreciate your observation that there is a decent amount of money being spent on bioenergy all of a sudden (whether or not this is actually something I would recognize as synthetic biology remains to be seen, but you have to start somewhere).

    So far as your closing suggestion, I would suspect that many of the individuals you are criticizing in the abstract would argue that this is what they are trying to do. But, it is hard for me to know, because your comments are not specific.

    Be great, Drew

  • http://searchengines.wordpress.com/ Search◆ Engines Web

    When one factors in the merciless trial-and-error process of evolution, which has taken billions of years to produce the balance of life at present, how does one respond to concerns about Murphy’s law and the potentially disastrous consequences of a few mistakes in judgment?

    This is obviously the science of the future; however, the concerns over the need for continuous checks and balances and complete transparency are not unmerited, considering what dangers could ultimately occur.

  • Alex Tolley

    Having worked with biologists inserting and deleting genes in yeast, or even trying to grow mammalian cells in reactors, it is obvious that these phenotypes are not very stable and quickly diverge.

    I’d be interested to know if synthetic biology could be made more reproducible using the minimal genome organism approach, rather than using off-the-shelf organisms like E. coli and yeast.

    A stable, living platform that acts as a more reliable operating system to host the genomic programs might offer a better path for the industrial developments being sought.

  • http://dasht-exp-1a.com Thomas Lord

    Hi Drew. Thank you for responding.

    You asked about my position that “Drew’s analogies to
    programming languages are misleading, at best.” I’ll answer in
    general terms and then with a (more or less) specific example.

    A programming language has two “parts” of note, usually called a
    “semantics” and a “syntax”. The semantics give a definite (not
    necessarily deterministic) mathematical model of a class of
    possible computations. Every program “means something” and what
    it means is given with precision by the semantics of the
    language. The syntax is, of course, a notation for humans. It
    is a way to express constructs in the semantic model.

    One thing that means is that if I write a program, and give
    it to a computer to run, then what the computer does next
    will either be right or wrong. It either faithfully executes
    the program and produces exactly a behavior described by the
    semantic model, or the computer has a bug and runs my program
    incorrectly — producing some behavior or output that is
    objectively wrong.

    I hope you will agree that gene expression, gene interaction,
    genome mutation, metabolism, and environmental feedback are
    all examples of areas where we have only a faint grasp on
    the meaning of genes. There are some details we know
    cold but mostly we don’t even know how to formulate the right
    questions yet. If there is analogy to the conventional
    history of science, perhaps it is to the earliest days of
    chemistry, where most practitioners were alchemists using
    fanciful theories to encode accidentally discovered
    pretty-repeatable recipes, while a few have moved on to discover
    conservation properties, have noticed some properties of
    oxidation, but have not quite yet worked out even valence.

    And that is why the programming language analogy is
    misleading, at best: there is no true semantic model there
    for a programming language to use.

    What that means is that when you make something that looks and
    smells like a synthbio programming language, it must, in
    fact, be something else. So what is it? Well, it is a
    true programming language in this sense: it is a way to share
    “programs” that describe lab procedures. Executing a
    hypothetical synthesis program literally means carrying out a
    series of steps like picking a host colony to modify, generating
    the plasmids, etc. But, that is economic programming, not gene
    programming. It is a standard for transactions that doesn’t
    itself give us any insight as to the meaning of the organisms
    thus created.

    A “programming language for X” where “X” is not a computation
    with well-modeled semantics is a tempting idea for many values
    of “X”. I think when you look at what people mean, though,
    they mostly mean that they want recipes and commodities, not
    models of what they’re actually doing. To a capitalist,
    a program is a program because you can hire programmer A to
    write one part, buy a different part from programmer B,
    and have programmer C combine them all with banal, mostly
    reliable reproducibility.

    It seems to me that, in contrast, synthetic biology wants
    to be wisely deployed industrially with an attitude of
    extreme skepticism. That is, that something like a
    synthbio “programming language” reveals a recipe for a
    desired machine is no good reason to trust the programming
    language or therefore set about constructing it. Rather,
    because of the risks and rewards, *each* industrial-scale
    deployment deserves skeptical and adversarial examination
    from as many different angles as we can think of. (Hence,
    my suggestion to put up some labs deep in some deserts,
    equipping them with ubiquitous public surveillance.)
    In financing terms, this extreme skepticism might be justifiable
    as an appropriate tactic for mitigating correspondingly
    large risks.

    On money: I found that I couldn’t spit without hitting some
    student or researcher who was working on a scheme –
    conversations often turned, for example, to who was able to get
    a meeting with which VCs. Students, in particular, often seemed
    more interested in making money on the margins of the science
    than by doing science (though my bias here is to have met mostly
    students with IT experience and interests). Nor could I do any
    work without spending more than 10% of the time discussing or
    worrying about researcher aspirations to develop “IP”. Nor,
    where I was involved in the process, did I see money being
    carefully spent with focused purpose (a particularly
    uncomfortable situation for a vendor to find himself in because
    with poor focus from the customer, there is no good definition
    of satisfying the customer). I perceived a wide-spread
    attitude that all of this was just “normal” for university-based
    research. It is not. This is recent. I have begun to think
    it is generational.

    There is cynicism and resignation on questions of risks.
    Nobody seems to be actually *measuring* the environmental
    impact of these labs. Controls over potentially hazardous
    materials are observably lax. A typical response from a
    student when queried about these kinds of things was to tell
    stories about how only a few years ago the protocols at his
    former school included discarding quite toxic materials
    down the direct drain to the nearby river — oops.

    Personally, I think the social interest may well lie in
    removing all of the governmentally granted IP protections
    and, at least domestically, mandating transparency and
    responding to what is discovered with regulation. Economic
    incentives should be on particular outcomes, most especially
    energy production.

    Finally, you wrote:

    So far as your closing suggestion, I would suspect that many of
    the individuals you are criticizing in the abstract would argue
    that this is what they are trying to do. But, it is hard for me
    to know, because your comments are not specific.

    You refer, I assume, to my suggestion
    that investors either take to burning the
    money they’d otherwise spend on synthbio IP or split 50/50
    into private power transmission and public energy production
    technology. (Perhaps I should have made it 33/33/33 adding in
    public/private conservation.)

    The labs are distracted from science by money. The money
    is in pursuit of IP promises that are at best poorly secured
    by treaty and, anyway, highly controversial: there is no reason
    to believe they will hold up. The form that industrialization
    is taking is driving too many researchers into the field.
    The form that the industrialization is taking is also
    exponentiating risk
    to large swaths of the biosphere. I see no sense in which
    this makes a good investment: it adds to the risk of every
    single other investment, significantly.

    Public collaboration on energy production is an obvious
    priority, a great promise of the field, as far as I can tell,
    a way to put the focus back on science, and a way to organize
    investment to these aims.

    I think you are right that this is what some of the field’s
    leaders would argue they are trying to do. EBI, for example,
    talks of a “crash program” basically — a new Manhattan or
    moon shot. I greatly respect that. They have the right
    sentiment and have created momentum. I’m saying: take it
    up a few notches.

    -t

  • steve

    Another great series of comments from Thomas Lord. Please, make him a front page poster.

  • http://openwetware.org/wiki/Endy_Lab Drew Endy

    Hi Thomas,

    You wrote, “One thing that means is that if I write a program, and give it to a computer to run, then what the computer does next will either be right or wrong. It either faithfully executes the program and produces exactly a behavior described by the semantic model, or the computer has a bug and runs my program incorrectly — producing some behavior or output that is objectively wrong.”

    This may be true of computers now but was not true when computers were first constructed. Importantly, the path by which we learned how to engineer computers and software that work reliably was to build such systems.

    Next, you wrote, “I hope you will agree that gene expression, gene interaction, genome mutation, metabolism, and environmental feedback are all examples of areas where we have only a faint grasp on the meaning of genes.”

    I strongly disagree with you here (see below).

    “There are some details we know cold but mostly we don’t even know how to formulate the right questions yet.”

    Again, I disagree. You’ve been talking to the wrong people, apparently.

    “If there is analogy to the conventional history of science, perhaps it is to the earliest days of chemistry, where most practitioners were alchemists using fanciful theories to encode accidentally discovered pretty-repeatable recipes, while a few have moved on to discover conservation properties, have noticed some properties of oxidation, but have not quite yet worked out even valence. And that is why the programming language analogy is misleading, at best: there is no true semantic model there for a programming language to use.”

    Again, I really do not agree with you here. There are large numbers (thousands) of genetic functions that execute reliably, and we understand these functions well enough to move them from one organism to the next, and do so every day with success. Fluorescent proteins, self assembling 50nm diameter gas impermeable protein shells, engineered RNA switches that implement Boolean logic, and so on. Moreover, there are conserved mechanics for DNA replication, mRNA transcription, protein translation, mRNA and protein degradation, et cetera.

    Yes, there are large gaps in our understanding of how natural biological systems work at the molecular scale (e.g., I would not claim that we have a complete physical model for how any natural cell fate selection system works); yes, there are also many engineering challenges to address, such as how to enable reliable functional composition across a collection of genetic functions collected from evolutionarily distant organisms. Such questions are well understood by the folks who are actually working to answer them.

    You are making sweeping and critical statements across several fields of science and engineering on the basis of limited and, as you described, unsuccessful interactions with two research groups. You also state that the research environments you were exposed to did not follow required laboratory safety procedures or training. If so, I would ask that you report any issues to the relevant institution’s Institutional Biosafety Committee.

    Be great,
    Drew

  • http://dasht-exp-1a.com Thomas Lord

    @drew — tomorrow. I’m tired and it’s late.

    -t

  • http://dasht-exp-1a.com Thomas Lord

    Hi Drew,

    With all due respect, your description of the
    history of computers and programming languages
    is wrong. You are correct that we’ve improved
    reliability over time by gaining experience.
    You are incorrect to deny that programming languages
    have always been based on specific semantic models
    and that executions of a program have always been
    objectively right or wrong according to whether or
    not the execution faithfully fits the semantic
    model.

    The analogy to synthetic biology remains poor.
    We did not discover the meaning of programming
    languages by trying out programs and seeing
    what they happened to do. We invented
    programming languages as notation for meanings
    we constructed and built.

    The concept I think you are missing is that of
    a design language. If you
    argue that synthetic biology is on the brink of,
    or is, or will one day, or ought to develop
    a design language you’ll be making a much
    more sensible statement.

    Design languages can be found in fields such as
    architecture and analog circuit design. Their
    application to analog circuits will be particularly
    fruitful here. See below.

    Like programming languages, design languages give
    us notations for concepts like abstraction,
    modularity, and composition. One can use a
    design language, in order to arrive at a design,
    by re-using work earlier done in the design
    language.

    Unlike programming languages, design languages
    only formally ever denote plans to build something,
    alongside records of facts and conjectures about
    the final plans as well as the parts that go into
    it.

    If I write a correct program to sort a list
    of numbers, any computer that does not run the program
    correctly has a bug. The meaning of the program
    is a mathematical certainty. It is a sorting algorithm
    even if the computer is malfunctioning.
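
A minimal sketch of the sorting point above, not anything posted in the thread: the meaning of "sort" is written as a checkable specification, and an execution is judged right or wrong against it regardless of what the machine actually did. All names here (`meets_sort_spec`, `insertion_sort`, `faulty_machine`) are hypothetical and chosen only for illustration.

```python
from collections import Counter

def meets_sort_spec(inp, out):
    """The specification: `out` is `inp` rearranged into non-decreasing order."""
    is_ordered = all(a <= b for a, b in zip(out, out[1:]))
    is_permutation = Counter(inp) == Counter(out)
    return is_ordered and is_permutation

def insertion_sort(items):
    """The program: a plain insertion sort that returns a new list."""
    result = []
    for x in items:
        i = len(result)
        # Walk left past every element larger than x.
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result

def faulty_machine(program, items):
    """A made-up broken executor: runs the program, then swaps the first two outputs."""
    out = program(items)
    if len(out) >= 2:
        out[0], out[1] = out[1], out[0]
    return out

data = [3, 1, 2]
good = insertion_sort(data)                  # faithful execution
bad = faulty_machine(insertion_sort, data)   # same program, buggy machine

print(good, meets_sort_spec(data, good))
print(bad, meets_sort_spec(data, bad))
```

The program text is identical in both runs; only the executor differs, and the specification alone is enough to say which run is wrong.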

    If I write a correct design for an analog
    circuit, other than for exceptionally simple circuits,
    that is just the starting point because we still
    don’t know what the circuit will actually do when
    correctly assembled in a defect-free way. In many
    cases we might be able to use calculation to predict
    that a designed circuit certainly does not have
    the meaning we thought — the meaning suggested by
    the design language. In many cases the only tractable
    way to find out what the design actually means
    is to build the circuit and measure its behavior.

    The reason that we have design languages, not programming
    languages, for analog circuits is because we know for
    sure that not only don’t we have a true semantic model
    for circuits, we know we almost certainly never will –
    not one that is useful for calculation, anyway. Analog
    circuit design of other than the most routine circuits
    will always be empirical. It can not be “programmed”
    in the same way that we can construct algorithms.

    Cells are like sufficiently interesting analog circuits this
    way: both are complex dynamic systems. Cells, of course, are
    far more difficult to contain and have the additional difference
    of a pesky habit of replicating themselves with mutations and
    variations. The computational intractability of cellular
    behavior should tell you that, as in analog circuits, a
    programming language is impossible. The pesky habits of cells
    should remind you that the stakes of casually glossing over
    these issues — the stakes raised in forgetting that the outcome
    of every experiment in synthesis is an empirical question –
    are very high stakes indeed.

    Once again: a program in a programming language tells you
    what the course of a correct execution of that program is.
    The meaning of a program tells you how a correctly behaving
    machine will function. A design, in a design language,
    suggests a construction and guesses (hopefully non-randomly) on
    what the construction will do. A design language is used to
    suggest experiments. A programming language is used to
    define a mathematical class.

    To the biology. I leveled the charge that genomics and
    synthetic biology are analogous to the very earliest days
    of modern chemistry (perhaps we have “corpuscles” but
    no theory of “valence” yet, so to speak). You reply:

    Again, I really do not agree with you here. There are large
    numbers (thousands) of genetic functions that execute reliably,
    and we understand these functions well enough to move them from
    one organism to the next, and do so every day with
    success. Fluorescent proteins, self assembling 50nm diameter gas
    impermeable protein shells, engineered RNA switches that
    implement Boolean logic, and so on. Moreover, there are
    conserved mechanics for DNA replication, mRNA transcription,
    protein translation, mRNA and protein degradation, et cetera.

    I think you make my point. Your use of the word “reliably”
    and your reference to “conserved mechanics” are particularly
    telling for the late-alchemy / early-chemistry analogy:

    What you describe as “reliable” is a set of recipes.
    What you neglect to mention is the large number of failed
    attempts to move these functions between organisms. It’s
    a “try it and see because hopefully protein X will be expressed
    and that will set off pathway Z” set of heuristics, not a semantic
    model.

    “Conserved mechanics” is an interesting phrase to bring
    to your defense. Not one I would have reached for. For the
    most part, this refers to molecular mechanics that are
    conserved across many (or even all) species. A typical usage
    might be “a few mechanisms for inter-cellular signaling are
    conserved in all species”. It is an empirical observation
    mostly about the particular species in the actual bio-sphere:
    an environment that historically has been subject to only
    quite narrow, constrained forms of perturbation to the
    set of all extant genomes.

    The problem is: nothing at all suggests that these
    conservation properties extend to the same system after
    energetic and extremely novel perturbations of the
    genome population.

    One thing that we do know is that cellular
    mechanics and metabolism are emergent properties
    of complex dynamic systems created out of a complex of
    feedback processes of which we have a very incomplete
    understanding (never mind any chance, anytime soon,
    of modeling well enough to predict its behavior under
    unusual perturbations).

    Life on earth is an “attractor” in a chaotic system. We’ve
    evolved the combination of a biosphere and set of genomes
    which, yes, does “conserve” certain behaviors. And because this
    is an attractor we know for certain that there exist
    perturbations which break the conservation — as any synthbio
    student staring at the dead culture in his test tube can tell
    you.

    There is no particular reason to believe that the “conservation”
    – or the “reliable” expression of any given gene — is in any
    way encoded, anywhere. There is no reason to believe the
    stability usefully survives energetic perturbation at the
    level of the genome population. There is
    excellent reason to believe the opposite: the
    origin of life on earth and the course of evolution.

    You write:

    You are making sweeping and critical statements across several
    fields of science and engineering on the basis of limited and,
    as you described, unsuccessful interactions with two research
    groups.

    You are mistaken. I am making sweeping and critical statements
    across several professions on the basis of those interactions,
    yes, but also on the basis of surveying various literature,
    knowing quite a bit about engineering, science, and complex dynamic
    systems from years of experience in computing, and now from
    seeing the arguments you bring to bear on these questions.
    I’m also informed by multiple histories of science and
    scientific thought, by anecdotes and experience in
    multi-disciplinary environments, by making my case on some of
    these matters to scientists in other fields, and, frankly, by
    common sense.

    You also misquote me here:

    You also state that the research environments you were exposed
    to did not follow required laboratory safety procedures or
    training. If so, I would ask that you report any issues to the
    relevant institution’s Institutional Biosafety Committee.

    You are badly mistaken. I claim no such thing. I am confident
    that both labs follow required procedures. I am confident that
    the anecdotal lab flushing toxics into the Charles River was
    following required procedures. What is at issue is whether
    or not those requirements are wisely designed.

    As you are hopefully aware, Institutional Biosafety Committees
    are utterly unprepared to take up the question of designing
    containment requirements.

    Containment requirements are set by the NIH and CDC,
    primarily on the basis of a consensus among a small number
    of biology researchers. Now, how would you suggest I
    set about influencing that process? I suggest
    it starts, in part, with discussions like this one.

    What part of “existential threat,” “green goo,”
    “energetic perturbation”, “complex dynamic system”,
    and “attractor” don’t you get?

    -t

  • https://www.vbi.vt.edu/faculty/personal_pages/jean_peccoud Jean Peccoud

    Thomas, Drew:

    With respect to a design language for synthetic biology, we have illustrated the concept in a recent Bioinformatics paper: http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2760

    We are actually in the process of creating various languages for various biological applications and organisms. This approach has been implemented in software. You can go to http://www.genocad.org to see how the notion of design language or syntactic model can be used to guide engineers in their design projects. This approach could also be used to validate constructs that have been designed in a different environment. For instance, the Registry could build a syntactic model of design standards and verify that the design of new constructs submitted to the Registry is consistent with the standard.

    Regarding the semantic aspect, it is true that we don’t have it yet, but several groups are working on it, just as we are. The difference between analog and digital circuits should be taken into consideration in the way we design these semantic models. It may be that synbio will have to find a way to go digital. There would be clear benefits to this path, but it would require methods to design molecules with user-defined activities that we just don’t have right now.