Wed, Jan 30, 2008

Everything you needed to know about human-created life forms but were afraid to ask

One of the great pleasures of being involved with O'Reilly Media is learning from the many fascinating people who get involved with the company on one level or another. They're Friends of O'Reilly, or Foos. We have occasional get-togethers with Foos, our own Nat Torkington has taken the concept to New Zealand, and we have one -- on the social graph (see coverage in the next Release 2.0) -- coming up this very weekend. While the Foo events are quite off-the-record, the work Foos do is very much public. So we'd like to share with you some of what we're learning. Over the next few days, we're going to use this blog to introduce you to one Foo in particular, synthetic biology pioneer Drew Endy. This multi-part profile of Drew and his work is, appropriately, written by another Foo, Quinn Norton, who will be talking about body hacking at ETech in March.--Jimmy Guterman


HelloWorld.jpg

Dr. Drew Endy tends to fidget. He motions frantically when he's trying to get something across. "It's hard because we've never made it simple," he explains with exasperation. Endy, a professor at MIT until the end of the school year (he's headed to Stanford), engineers new life forms. He's spent his life doing the hard work of bending the complexity of DNA to his will.

And he's determined to make it simple for you.

Drew Endy is a leading star in a field that's emerging as the biggest thing since Walter Brooke suggested to Dustin Hoffman he should think about plastics. He's a synthetic biologist, one of a group of scientists and engineers who take microbes with familiar names like E. coli and yeast and make them do previously unimagined things.

Also: Dr. Endy explains what synthetic biology is. (mp3, 5.7 MB)


Synthetic biology is next-generation biotech. Over the past 30 years, genetic engineering has laid the groundwork for what synthetic biology is and will be. All the important innovations of genetic engineering are put to work with newer techniques. What makes synthetic biology more than its predecessor is the ability to write DNA cheaply and easily. After designing a sequence, the genetic engineer can mail it to a vendor that will build the base pairs and overnight it back to them.

Sequencing, or reading out DNA, was once the purview of, at the very least, grad students. Now it can be accomplished by minimally trained labor. Will DNA writing go the same way DNA reading has gone? Probably not as much, but the price of synthesizing a base pair has dropped 16-fold in the last five years, according to Endy.
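As a rough back-of-the-envelope check on that figure, the sketch below converts a 16-fold drop over five years into a yearly rate. It assumes a constant percentage decline each year, which the post does not claim, and the variable names are purely illustrative.

```python
import math

fold_drop = 16   # overall price drop quoted in the post, attributed to Endy
years = 5        # period quoted in the post

# Assumption (not from the post): the price fell by the same factor every year.
annual_factor = fold_drop ** (1 / years)                  # price divides by this each year
halving_time = years * math.log(2) / math.log(fold_drop)  # years for the price to halve

print(f"price divides by about {annual_factor:.2f} each year")
print(f"price halves about every {halving_time:.2f} years")
```

Under that assumption, the cost per synthesized base pair would have halved roughly every 15 months.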

Synthetic biology doesn't change the goals of biotech: medical applications, environmental remediation, biology based manufacture, etc. But it brings them closer, and adds more possibilities to the pile.

So what does a future of human-built biology look like? The obvious ideas are the ones researched now institutionally. It doesn't take much imagination to see that a great mover in this field will be pharmaceuticals, and the medical concerns that drive the healthcare industry. We will likely see progress towards biological agents for pollution remediation, drug manufacture, and nanomaterials. Many of these are not only in the works, but on the verge of entering the market. After that, a little imagination goes a long way.

The holy grail right now is alternative fuel production. Petroleum's supply and environmental problems might not dog an organism custom designed to get from the sun's energy to a liquid we can stick in our vehicles. If geneticists can produce a viable replacement for petroleum, there's a mint to be made even if the organism goes off patent in 20 years. J. Craig Venter's institute and his company, Synthetic Genomics, are particularly geared to this goal. The institute receives its research money from the U.S. Department of Energy as well as Synthetic Genomics. It's still a ways off. The institute has yet to complete its first fully synthetic organism of any kind, much less one that makes gas. With genes already modified to produce drugs, and the attention paid to fuel, a wide array of other products are candidates to grow instead of fabricate. Synthetic biology is very serious business.

Tremendous minds and piles of money are pouring into the potential organisms, and almost any one of them could easily pay back that investment if successful. Drew Endy, despite his in-demand talents, isn't part of any of that.

What makes Drew Endy's work unique in his field is what he wants to do with it, not the research itself. He wants to modularize DNA into something like a programming language. Then he wants to give it away.

Tomorrow: How can you make anyone a genetic engineer?

Wondering what that "Hello world" image is doing at the top of the post? It's synthetic biology in action: Students at the University of Texas re-engineered E. coli to be photosensitive, like photographic paper. Their first message? The programmer's traditional greeting. It's published here courtesy of Jeff Tabor and Randy Rettberg.



Comments: 10

Thomas Lord [01.31.08 12:32 AM]

My comment here is more position statement than argument for a position.

Drew's analogies to programming languages are misleading, at best. Rather poor science abounds in genomics and synthetic biology. Oh, the chemistry and physics are fine and a few of the more complicated results are worked out nicely. But these elements are abused in the bulk of even the most reputable contemporary favorites in synthbio and genomics. This does not hold back the Industrialization, however, because the cooks are having alchemical successes. These are cooks, not philosophers. They can surf a "feels like a good idea" wave just fine -- it just isn't very good science.

That said:

The invisible hand is having a spasm. Fundamentally irrational trade, though optimal in some narrow perspective, is driving the exponential economic growth in these fields. That growth is why, of course, you can mail order custom DNA (or, if you are a real lab, just buy or build the machines that make it for yourself).

I say that this market condition is a "spasm" because it is an investment bubble. Nearly none of all this spending has shown any signs at all of paying for itself. At the same time, nearly all of it has raised the tide of risk on *all* investments.

I've met with and worked for two of the top labs in these fields. Mutually dissatisfying experiences in both cases, I think. Blame me, blame them, whatever you like. The point is, I got to poke around and talk to people, and ask what they're working on, and why, and observe how they work, and so forth. My position:

They have way too much money. They don't know what they are doing. They are obsessed with private funding. They organize all of their operational details around entirely fictional assumptions about the potential for valuable "intellectual property". They are in a bad way because their main competition right now is to raise more private money than the next guy and spend it faster. The science is falling by the wayside, and the business bets are, almost across the board, sucker bets.

If I may use the dreaded "C" word, the capitalist class are idiots and dupes in this area.

Someone said "synthbio is the ultimate nanotech" or words to that effect. That is exactly true -- in ways we don't quite understand (yet). There is hard, important, science to be done -- it isn't getting done because these labs have too much money.

I propose a new focus. There is precisely one project in the synthbio and genomics fields that should be the focus of everyone engaged in the project: sustainable, environmentally conservative energy production. Set up some labs in various desolate deserts. Equip them with ubiquitous 24-hour web-cams. And work on energy production. Nothing else matters. It is a better investment for the investment class to pick their money up off the synthbio and genomics "IP" table and burn it than to keep going as they are. It is a better bet still for them to invest 50/50 in private energy delivery and public research into production.

-t

Drew Endy [01.31.08 05:48 AM]

Hi Thomas,


You started your comment with, "Drew's analogies to programming languages are misleading, at best." If you could provide an example of what you mean here, I would be grateful and could also attempt to respond. Otherwise, I don't know what to make of your comment.

Later on, you wrote, "They have way too much money. They don't know what they are doing. They are obsessed with private funding." We've not met. My lab's annual research budget is well under one million dollars per year, and the not-for-profit I've been boot-strapping for the last three years (www.biobricks.org) has an operating budget of under thirty thousand dollars per year. All our support comes from private donations or public research funds. We could smartly spend more money if we had it. I appreciate your observation that there is a decent amount of money being spent on bioenergy all of a sudden (whether or not this is actually something I would recognize as synthetic biology remains to be seen, but you have to start somewhere).

So far as your closing suggestion, I would suspect that many of the individuals you are criticizing in the abstract would argue that this is what they are trying to do. But, it is hard for me to know, because your comments are not specific.

Be great, Drew

Search◆ Engines Web [01.31.08 06:20 AM]

When one factors in the merciless trial-and-error process of evolution, which has taken billions of years to produce the balance of life at present, how does one respond to potential concerns about Murphy's law and the potentially disastrous consequences of a few mistakes in judgment?

This is obviously the science of the future. However, the concerns over the need for continuous checks and balances and complete transparency are not unmerited, considering what potential dangers could ultimately occur.

Alex Tolley [01.31.08 08:39 AM]

Having worked with biologists inserting and deleting genes in yeast, or even trying to grow mammalian cells in reactors, it is obvious that these phenotypes are not very stable and quickly diverge.

I'd be interested to know if synthetic biology could be made more reproducible using the minimal genome organism approach, rather than using off-the-shelf organisms like E. coli and yeast.

A stable, living platform that acts as a more reliable operating system to host the genomic programs might offer a better path for the industrial developments being sought.

Thomas Lord [01.31.08 02:03 PM]

Hi Drew. Thank you for responding.

You asked about my position that "Drew's analogies to programming languages are misleading, at best." I'll answer in general terms and then with a (more or less) specific example.

A programming language has two "parts" of note, usually called a "semantics" and a "syntax". The semantics give a definite (not necessarily deterministic) mathematical model of a class of possible computations. Every program "means something" and what it means is given with precision by the semantics of the language. The syntax is, of course, a notation for humans. It is a way to express constructs in the semantic model.

One thing that means is that if I write a program, and give it to a computer to run, then what the computer does next will either be right or wrong. It either faithfully executes the program and produces exactly a behavior described by the semantic model, or the computer has a bug and runs my program incorrectly -- producing some behavior or output that is objectively wrong.

I hope you will agree that gene expression, gene interaction, genome mutation, metabolism, and environmental feedback are all examples of areas where we have only a faint grasp on the meaning of genes. There are some details we know cold but mostly we don't even know how to formulate the right questions yet. If there is analogy to the conventional history of science, perhaps it is to the earliest days of chemistry, where most practitioners were alchemists using fanciful theories to encode accidentally discovered pretty-repeatable recipes, while a few have moved on to discover conservation properties, have noticed some properties of oxidation, but have not quite yet worked out even valence.

And that is why the programming language analogy is misleading, at best: there is no true semantic model there for a programming language to use.

What that means is that when you make something that looks and smells like a synthbio programming language, it must, in fact, be something else. So what is it? Well, it is a true programming language in this sense: it is a way to share "programs" that describe lab procedures. Executing a hypothetical synthesis program literally means carrying out a series of steps like picking a host colony to modify, generating the plasmids, etc. But, that is economic programming, not gene programming. It is a standard for transactions that doesn't itself give us any insight as to the meaning of the organisms thus created.

A "programming language for X" where "X" is not a computation with well-modeled semantics is a tempting idea for many values of "X". I think when you look at what people mean, though, they mostly mean that they want recipes and commodities, not models of what they're actually doing. To a capitalist, a program is a program because you can hire programmer A to write one part, buy a different part from programmer B, and have programmer C combine them all with banal, mostly reliable reproducibility.

It seems to me that, in contrast, synthetic biology wants to be wisely deployed industrially with an attitude of extreme skepticism. That is, that something like a synthbio "programming language" reveals a recipe for a desired machine is no good reason to trust the programming language or therefore set about constructing it. Rather, because of the risks and rewards, *each* industrial-scale deployment deserves skeptical and adversarial examination from as many different angles as we can think of. (Hence, my suggestion to put up some labs deep in some deserts, equipping them with ubiquitous public surveillance.) In financing terms, this extreme skepticism might be justifiable as an appropriate tactic for mitigating correspondingly large risks.

On money: I found that I couldn't spit without hitting some student or researcher who was working on a scheme -- conversations often turned, for example, to who was able to get a meeting with which VCs. Students, in particular, often seemed more interested in making money on the margins of the science than by doing science (though my bias here is to have met mostly students with IT experience and interests). Nor could I do any work without spending more than 10% of the time discussing or worrying about researcher aspirations to develop "IP". Nor, where I was involved in the process, did I see money being carefully spent with focused purpose (a particularly uncomfortable situation for a vendor to find himself in because with poor focus from the customer, there is no good definition of satisfying the customer). I perceived a wide-spread attitude that all of this was just "normal" for university-based research. It is not. This is recent. I have begun to think it is generational.

There is cynicism and resignation on questions of risks. Nobody seems to be actually *measuring* the environmental impact of these labs. Controls over potentially hazardous materials are observably lax. A typical response from a student when queried about these kinds of things was to tell stories about how only a few years ago the protocols at his former school included discarding quite toxic materials down the direct drain to the nearby river -- oops.

Personally, I think the social interest may well lie in removing all of the governmentally granted IP protections and, at least domestically, mandating transparency and responding to what is discovered with regulation. Economic incentives should be on particular outcomes, most especially energy production.

Finally, you wrote:

"So far as your closing suggestion, I would suspect that many of the individuals you are criticizing in the abstract would argue that this is what they are trying to do. But, it is hard for me to know, because your comments are not specific."

You refer, I assume, to my suggestion that investors either take to burning the money they'd otherwise spend on synthbio IP or split 50/50 into private power transmission and public energy production technology. (Perhaps I should have made it 33/33/33 adding in public/private conservation.)

The labs are distracted from science by money. The money is in pursuit of IP promises that are at best poorly secured by treaty and, anyway, highly controversial: there is no reason to believe they will hold up. The form that industrialization is taking is driving too many researchers into the field. The form that the industrialization is taking is also exponentiating risk to large swaths of the biosphere. I see no sense in which this makes a good investment: it adds to the risk of every single other investment, significantly.

Public collaboration on energy production is an obvious priority, a great promise of the field, as far as I can tell, a way to put the focus back on science, and a way to organize investment to these aims.

I think you are right that this is what some of the field's leaders would argue they are trying to do. EBI, for example, talks of a "crash program" basically -- a new Manhattan or moon shot. I greatly respect that. They have the right sentiment and have created momentum. I'm saying: take it up a few notches.

-t

steve [01.31.08 06:41 PM]

Another great series of comments from Thomas Lord. Please, make him a front page poster.

Drew Endy [01.31.08 07:21 PM]

Hi Thomas,



You wrote, "One thing that means is that if I write a program, and give it to a computer to run, then what the computer does next will either be right or wrong. It either faithfully executes the program and produces exactly a behavior described by the semantic model, or the computer has a bug and runs my program incorrectly -- producing some behavior or output that is objectively wrong."



This may be true of computers now but was not true when computers were first constructed. Importantly, the path by which we learned how to engineer computers and software that work reliably was to build such systems.



Next, you wrote, "I hope you will agree that gene expression, gene interaction, genome mutation, metabolism, and environmental feedback are all examples of areas where we have only a faint grasp on the meaning of genes."



I strongly disagree with you here (see below).



"There are some details we know cold but mostly we don't even know how to formulate the right questions yet."



Again, I disagree. You've been talking to the wrong people, apparently.



"If there is analogy to the conventional history of science, perhaps it is to the earliest days of chemistry, where most practitioners were alchemists using fanciful theories to encode accidentally discovered pretty-repeatable recipes, while a few have moved on to discover conservation properties, have noticed some properties of oxidation, but have not quite yet worked out even valence. And that is why the programming language analogy is misleading, at best: there is no true semantic model there for a programming language to use."



Again, I really do not agree with you here. There are large numbers (thousands) of genetic functions that execute reliably, and we understand these functions well enough to move them from one organism to the next, and do so every day with success. Fluorescent proteins, self assembling 50nm diameter gas impermeable protein shells, engineered RNA switches that implement Boolean logic, and so on. Moreover, there are conserved mechanics for DNA replication, mRNA transcription, protein translation, mRNA and protein degradation, et cetera.



Yes, there are large gaps in our understanding of how natural biological systems work at the molecular scale (e.g., I would not claim that we have a complete physical model for how any natural cell fate selection system works); yes, there are also many engineering challenges to address, such as how to enable reliable functional composition across a collection of genetic functions collected from evolutionarily distant organisms. Such questions are well understood by the folks who are actually working to answer them.



You are making sweeping and critical statements across several fields of science and engineering on the basis of limited and, as you described, unsuccessful interactions with two research groups. You also state that the research environments you were exposed to did not follow required laboratory safety procedures or training. If so, I would ask that you report any issues to the relevant institution's Institutional Biosafety Committee.



Be great,
Drew

Thomas Lord [01.31.08 08:50 PM]

@drew -- tomorrow. I'm tired and it's late.

-t

Thomas Lord [02.01.08 02:52 PM]

Hi Drew,

With all due respect, your description of the history of computers and programming languages is wrong. You are correct that we've improved reliability over time by gaining experience. You are incorrect to deny that programming languages have always been based on specific semantic models and that executions of a program have always been objectively right or wrong according to whether or not the execution faithfully fits the semantic model.

The analogy to synthetic biology remains poor. We did not discover the meaning of programming languages by trying out programs and seeing what they happened to do. We invented programming languages as notation for meanings we constructed and built.

The concept I think you are missing is that of a design language. If you argue that synthetic biology is on the brink of, or is, or will one day, or ought to develop a design language you'll be making a much more sensible statement.

Design languages can be found in fields such as architecture and analog circuit design. Their application to analog circuits will be particularly fruitful here. See below.

Like programming languages, design languages give us notations for concepts like abstraction, modularity, and composition. One can use a design language, in order to arrive at a design, by re-using work earlier done in the design language.

Unlike programming languages, design languages only formally ever denote plans to build something, alongside records of facts and conjectures about the final plans as well as the parts that go into it.

If I write a correct program to sort a list of numbers, any computer that does not run the program correctly has a bug. The meaning of the program is a mathematical certainty. It is a sorting algorithm even if the computer is malfunctioning.
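To make the sorting example concrete, here is a minimal sketch in Python. The function names are hypothetical, and insertion sort is just one of many algorithms that would do. The point is that the specification can be written down independently of any machine, and any output can be checked against it.

```python
from collections import Counter

def is_sorted_permutation(original, result):
    """Specification of sorting: result is in order and holds exactly the same items."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    same_items = Counter(original) == Counter(result)
    return in_order and same_items

def insertion_sort(values):
    """One correct sorting program; returns a new sorted list."""
    out = []
    for v in values:
        i = len(out)
        # Walk left past every element larger than v, then insert v there.
        while i > 0 and out[i - 1] > v:
            i -= 1
        out.insert(i, v)
    return out

data = [3, 1, 2, 1]
good_output = insertion_sort(data)
bad_output = [1, 3, 2, 1]   # what a malfunctioning machine might emit

print(is_sorted_permutation(data, good_output))  # meets the specification
print(is_sorted_permutation(data, bad_output))   # objectively fails it
```

The check never consults the hardware: the meaning of "sorted" is fixed by the specification, so a faulty run is wrong by definition rather than by observation.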

If I write a correct design for an analog circuit, other than for exceptionally simple circuits, that is just the starting point because we still don't know what the circuit will actually do when correctly assembled in a defect-free way. In many cases we might be able to use calculation to predict that a designed circuit certainly does not have the meaning we thought -- the meaning suggested by the design language. In many cases the only tractable way to find out what the design actually means is to build the circuit and measure its behavior.

The reason that we have design languages, not programming languages, for analog circuits is because we know for sure that not only don't we have a true semantic model for circuits, we know we almost certainly never will -- not one that is useful for calculation, anyway. Analog circuit design of other than the most routine circuits will always be empirical. It can not be "programmed" in the same way that we can construct algorithms.

Cells are like sufficiently interesting analog circuits this way: both are complex dynamic systems. Cells, of course, are far more difficult to contain and have the additional difference of a pesky habit of replicating themselves with mutations and variations. The computational intractability of cellular behavior should tell you that, as in analog circuits, a programming language is impossible. The pesky habits of cells should remind you that the stakes of casually glossing over these issues -- the stakes raised in forgetting that the outcome of every experiment in synthesis is an empirical question -- are very high stakes indeed.

Once again: a program in a programming language tells you what the course of a correct execution of that program is. The meaning of a program tells you how a correctly behaving machine will function. A design, in a design language, suggests a construction and guesses (hopefully non-randomly) on what the construction will do. A design language is used to suggest experiments. A programming language is used to define a mathematical class.

To the biology. I leveled the charge that genomics and synthetic biology are analogous to the very earliest days of modern chemistry (perhaps we have "corpuscles" but no theory of "valence" yet, so to speak). You reply:

"Again, I really do not agree with you here. There are large numbers (thousands) of genetic functions that execute reliably, and we understand these functions well enough to move them from one organism to the next, and do so every day with success. Fluorescent proteins, self assembling 50nm diameter gas impermeable protein shells, engineered RNA switches that implement Boolean logic, and so on. Moreover, there are conserved mechanics for DNA replication, mRNA transcription, protein translation, mRNA and protein degradation, et cetera."

I think you make my point. Your use of the word "reliably" and your reference to "conserved mechanics" are particularly telling for the late-alchemy / early-chemistry analogy:

What you describe as "reliable" is a set of recipes. What you neglect to mention is the large number of failed attempts to move these functions between organisms. It's a "try it and see because hopefully protein X will be expressed and that will set off pathway Z" set of heuristics, not a semantic model.

"Conserved mechanics" is an interesting phrase to bring to your defense. Not one I would have reached for. For the most part, this refers to molecular mechanics that are conserved across many (or even all) species. A typical usage might be "a few mechanisms for inter-cellular signaling are conserved in all species". It is an empirical observation mostly about the particular species in the actual bio-sphere: an environment that historically has been subject to only quite narrow, constrained forms of perturbation to the set of all extant genomes.

The problem is: nothing at all suggests that these conservation properties extend to the same system after energetic and extremely novel perturbations of the genome population.

One thing that we do know is that cellular mechanics and metabolism are emergent properties of complex dynamic systems created out of a complex of feedback processes of which we have a very incomplete understanding (never mind any chance, anytime soon, of modeling well enough to predict its behavior under unusual perturbations).

Life on earth is an "attractor" in a chaotic system. We've evolved the combination of a biosphere and set of genomes which, yes, does "conserve" certain behaviors. And because this is an attractor we know for certain that there exist perturbations which break the conservation -- as any synthbio student staring at the dead culture in his test tube can tell you.

There is no particular reason to believe that the "conservation" -- or the "reliable" expression of any given gene -- is in any way encoded, anywhere. There is no reason to believe the stability usefully survives energetic perturbation at the level of the genome population. There is excellent reason to believe the opposite: the origin of life on earth and the course of evolution.

You write:

"You are making sweeping and critical statements across several fields of science and engineering on the basis of limited and, as you described, unsuccessful interactions with two research groups."

You are mistaken. I am making sweeping and critical statements across several professions on the basis of those interactions, yes, but also on the basis of surveying various literature, knowing quite a bit of engineering, science, and complex dynamic systems from years of experience in computing, and now from seeing the arguments you bring to bear on these questions. I'm also informed by multiple histories of science and scientific thought, by anecdotes and experience in multi-disciplinary environments, by making my case on some of these matters to scientists in other fields, and, frankly, by common sense.

You also misquote me here:

"You also state that the research environments you were exposed to did not follow required laboratory safety procedures or training. If so, I would ask that you report any issues to the relevant institution's Institutional Biosafety Committee."

You are badly mistaken. I claim no such thing. I am confident that both labs follow required procedures. I am confident that the anecdotal lab flushing toxics into the Charles River was following required procedures. What is at issue is whether or not those requirements are wisely designed.

As you are hopefully aware, Institutional Biosafety Committees are utterly unprepared to take up the question of designing containment requirements.

Containment requirements are set by the NIH and CDC, primarily on the basis of a consensus among a small number of biology researchers. Now, how would you suggest I set about influencing that process? I suggest it starts, in part, with discussions like this one.

What part of "existential threat," "green goo," "energetic perturbation", "complex dynamic system", and "attractor" don't you get?

-t

Jean Peccoud [02.04.08 02:26 PM]

Thomas, Drew:

with respect to a design language for Synthetic Biology, we have illustrated the concept in a recent bioinformatics paper: http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2760

We are actually in the process of creating various languages for various biological applications and organisms. This approach has been implemented in software. You can go to www.genocad.org to see how the notion of design language or syntactic model can be used to guide engineers in their design projects. This approach could also be used to validate constructs that have been designed in a different environment. For instance, the Registry could build a syntactic model of design standards and verify that the design of new constructs submitted to the Registry is consistent with the standard.
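As a loose illustration of what checking a construct against a syntactic model could look like, here is a minimal Python sketch. The part categories, the one-letter codes, and the single rule (a construct is one or more promoter, RBS, coding sequence, terminator cassettes) are all assumptions made up for this example. They are not GenoCAD's actual grammar or the Registry's standards.

```python
import re

# Hypothetical part categories and one-letter codes, for illustration only.
CATEGORY_CODES = {"promoter": "P", "rbs": "R", "cds": "C", "terminator": "T"}

# Hypothetical design rule: one or more cassettes, each being
# promoter -> rbs -> cds -> terminator, in that order.
CONSTRUCT_PATTERN = re.compile(r"(PRCT)+")

def is_valid_construct(categories):
    """Return True if the ordered list of part categories fits the rule above."""
    try:
        encoded = "".join(CATEGORY_CODES[c] for c in categories)
    except KeyError:
        return False  # unknown part category
    return CONSTRUCT_PATTERN.fullmatch(encoded) is not None

print(is_valid_construct(["promoter", "rbs", "cds", "terminator"]))  # fits the rule
print(is_valid_construct(["promoter", "cds", "terminator"]))         # missing rbs
```

A check like this only says that a design is well-formed. It says nothing about what the assembled construct will do in a cell, which is the semantic question.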

Regarding the semantic aspect, it is true that we don't have it yet but several groups are working on it just like we are. The difference between analog and digital circuits should be taken into consideration in the way we design these semantic models. It may be that synbio will have to find a way to go digital. There would be clear benefits to this path but that would require methods to design molecules with user-defined activities that we just don't have right now.
