November 18, 2005

Why Math in Synthetic Bio?

In the past 3 years, I've developed some very advanced methods for computing the stochastic dynamical behavior of "small" physical or chemical systems, which notably include biological systems. The methods improve on existing ones, cutting computational time by 1000x or more for some systems while still retaining accuracy. Now I'm working on more methods to better predict the long-time behavior of biological systems.

But why is using math important in synthetic biology?

The idea is to develop an accurate model of a biological system (usually focusing on a small subsystem of a single cell) and then predict what the dynamics will be over time. If you can predict what the system will do before you build it, you save yourself both time and money. The model should be a "first principles" one based on the molecular interactions of each DNA site, protein, RNA, and every other molecule in the system. That way, if you know the interactions of a DNA site in one model, then you should be able to put the same DNA site in a different model and still predict what will happen. (No lumped interactions!) Of course, we're still constrained by the limited amount of information we have on molecular "parts". That's OK for now, because (one day) we should have that information. Until then, we will need to be good engineers and make guesses (yes, guesses) on what those interactions might be and how they affect the system dynamics.

You would be surprised at how much guessing goes into making 70-story buildings, cars that move at 120 mph, and lots of other contraptions that will easily kill you if built incorrectly. The "engineer guess" is about making sure that nothing bad happens even if you're 500% wrong. The technical term is robustness. In practice, you assume the unknown quantity can take values within a very large range and then you make sure that nothing breaks for any value in that range. Of course, you have to pick which quantities to make your design robust to, because a design can't be robust to everything. That's where you get this tradeoff between "robustness" and "fragility".
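To make the "engineer guess" concrete, here is a minimal sketch of a one-parameter robustness check. Everything in it is made up for illustration: the toy model (a gene repressed through a single binding site), the numbers, and the function names. It assumes the binding constant could be off by 5-fold in either direction and asks whether the output stays above a required minimum across that whole range.

```python
def steady_state_output(k_d, repressor=50.0, max_output=1000.0):
    """Toy model (hypothetical): expression of a gene repressed at one site.

    k_d is the uncertain dissociation constant of the repressor for its site.
    """
    return max_output / (1.0 + repressor / k_d)


def worst_case_output(nominal_kd, fold_error, n_points=201):
    """Scan k_d on a log-spaced grid from nominal/fold to nominal*fold.

    Returns the smallest output seen on the grid and the k_d that gave it.
    """
    low = nominal_kd / fold_error
    high = nominal_kd * fold_error
    worst_value, worst_kd = None, None
    for i in range(n_points):
        k_d = low * (high / low) ** (i / (n_points - 1))
        value = steady_state_output(k_d)
        if worst_value is None or value < worst_value:
            worst_value, worst_kd = value, k_d
    return worst_value, worst_kd


def is_robust(nominal_kd, fold_error, required_minimum):
    """True if the output meets the requirement at every grid point."""
    worst_value, _ = worst_case_output(nominal_kd, fold_error)
    return worst_value >= required_minimum


if __name__ == "__main__":
    print(worst_case_output(10.0, 5.0))
    print(is_robust(10.0, 5.0, required_minimum=30.0))
    print(is_robust(10.0, 5.0, required_minimum=100.0))
```

A grid scan only tests the points it visits, so it is a sanity check rather than a proof. It is good enough here because the toy output changes monotonically with k_d.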

But we should be rigorous about our "guessing". We should be able to identify _all_ possible behaviors that exist when varying the value of a specific quantity. What might the quantity be, as related to synthetic biology? It could be the binding affinity of a protein to a DNA site. It could be the enzymatic Kcat of a phosphorylation reaction. It could be the influx of a regulatory protein from the extracellular space. It is any parameter in our model that is not entirely known.

The math behind computing _all_ possible behaviors of a system while varying one or more parameters is called bifurcation analysis. The subject has always interested me and it's actually extremely useful in real life. Computing that your reactor has a subcritical Hopf bifurcation at a critical parameter value tells you that if your parameter goes past this point, your reactor will suddenly blow up and kill lots of people. Whoever said math wasn't useful? In practice, they make sure the parameter never goes near that critical value, not even remotely near it. So reactors generally don't blow up. Whew, that's good to know.
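For a feel of what a bifurcation looks like, here is a small sketch using a textbook-style toy model, not one of my models and not the reactor example: a self-activating gene in dimensionless form, dx/dt = a*x^2/(1 + x^2) - x, where the parameter a (an assumed lumped production strength) is the one being varied. This is a saddle-node bifurcation rather than a Hopf bifurcation, but the idea is the same: the number of steady states changes abruptly at a critical value, here a = 2.

```python
import math


def steady_states(a):
    """Steady states of dx/dt = a*x^2/(1 + x^2) - x (toy model).

    x = 0 is always a steady state. The others solve x^2 - a*x + 1 = 0,
    which has two distinct real roots only when a > 2.
    """
    states = [0.0]
    discriminant = a * a - 4.0
    if discriminant > 0.0:
        root = math.sqrt(discriminant)
        states.append((a - root) / 2.0)
        states.append((a + root) / 2.0)
    return states


def is_stable(x, a):
    """A steady state is stable when d/dx of the right-hand side is negative."""
    slope = 2.0 * a * x / (1.0 + x * x) ** 2 - 1.0
    return slope < 0.0


if __name__ == "__main__":
    for a in (1.0, 1.5, 2.5, 3.0):
        summary = [(round(x, 3), "stable" if is_stable(x, a) else "unstable")
                   for x in steady_states(a)]
        print(a, summary)
```

Below a = 2 the gene can only sit in the "off" state. Above it, a second stable "on" state appears together with an unstable state that separates the two.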

How is bifurcation analysis related to synthetic biology? Say you wanted to create a gene therapy system that consisted of a biosensor + regulated production of a therapeutic protein. There might be 120 parameters in your system. You might have good information about 60 of those, so-so information about 30, and the rest...who knows. But you want the gene therapy to work no matter what. Even if you incorrectly measured the interaction between molecules A and B. Even if you get a mutation that changes an interaction between molecules C and D. It just has to work. If it doesn't, someone could die. So can you determine the behavior of the model over all unknown parameters and make sure that the system will never break? If the number of unknown parameters is 30, well ... that's a 30-dimensional space to worry about. Mathematically, you could do it...but it would take a while. Are there ways to speed up the process? Absolutely. (I won't go into details here.) But my main point is that bifurcation analysis is extremely important for designing synthetic biological systems.
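To see why "it would take a while", here is a rough sketch. The first function just counts model evaluations for a brute-force grid. The rest shows plain random sampling of the unknown parameters, which is one generic way to probe a high-dimensional space and is not the speed-up I alluded to. The toy model, parameter names, nominal values, and 3-fold uncertainty are all assumptions for illustration.

```python
import random


def grid_evaluations(points_per_parameter, n_parameters):
    """Number of model evaluations needed for a full grid scan."""
    return points_per_parameter ** n_parameters


def sample_log_uniform(rng, nominal, fold):
    """Draw a value between nominal/fold and nominal*fold, uniform in log space."""
    return nominal * fold ** rng.uniform(-1.0, 1.0)


def count_failures(model, nominal, fold, is_ok, n_samples, seed=0):
    """Randomly perturb every parameter and count how often the design fails."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        params = {name: sample_log_uniform(rng, value, fold)
                  for name, value in nominal.items()}
        if not is_ok(model(params)):
            failures += 1
    return failures


def toy_model(p):
    """Hypothetical steady-state output of a repressed gene."""
    return (p["production"] / p["degradation"]) / (1.0 + p["repressor"] / p["k_d"])


if __name__ == "__main__":
    print(grid_evaluations(10, 30))  # 10 points per axis, 30 unknown parameters
    nominal = {"production": 100.0, "degradation": 1.0,
               "repressor": 50.0, "k_d": 10.0}
    print(count_failures(toy_model, nominal, 3.0, lambda out: out > 1.0, 10000))
```

Random sampling can find failures but can never prove there are none, which is exactly why a more systematic analysis is worth the effort.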

But, wait, I did say "stochastic dynamics" and the bifurcation analysis of stochastic systems is ... not well developed yet. That's because when you're working with probability distributions, the idea of a "qualitative change" in the solution gets harder to define. And working with these types of systems is harder in general. So that is what I am currently doing. And it is going well. :)

The math can get very heady and I worry that people who lack the background will become turned off by it. But the final product is very useful: You can find all possible behaviors of a "small" physical or chemical system (such as a biological system) using a combination of existing and new simulation techniques. The words "all possible" are extremely important. As it turns out, if your system has two stable states, a "good" one and a "bad" one, then, because of the random nature of the interactions, the dynamics might first go to the "good" state, but later go to the "bad" one. Not good. But you can minimize the "escape" from the good to the bad state if you design the system well enough. This type of "escape" doesn't happen if you describe the system using deterministic dynamics, but it happens in real life. (One more reason why stochastic descriptions are important.)
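Here is a minimal sketch of that "escape", using the standard Gillespie stochastic simulation algorithm (the plain, slow version, not my faster methods) on a made-up one-species model. The protein is produced at rate b0 + b1*n^2/(K^2 + n^2) and degraded at rate g*n. With the assumed numbers below, the deterministic equation has stable states near n = 3 and n = 29, separated by an unstable state at n = 10. Calling the low state "good" is purely a labeling choice for this example.

```python
import random


def birth(n, b0=2.0, b1=40.0, K=20.0):
    """Production rate with positive feedback (hypothetical parameters)."""
    return b0 + b1 * n * n / (K * K + n * n)


def death(n, g=1.0):
    """First-order degradation rate."""
    return g * n


def gillespie(n0, t_end, seed=0):
    """Simulate the birth-death process exactly, one reaction event at a time."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        up, down = birth(n), death(n)
        total = up + down
        t += rng.expovariate(total)      # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total < up:    # choose which reaction fires
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def deterministic(x0, t_end, dt=0.01):
    """Forward-Euler solution of dx/dt = birth(x) - death(x)."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (birth(x) - death(x))
    return x


if __name__ == "__main__":
    times, counts = gillespie(3, 2000.0, seed=1)
    print("deterministic end point:", deterministic(3.0, 50.0))
    print("largest copy number in stochastic run:", max(counts))
```

Started at n = 3, the deterministic solution stays in the low state forever. The stochastic trajectory gets kicked over the barrier sooner or later. The wells in this toy model are deliberately shallow so the escape shows up quickly; a well-designed system would make the barrier much higher.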

If you've read this far, then my guess is that you're somewhat interested in mathematics. Study it! (Especially non-linear dynamics, bifurcation analysis, and stochastic processes.) They are useful for real life applications, especially in new fields such as synthetic biology. If you're looking for a book on non-linear dynamics I would suggest the one by Steven Strogatz. The material is at the intermediate-advanced undergraduate level with lots of pictures.

So to answer my initial question, "Why Math in Synthetic Biology?": Math is needed because there's no way to build large, complex dynamical systems without first understanding (and then predicting) what those dynamics will be. Math then gives us the tools to guarantee certain types of behaviors even if we don't exactly know all of the parameters of a model of the system. Also, using math generates testable and precisely quantitative predictions about the biological system of interest.

(This post is very hodgepodge and not very technical. For technical details on the bifurcation analysis of stochastic systems, you'll just have to wait for the paper. In the meantime, there are two papers on the stochastic numerical methods that are 1000x+ faster than the original one. My group also published a paper on the design principles behind an oscillating gene network. If you search PubMed for 'Salis H', they should come up.)

Posted by sali0090 at November 18, 2005 11:59 AM