## Sunday, February 28, 2010

### Jukes-Cantor (5)

I am trying to see how the equations in the Jukes-Cantor model of sequence evolution work, and eventually to extend this to other models. To test my understanding, I'll want to work out some practical examples. But I'm not there yet.

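What I want to do here is to wrap up something from the first post. There we had two differential equations for the rate of change of the probabilities at a particular nucleotide position, where `PXX(t)` is the probability that the position still shows the same nucleotide X after time t and `PXY(t)` is the probability that it shows one particular different nucleotide Y: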

 `d/dt(PXX(t)) = -3*α*e^(-4αt)`

 `d/dt(PXY(t)) = α*e^(-4αt)`

And we'd like to express these results in terms of `PXX(t)` and `PXY(t)` themselves, rather than in terms of time. The solutions were:

 `PXX(t) = 1/4 + 3/4*e^(-4αt)`

 `PXY(t) = 1/4 - 1/4*e^(-4αt)`

Taking the first one, we have

 `PXX(t) = 1/4 + 3/4*e^(-4αt)`

 `3*e^(-4αt) = 4*(PXX(t) - 1/4)`

 `-3*α*e^(-4αt) = -4*α*(PXX(t) - 1/4)`

 `d/dt(PXX(t)) = α - 4*α*PXX(t)`

And for the second

 `PXY(t) = 1/4 - 1/4*e^(-4αt)`

 `e^(-4αt) = 1 - 4*PXY(t)`

 `α*e^(-4αt) = α - 4*α*PXY(t)`

 `d/dt(PXY(t)) = α - 4*α*PXY(t)`
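A quick numerical sanity check of the two derivations might look like the following. It is only a sketch: the value of `ALPHA` and the sample times are arbitrary choices for illustration, and the function names are my own. It compares the slope computed from the time-dependent form with the slope computed from the probability itself.

```python
import math

ALPHA = 0.1  # assumed substitution rate, picked only for illustration


def p_xx(t, alpha=ALPHA):
    """Probability of seeing the same nucleotide after time t."""
    return 0.25 + 0.75 * math.exp(-4 * alpha * t)


def p_xy(t, alpha=ALPHA):
    """Probability of seeing one particular different nucleotide after time t."""
    return 0.25 - 0.25 * math.exp(-4 * alpha * t)


def slopes_from_time(t, alpha=ALPHA):
    """Return (dPXX/dt, dPXY/dt) using the time-dependent expressions."""
    e = math.exp(-4 * alpha * t)
    return -3 * alpha * e, alpha * e


def slope_from_prob(p, alpha=ALPHA):
    """Slope written in terms of the probability: alpha - 4*alpha*P."""
    return alpha - 4 * alpha * p


def max_discrepancy(times, alpha=ALPHA):
    """Largest absolute difference between the two ways of computing the slopes."""
    worst = 0.0
    for t in times:
        dxx, dxy = slopes_from_time(t, alpha)
        worst = max(
            worst,
            abs(dxx - slope_from_prob(p_xx(t, alpha), alpha)),
            abs(dxy - slope_from_prob(p_xy(t, alpha), alpha)),
        )
    return worst


if __name__ == "__main__":
    print(max_discrepancy([0.0, 0.5, 1.0, 5.0, 20.0]))
```

If the algebra is right, the printed discrepancy should be at the level of floating-point round-off.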

So each slope is a linear function of the corresponding probability: `-4*α` times the probability, plus a constant term `α`. But the most interesting thing is that the form is the same for both `PXX` and `PXY`!
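One way to convince yourself that a single equation really covers both cases is to integrate `dP/dt = α - 4*α*P` numerically from the two starting values and compare with the closed-form solutions. The sketch below uses plain forward Euler, which is my choice for simplicity rather than anything from the original derivation; the rate `0.1`, the end time `3.0`, and the step count are arbitrary.

```python
import math


def integrate(p0, alpha, t_end, steps=100_000):
    """Forward-Euler integration of dP/dt = alpha - 4*alpha*P from P(0) = p0."""
    dt = t_end / steps
    p = p0
    for _ in range(steps):
        p += dt * (alpha - 4 * alpha * p)
    return p


def closed_form(alpha, t):
    """Return (PXX(t), PXY(t)) from the Jukes-Cantor solutions."""
    e = math.exp(-4 * alpha * t)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


if __name__ == "__main__":
    alpha, t_end = 0.1, 3.0  # illustrative values only
    pxx_exact, pxy_exact = closed_form(alpha, t_end)
    # Same differential equation, different starting points
    print(integrate(1.0, alpha, t_end), pxx_exact)
    print(integrate(0.0, alpha, t_end), pxy_exact)
```

The only difference between the two runs is the initial condition: `P(0) = 1` gives the `PXX` curve and `P(0) = 0` gives the `PXY` curve.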

I wasn't expecting this but it makes sense, because at long times we come to equilibrium (the stationary distribution of the Markov chain), where both probabilities equal `1/4` and both rates are `α - 4*α*(1/4) = 0`. At time zero we have `PXX = 1` and the rate is `α - 4*α = -3*α`, while `PXY = 0` and the rate is `α`. I think it's OK.
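The two endpoints can also be checked by plugging numbers into the shared rate expression. This is a minimal sketch with an arbitrary `alpha`, and with `t = 100` standing in for "long times"; both are assumptions made just for the check.

```python
import math


def rate(p, alpha):
    """Shared slope expression: alpha - 4*alpha*P."""
    return alpha - 4 * alpha * p


def probabilities(t, alpha):
    """Return (PXX(t), PXY(t)) from the closed-form solutions."""
    e = math.exp(-4 * alpha * t)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def rates_at(t, alpha):
    """Return (dPXX/dt, dPXY/dt) at time t via the shared expression."""
    pxx, pxy = probabilities(t, alpha)
    return rate(pxx, alpha), rate(pxy, alpha)


if __name__ == "__main__":
    alpha = 0.1  # illustrative value only
    print("t = 0:  ", rates_at(0.0, alpha))    # expect about (-3*alpha, alpha)
    print("t = 100:", rates_at(100.0, alpha))  # expect both close to zero
```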