Tuesday, December 7, 2010

Models of sequence evolution 2

We have an equation to solve:

P = e^(Qt)

It looks easy enough, except that here Q is a 4 x 4 matrix. But linear algebra teaches us that if we can find the eigenvalues and eigenvectors of Q, we can write it as a product of three matrices:

Q = U Λ U^-1

where Λ is a diagonal matrix. The upshot of that is that any function we want to apply to Q can be computed as:

f(Q) = U f(Λ) U^-1

The columns of U are the right eigenvectors of Q and the rows of U^-1 are its left eigenvectors; moreover, as the notation implies:

U U^-1 = I
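
To make this concrete, here is a small numpy sketch (using the 2x2 matrix A from the example just below) checking the f(Q) identity for f(x) = x²: applying f to the eigenvalues on the diagonal and sandwiching with U and U^-1 reproduces A·A.

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [9.0, 2.0]])

# decompose: columns of U are the right eigenvectors
evals, U = np.linalg.eig(A)
iU = np.linalg.inv(U)

# f(A) = U f(Lambda) U^-1, with f(x) = x**2 as a concrete test
fA = U.dot(np.diag(evals**2)).dot(iU)
print(np.allclose(fA, A.dot(A)))  # True
```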

So, let's see where that takes us. Long ago, we had this matrix:

A = [ 5  2 ]
    [ 9  2 ]

Using the characteristic equation, det(A - λI) = (5 - λ)(2 - λ) - 18 = λ² - 7λ - 8 = (λ - 8)(λ + 1) = 0, we found eigenvalues

λ = 8, λ = -1
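
To double-check: the characteristic polynomial of a 2x2 matrix is x² - trace(A)·x + det(A), which here is x² - 7x - 8, and numpy finds the same roots:

```python
import numpy as np

# characteristic polynomial of [[5, 2], [9, 2]]:
# x^2 - trace(A) x + det(A) = x^2 - 7x - 8
roots = np.roots([1.0, -7.0, -8.0])
print(np.allclose(sorted(roots), [-1.0, 8.0]))  # True
```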

We also found eigenvectors v, one for each eigenvalue, such that:

Av = λv
[ 5  2 ] [ x ] = λ [ x ]
[ 9  2 ] [ y ]     [ y ]

For example:

[ 5  2 ] [ 2 ] = 8 [ 2 ]
[ 9  2 ] [ 3 ]     [ 3 ]

Checking our multiplication, rows x columns:

[ 16 ] = [ 16 ]
[ 24 ]   [ 24 ]

We scale v to be a unit vector:

[ 2/√13 ] = [ 0.5547 ]
[ 3/√13 ]   [ 0.8321 ]
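
In code, scaling to a unit vector just means dividing each component by the vector's length (here √13); a pure-Python check:

```python
import math

v = (2.0, 3.0)                             # eigenvector for lambda = 8
length = math.sqrt(sum(x * x for x in v))  # sqrt(13), about 3.6056
unit = [x / length for x in v]             # [0.5547..., 0.8320...]

# the scaled vector has length 1
print(round(sum(x * x for x in unit), 12))  # 1.0
```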

And the second eigenvector is:

[  1/√10 ] = [  0.3162 ]
[ -3/√10 ]   [ -0.9487 ]


In R:


M = matrix(c(5,2,9,2),nrow=2,byrow=T)
eigen(M)
$values
[1]  8 -1

$vectors
          [,1]       [,2]
[1,] 0.5547002 -0.3162278
[2,] 0.8320503  0.9486833


Any scalar multiple of v is still an eigenvector, so it's OK that R returned our v2 multiplied by -1.

So far, so good. Now, Yang says that what we need to do is "decompose" our matrix into right and left eigenvectors. We have right eigenvectors, but now we need a left eigenvector for each eigenvalue: a row vector w such that:

w A = λ w

Luckily, we can use inverses. He says that we want:

Q = U Λ U^-1

where the columns of U are the right eigenvectors of Q and the rows of U^-1 are the left eigenvectors. Λ is a diagonal matrix with the eigenvalues λ on the diagonal. Here is the code separately (makes it easier for you to run it):

result = eigen(M)
U = as.matrix(result$vectors,nrow=2)
iU = solve(U)
U %*% iU
L = diag(result$values)
L
U %*% L %*% iU

and what it prints:

> result = eigen(M)
> U = as.matrix(result$vectors,nrow=2)
> iU = solve(U)
> U %*% iU
     [,1] [,2]
[1,]    1    0
[2,]    0    1
> L = diag(result$values)
> L
     [,1] [,2]
[1,]    8    0
[2,]    0   -1
> U %*% L %*% iU
     [,1] [,2]
[1,]    5    2
[2,]    9    2
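
To confirm the claim about left eigenvectors, here is the same decomposition in numpy (the library used later in this post), checking that each row w of U^-1 satisfies w A = λ w:

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [9.0, 2.0]])
evals, U = np.linalg.eig(A)   # columns of U are right eigenvectors
iU = np.linalg.inv(U)

# row i of iU is a left eigenvector for eigenvalue i
for lam, w in zip(evals, iU):
    print(np.allclose(w.dot(A), lam * w))  # True (twice)
```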

I would just note in passing that although svd sounds like what we want (singular value decomposition), it doesn't give the values we're looking for. I'm not sure why yet.
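
My guess at the reason: the singular values of A are the square roots of the eigenvalues of AᵀA (and are always non-negative), so there is no reason for them to match the eigenvalues of A itself. A quick numpy check:

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [9.0, 2.0]])
evals = np.linalg.eigvals(A)                # 8 and -1
svals = np.linalg.svd(A, compute_uv=False)  # neither 8 nor 1

# the singular values are sqrt(eigenvalues of A^T A)
check = np.sqrt(np.linalg.eigvals(A.T.dot(A)))
print(np.allclose(sorted(svals), sorted(check)))  # True
```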

Let's try a phylogenetics example using a simple matrix like Jukes-Cantor's Q:

a = 0.01/3
m = c(-3*a,    a,    a,    a,
         a, -3*a,    a,    a,
         a,    a, -3*a,    a,
         a,    a,    a, -3*a)
Q = matrix(m, nrow=4)



result = eigen(Q)
U = as.matrix(result$vectors,nrow=4)
U
iU = solve(U)
iU
U %*% iU
L = diag(result$values)
L
U %*% L %*% iU



> result = eigen(Q)
> U = as.matrix(result$vectors,nrow=4)
> U
[,1] [,2] [,3] [,4]
[1,] -0.5 0.8660254 0.0000000 0.0000000
[2,] -0.5 -0.2886751 -0.1601282 0.8006408
[3,] -0.5 -0.2886751 -0.6133112 -0.5389954
[4,] -0.5 -0.2886751 0.7734393 -0.2616453
> iU = solve(U)
> iU
[,1] [,2] [,3] [,4]
[1,] -5.000000e-01 -0.5000000 -0.5000000 -0.5000000
[2,] 8.660254e-01 -0.2886751 -0.2886751 -0.2886751
[3,] -7.698012e-18 -0.1601282 -0.6133112 0.7734393
[4,] 3.849006e-17 0.8006408 -0.5389954 -0.2616453
> U %*% iU
[,1] [,2] [,3] [,4]
[1,] 1.000000e+00 1.110223e-16 -5.551115e-17 1.110223e-16
[2,] 4.293802e-18 1.000000e+00 0.000000e+00 0.000000e+00
[3,] 1.173089e-17 0.000000e+00 1.000000e+00 0.000000e+00
[4,] 1.173089e-17 -2.775558e-17 2.775558e-17 1.000000e+00
> L = diag(result$values)
> L
[,1] [,2] [,3] [,4]
[1,] -1.734723e-18 0.00000000 0.00000000 0.00000000
[2,] 0.000000e+00 -0.01333333 0.00000000 0.00000000
[3,] 0.000000e+00 0.00000000 -0.01333333 0.00000000
[4,] 0.000000e+00 0.00000000 0.00000000 -0.01333333
> U %*% L %*% iU
[,1] [,2] [,3] [,4]
[1,] -0.010000000 0.003333333 0.003333333 0.003333333
[2,] 0.003333333 -0.010000000 0.003333333 0.003333333
[3,] 0.003333333 0.003333333 -0.010000000 0.003333333
[4,] 0.003333333 0.003333333 0.003333333 -0.010000000

Check.

Now, what we need to do is to exponentiate Λ. That is, we compute exp(λt) for each eigenvalue, and place the results in a diagonal matrix. The code:

t = 0.1
result$values
k = exp(t*result$values)
D = diag(k)
P = U %*% D %*% iU
P

M = P
for (i in 1:5000) { M = P %*% M }
M


It prints:

> t = 0.1
> result$values
[1] -1.734723e-18 -1.333333e-02 -1.333333e-02 -1.333333e-02
> k = exp(t*result$values)
> D = diag(k)
> P = U %*% D %*% iU
> P
[,1] [,2] [,3] [,4]
[1,] 0.9990006664 0.0003331112 0.0003331112 0.0003331112
[2,] 0.0003331112 0.9990006664 0.0003331112 0.0003331112
[3,] 0.0003331112 0.0003331112 0.9990006664 0.0003331112
[4,] 0.0003331112 0.0003331112 0.0003331112 0.9990006664
>
> M = P
> for (i in 1:5000) { M = P %*% M }
> M
[,1] [,2] [,3] [,4]
[1,] 0.2509532 0.2496823 0.2496823 0.2496823
[2,] 0.2496823 0.2509532 0.2496823 0.2496823
[3,] 0.2496823 0.2496823 0.2509532 0.2496823
[4,] 0.2496823 0.2496823 0.2496823 0.2509532


Looks good. The long loop also makes sense: since P(t) = e^(Qt), P(t)^n = P(nt), so multiplying 5001 copies of P(0.1) gives the transition matrix at t ≈ 500, by which time each row has nearly reached the stationary frequency of 1/4 per base. How about doing it in Python?

Output:

Q 
[[-0.01 0.00333333 0.00333333 0.00333333]
[ 0.00333333 -0.01 0.00333333 0.00333333]
[ 0.00333333 0.00333333 -0.01 0.00333333]
[ 0.00333333 0.00333333 0.00333333 -0.01 ]]
evals
[ -1.33333333e-02 4.33680869e-19 -1.33333333e-02 -1.33333333e-02]
L
[[ 0.99866756 0. 0. 0. ]
[ 0. 1. 0. 0. ]
[ 0. 0. 0.99866756 0. ]
[ 0. 0. 0. 0.99866756]]
U
[[-0.8660254 0.5 -0.13387887 -0.2397665 ]
[ 0.28867513 0.5 0.80991131 0.14009673]
[ 0.28867513 0.5 -0.11709746 -0.62763021]
[ 0.28867513 0.5 -0.55893497 0.72729998]]
iU
[[-0.8660254 0.08069103 0.58314599 0.20218839]
[ 0.5 0.5 0.5 0.5 ]
[-0. 0.84935676 -0.36809649 -0.48126026]
[-0. 0.27697196 -0.85808026 0.58110829]]
P
[[ 9.99000666e-01 3.33111210e-04 3.33111210e-04 3.33111210e-04]
[ 3.33111210e-04 9.99000666e-01 3.33111210e-04 3.33111210e-04]
[ 3.33111210e-04 3.33111210e-04 9.99000666e-01 3.33111210e-04]
[ 3.33111210e-04 3.33111210e-04 3.33111210e-04 9.99000666e-01]]
M
[[ 0.2509532 0.24968227 0.24968227 0.24968227]
[ 0.24968227 0.2509532 0.24968227 0.24968227]
[ 0.24968227 0.24968227 0.2509532 0.24968227]
[ 0.24968227 0.24968227 0.24968227 0.2509532 ]]


Code listing:


from __future__ import division
import math
import numpy as np

a = 0.01/3
Q = np.asarray(
    [ -3*a,    a,    a,    a,
         a, -3*a,    a,    a,
         a,    a, -3*a,    a,
         a,    a,    a, -3*a ])
Q.shape = (4,4)
print 'Q', '\n', Q

evals,evecs = np.linalg.eig(Q)
print 'evals', '\n', evals
t = 0.1
L = np.diag([math.e**(k*t) for k in evals])
print 'L', '\n', L
U = evecs
print 'U', '\n', U
iU = np.linalg.inv(U)
print 'iU', '\n', iU
P = np.dot(U, np.dot(L, iU))
print 'P', '\n', P
M = P
for i in range(5000):
    M = np.dot(P, M)
print 'M', '\n', M
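
One more sanity check (numpy, not from the listing above): for Jukes-Cantor the answer is also available in closed form. With off-diagonal rate a, P(t) has diagonal entries 1/4 + (3/4)e^(-4at) and off-diagonal entries 1/4 - (1/4)e^(-4at), and raising P(0.1) to the 5001st power should land near the stationary frequency of 1/4:

```python
import math
import numpy as np

a = 0.01 / 3
t = 0.1

# closed-form Jukes-Cantor transition probabilities
same = 0.25 + 0.75 * math.exp(-4 * a * t)   # ~0.99900067, as printed above
diff = 0.25 - 0.25 * math.exp(-4 * a * t)   # ~0.00033311
expected = np.full((4, 4), diff)
np.fill_diagonal(expected, same)

# P by eigendecomposition, as in the listing
Q = np.full((4, 4), a)
np.fill_diagonal(Q, -3 * a)
evals, U = np.linalg.eig(Q)
P = U.dot(np.diag(np.exp(evals * t))).dot(np.linalg.inv(U))
print(np.allclose(P, expected))  # True

# the 5000-fold multiplication is P(0.1)^5001 = P(500.1), close to 1/4 everywhere
M = np.linalg.matrix_power(P, 5001)
print(abs(M - 0.25).max() < 0.002)  # True
```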

3 comments:

Edson Ishengoma said...

What does "scale v to be a unit vector" mean?

telliott99 said...

Any vector (except the 0 vector) can be re-scaled so that it has length 1. See: http://en.wikipedia.org/wiki/Unit_vector. You divide each component of the vector by the length of the vector.
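
In code, the recipe from this reply looks like (a pure-Python sketch):

```python
import math

def normalize(v):
    # divide each component by the vector's length
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

u = normalize([2.0, 3.0])
print(round(sum(x * x for x in u), 12))  # 1.0 -- unit length
```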