## Thursday, July 9, 2009

### The eigenvalue problem

In learning Bioinformatics, I've come across matrices and matrix operations repeatedly. For example, eigenvectors and eigenvalues are at the heart of principal component analysis. Since I quit math before I got that far, I've been trying to catch up. I'm using a simple linear algebra textbook (McMahon).

In keeping with the spirit I've described, I'll post my homework here.

Suppose we have a 2 x 2 matrix A (i.e. of order 2):

```
A = [ 5 2 ]
    [ 9 2 ]
```

The characteristic polynomial of A is given by:

 `det(A - kI)`

I is the identity matrix of order 2:

```
I = [ 1 0 ]
    [ 0 1 ]
```

And k is a variable (I'm using k instead of λ, the traditional symbol).

```
kI = [ k 0 ]
     [ 0 k ]

A - kI = [ 5-k   2  ]
         [  9   2-k ]
```

Remembering how to evaluate the determinant of a 2 x 2 matrix:

```
M = [ a b ]
    [ c d ]

det(M) = (a * d) - (b * c)
```

so:

 `det(A) = 5 * 2 - 9 * 2 = -8`
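The 2 x 2 determinant formula is easy to sanity-check in a few lines of plain Python (a minimal sketch, no libraries):

```python
# The 2x2 determinant formula: det(M) = a*d - b*c,
# applied here to the example matrix A = [[5, 2], [9, 2]].
def det2(a, b, c, d):
    return a * d - b * c

print(det2(5, 2, 9, 2))  # -8
```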

to get the characteristic equation, set:

```
det(A - kI) = 0

A - kI = [ 5-k   2  ]
         [  9   2-k ]

det(A - kI) = (5 - k)(2 - k) - 18
            = 10 - 7k + k^2 - 18
            = k^2 - 7k - 8
            = (k - 8)(k + 1)
```
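The quadratic above can also be solved numerically as a quick check (plain Python, using the quadratic formula):

```python
# Solve k^2 - 7k - 8 = 0 with the quadratic formula (a = 1, b = -7, c = -8).
from math import sqrt

a, b, c = 1, -7, -8
disc = sqrt(b*b - 4*a*c)  # sqrt(49 + 32) = sqrt(81) = 9
roots = sorted([(-b - disc) / (2*a), (-b + disc) / (2*a)])
print(roots)  # [-1.0, 8.0]
```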

The solutions for k are then 8 and -1. These are the eigenvalues of A. It is also true (by the Cayley-Hamilton theorem, which says that every matrix satisfies its own characteristic equation) that:

 `A^2 - 7A - 8I = 0`

(I'm using ^ here in the code to indicate a superscript). So we can check our math:

```
A = [ 5 2 ]
    [ 9 2 ]

A^2 = [ 5 2 ] [ 5 2 ] = [ 43 14 ]
      [ 9 2 ] [ 9 2 ]   [ 63 22 ]

A^2 - 7A - 8I = [ 43 14 ] - [ 35 14 ] - [ 8 0 ] = [ 0 0 ]
                [ 63 22 ]   [ 63 14 ]   [ 0 8 ]   [ 0 0 ]
```
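Since the matrices are only 2 x 2, this check takes just a few lines of plain Python (a sketch with a hand-rolled 2 x 2 multiply rather than a library):

```python
# Check A^2 - 7A - 8I = 0 numerically.
def matmul2(X, Y):
    # hand-rolled 2x2 matrix product
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

A = [[5, 2], [9, 2]]
I = [[1, 0], [0, 1]]
A2 = matmul2(A, A)
result = [[A2[i][j] - 7*A[i][j] - 8*I[i][j] for j in range(2)] for i in range(2)]
print(A2)      # [[43, 14], [63, 22]]
print(result)  # [[0, 0], [0, 0]]
```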

The eigenvectors of A satisfy this equation:

 `Av = kv `

where v is a vector with values x and y:

```
v = [ x ]
    [ y ]
```

So for the first eigenvalue we have:

```
A v1 = k1 v1

[ 5 2 ] [ x ] = 8 [ x ]
[ 9 2 ] [ y ]     [ y ]
```

That is:

 `5x + 2y = 8x => y = 3/2 x`

and

 `9x + 2y = 8y => y = 3/2 x`

We can choose x = 2, then y = 3 and

```
v1 = [ 2 ]
     [ 3 ]
```

Check our math:

```
A v1 = k1 v1

[ 5 2 ] [ 2 ] = 8 [ 2 ]
[ 9 2 ] [ 3 ]     [ 3 ]

[ 16 ] = [ 16 ]
[ 24 ]   [ 24 ]
```

Similarly, for the second eigenvalue we have:

```
A v2 = k2 v2

[ 5 2 ] [ x ] = -1 [ x ]
[ 9 2 ] [ y ]      [ y ]
```

That is:
 ` 5x + 2y = -x => y = -3x`

and
 ` 9x + 2y = -y => y = -3x`

Choose x = 1, then y = -3

```
v2 = [  1 ]
     [ -3 ]
```

Check the math:

```
A v2 = k2 v2

[ 5 2 ] [  1 ] = -1 [  1 ]
[ 9 2 ] [ -3 ]      [ -3 ]

[ -1 ] = [ -1 ]
[  3 ]   [  3 ]
```
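Both eigenpair checks can be bundled into a short plain-Python sketch (the matvec2 helper is just hand-rolled 2 x 2 arithmetic):

```python
# Confirm A v = k v for both eigenpairs: (8, [2, 3]) and (-1, [1, -3]).
def matvec2(M, v):
    # hand-rolled 2x2 matrix-vector product
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

A = [[5, 2], [9, 2]]
print(matvec2(A, [2, 3]))   # [16, 24], which is 8 * [2, 3]
print(matvec2(A, [1, -3]))  # [-1, 3], which is -1 * [1, -3]
```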

The eigenvectors are usually ordered according to the magnitude of the corresponding eigenvalue. That's why 8 is the first eigenvalue. The eigenvectors can be normalized to have length = 1. Here, the length of v1:

```
v1 = [ 2 ]
     [ 3 ]
```

is

 `sqrt (2^2 + 3^2) = sqrt(13)`

so, the normalized v1 is:

```
v1 = [ 2/sqrt(13) ] = [ 0.555 ]
     [ 3/sqrt(13) ]   [ 0.832 ]
```
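The normalization can be checked in a couple of lines of plain Python:

```python
# Normalize v1 = [2, 3] to unit length by dividing by sqrt(13).
from math import sqrt

v1 = [2, 3]
length = sqrt(v1[0]**2 + v1[1]**2)        # sqrt(13)
unit = [x / length for x in v1]
print([round(x, 3) for x in unit])        # [0.555, 0.832]
print(round(unit[0]**2 + unit[1]**2, 6))  # 1.0 (length is now 1)
```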

The trace of a matrix A is the sum along the diagonal:

 `tr(A) = 5 + 2 = 7`

The determinant of A, computed above, is -8.

The trace of a matrix is also equal to the sum of
its eigenvalues (the sum of the k's), and its determinant is equal to their product. Since k1 = 8 and k2 = -1, we can verify both relationships for matrix A: 8 + (-1) = 7 = tr(A), and 8 * (-1) = -8 = det(A).
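These two identities make a nice one-liner check (plain Python):

```python
# tr(A) = k1 + k2 and det(A) = k1 * k2, for A = [[5, 2], [9, 2]].
k1, k2 = 8, -1
trace_A = 5 + 2          # sum of the diagonal
det_A = 5*2 - 9*2        # 2x2 determinant, computed earlier
print(trace_A == k1 + k2, det_A == k1 * k2)  # True True
```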

In R:

```
> v = c(5,2,9,2)
> A = matrix(v,nrow=2,byrow=T)
> A
     [,1] [,2]
[1,]    5    2
[2,]    9    2
> eigen(A)
$values
[1]  8 -1

$vectors
          [,1]       [,2]
[1,] 0.5547002 -0.3162278
[2,] 0.8320503  0.9486833
```
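For comparison, the same computation in Python using NumPy's `np.linalg.eig` (note that eig makes no promise about eigenvalue ordering, and it returns unit-length eigenvectors as the columns of the second array):

```python
# Eigenvalues and eigenvectors of A with NumPy, mirroring the R session.
import numpy as np

A = np.array([[5, 2], [9, 2]])
vals, vecs = np.linalg.eig(A)
print(vals)  # the eigenvalues 8 and -1 (order may differ)
print(vecs)  # unit-length eigenvectors, one per column
```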

Why do eigenvalues and eigenvectors matter to us? It turns out that if we have an n x p matrix like

```
B = [ x1 y1 ]
    [ x2 y2 ]
    [ x3 y3 ]
    [ x4 y4 ]
```

where the x and y values are separately normalized (they are z-scores, obtained by subtracting the mean of each column and then dividing by its standard deviation).

We construct a covariance matrix:

```
          x      y
C =  x [ 1.00    z   ]
     y [  z     1.00 ]
```

where

 `z = (1/N) * [ the sum over i of (zx_i * zy_i) ]`

Then the eigenvalues and eigenvectors of C identify the principal components of B.

[UPDATE: This last part is not correct. I confused the correlation and covariance matrix. See here for details.]