Python for Bioinformatics: Strang's Linear Algebra

Friday, March 5, 2010

Strang's Linear Algebra

I've been working on linear algebra. I bought Gilbert Strang's book, and I've also been watching the videos of his lectures on MIT's OpenCourseWare (OCW). As you'll know if you read my earlier posts about calculus, I think he is just terrific, one of the best teachers I've ever seen. My goal is to understand the subject well enough to really comprehend where eigenvalues and eigenvectors come from, and why they do what they do. Plus a few more things, like singular value decomposition.

I don't plan to post about all of that here. Just go watch the video!

But he explains matrix "invertibility" and its connection to systems of equations in a way that is just beautiful. Here is his example. Suppose we have:

 x - 2y =  1
3x + 2y = 11

There are many ways to solve this. We could just add the two equations, cancel the y term, solve for x ( = 3), and then get y by back-substitution ( = 1). Or, we might use the elimination method, and subtract 3 times the first equation from the second one, giving:

 x - 2y = 1
     8y = 8

      y = 1

Now we have y from the second equation and can back-substitute to get x.
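Here is a minimal sketch of that elimination step in Python. The function name and argument order are my own choices for illustration, and it assumes the first pivot is non-zero (no row swaps), which is true for this example. It uses Fraction so the arithmetic stays exact.

```python
from fractions import Fraction

def solve_2x2_by_elimination(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by elimination."""
    a11, a12, b1 = Fraction(a11), Fraction(a12), Fraction(b1)
    a21, a22, b2 = Fraction(a21), Fraction(a22), Fraction(b2)
    if a11 == 0:
        raise ValueError("this sketch assumes a non-zero first pivot")
    # Subtract (a21 / a11) times the first equation from the second one.
    multiplier = a21 / a11
    a22_new = a22 - multiplier * a12
    b2_new = b2 - multiplier * b1
    if a22_new == 0:
        raise ValueError("second pivot is zero: no unique solution")
    # The second equation now involves only y; back-substitute to get x.
    y = b2_new / a22_new
    x = (b1 - a12 * y) / a11
    return x, y

x, y = solve_2x2_by_elimination(1, -2, 1, 3, 2, 11)
print(x, y)  # 3 1
```

For our system the multiplier is 3, the second equation becomes 8y = 8, and back-substitution gives x = 3.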

We could also graph the lines and find the point where they cross. That point solves both equations.

But the coolest way is this. Rewrite the equations in matrix / vector form:

A x = b

(The duplication of symbols is unfortunate: in A x = b, x is a vector holding both unknowns, while the x in the original equations is a scalar. If this bothers you, just substitute x₁ for x and x₂ for y.)

A x = b
[ 1  -2 ] [ x ]  = [  1 ]
[ 3   2 ] [ y ]    [ 11 ]

We can think of this as follows: x and y are scalars, constants to be chosen such that this linear combination of the columns of A gives b:

x [ 1 ] + y [ -2 ]  =  [  1 ]
  [ 3 ]     [  2 ]     [ 11 ]

3 [ 1 ] + 1 [ -2 ]  =  [  1 ]
  [ 3 ]     [  2 ]     [ 11 ]

And, since we already solved the system (above), we can see that 3 times column one of A plus 1 times column two of A equals b.

These columns can be seen as vectors on the x,y plane! We go out 3 in the +x direction and then 9 in the +y direction; in phase two we come back 2 in the x direction but up 2 more in the +y direction, and we arrive at b.
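The column picture is easy to check with a few lines of plain Python. This is just a sketch: the helper name combine and the variable names are mine, not anything from Strang.

```python
def combine(x, col1, y, col2):
    """Return the linear combination x*col1 + y*col2, component by component."""
    return [x * u + y * v for u, v in zip(col1, col2)]

col1 = [1, 3]    # first column of A
col2 = [-2, 2]   # second column of A
b = [1, 11]

step_one = [3 * u for u in col1]   # phase one: 3 of column one
step_two = [1 * v for v in col2]   # phase two: 1 of column two
result = combine(3, col1, 1, col2)

print(step_one, step_two, result)  # [3, 9] [-2, 2] [1, 11]
print(result == b)                 # True
```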

The best part about this vector picture using the columns of A is that I can understand what is happening when a system of equations does not have a solution, why that's equivalent to the matrix A not having an inverse, and why both are equivalent to the existence of a non-zero vector x (that is, x and y not both zero) such that A x = 0.

Suppose that there did exist non-zero x such that:

[ 1  -2 ] [ x ]  = [ 0 ]
[ 3   2 ] [ y ]    [ 0 ]

In fact, let's change the matrix A so that there is such an x and y:

[ 1  -2 ] [ x ]  = [ 0 ]
[ 3  -6 ] [ y ]    [ 0 ]

What this says is that having moved out

x [ 1 ]
  [ 3 ]

then we can still get back to

[ 0 ]
[ 0 ]

by adding some amount of

[ -2 ]
[ -6 ]

The only way this can happen is if the two columns point in the same direction (or exactly opposite) in 2D space. They must be on the same line going through 0,0.
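A quick sketch to check this for the modified matrix. The helper names are my own, and the determinant test is something I'm bringing in (it isn't part of the argument above): for a 2 x 2 matrix it is zero exactly when the two columns lie on the same line through the origin.

```python
def det2(a):
    """Determinant of a 2 x 2 matrix given as a list of two rows."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in a]

A_singular = [[1, -2],
              [3, -6]]

# Column two is -2 times column one, so 2 of column one plus
# 1 of column two brings us back to the origin.
null_vector = [2, 1]

print(det2(A_singular))                 # 0
print(matvec(A_singular, null_vector))  # [0, 0]
print(det2([[1, -2], [3, 2]]))          # 8
```

The original matrix has determinant 8, so its columns are not on the same line and the only way back to the origin is x = y = 0.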

All the rest of 2D space is unreachable using linear combinations of these columns, and that's why most b's do not have a solution x given this A. You can't get there from here. Furthermore, it explains why A does not have an inverse. Suppose there is a non-zero x such that

A x = 0

and A^-1 exists, such that

A^-1 A = I

That would mean:

A^-1 A x = A^-1 0
A^-1 A x = 0
I x = 0
x = 0

which contradicts the assumption that x is non-zero. So no such A^-1 can exist.
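The same thing shows up if you try to compute the inverse. Below is a sketch using the standard 2 x 2 inverse formula, which divides by the determinant; that formula is my addition here, and the function names are just for illustration. For the original A it works and recovers the solution; for the modified A there is nothing to divide by.

```python
from fractions import Fraction

def inverse_2x2(a):
    """Invert a 2 x 2 matrix (list of rows) using the adjugate formula."""
    (p, q), (r, s) = a
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is singular: no inverse exists")
    return [[Fraction(s, det), Fraction(-q, det)],
            [Fraction(-r, det), Fraction(p, det)]]

def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in a]

A = [[1, -2],
     [3, 2]]
b = [1, 11]

A_inv = inverse_2x2(A)
solution = matvec(A_inv, b)
print(solution == [3, 1])  # True

try:
    inverse_2x2([[1, -2], [3, -6]])
except ValueError as err:
    print(err)  # matrix is singular: no inverse exists
```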

This extends naturally to 3D space.

If the three columns of a 3 x 3 matrix A, seen as vectors, all lie in the same plane (say, the xz-plane), then there is no way to reach a point that is not already in that plane, that is, any point with y ≠ 0. Usually two of the vectors are enough to reach every point in the plane (by some linear combination), which means the third is a linear combination of the first two, so the three are not independent.
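Here is a small sketch of the 3D case. The three column vectors are made up for illustration (all with y = 0, and the third chosen as the sum of the first two), and I'm again using the determinant as the test for dependence, which goes a little beyond what is said above.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix (list of rows) by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Three hypothetical column vectors (x, y, z), all lying in the xz-plane.
c1 = [1, 0, 2]
c2 = [3, 0, 1]
c3 = [u + v for u, v in zip(c1, c2)]  # [4, 0, 3], a combination of c1 and c2

# Build A with c1, c2, c3 as its columns (so each row collects one coordinate).
A_coplanar = [list(row) for row in zip(c1, c2, c3)]

print(A_coplanar)        # [[1, 3, 4], [0, 0, 0], [2, 1, 3]]
print(det3(A_coplanar))  # 0

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(det3(identity))    # 1
```

The middle row of zeros is the y coordinate: no combination of these columns can ever produce a non-zero y.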

I think this is just wonderful. Thanks, Professor Strang.