## Thursday, February 10, 2011

### Strang's Linear Algebra: homework (recap)

I've been working on linear algebra again. My goal is to understand how eigenvalues and eigenvectors work, but there is also a beauty to the subject that I find fascinating. It's at once geometrical, but scales easily to n-dimensions. The basic facts are (so far) almost trivial geometrically, but writing vector and matrix equations is still tricky.

Although I posted about projections and least squares error almost a year ago (here and here), I had some trouble with the formulas and example for projections onto subspaces and least squares. So this is a recap.

The first example is 2D. You can refer to the figure from the book, above. We have two vectors a and b, where b points away from a, and we seek the projection p of b onto a:

 ` p = xa `

(where x should really have a hat). It's a number which multiplies a to put it at exactly the correct position p so that

 ` p + e = b e = b - p e is perpendicular to a (and p) `

and the dot product (which Prof. Strang does as aT e) is zero:

 ` aT e = 0 = aT (b - p) = aT(b - xa). `

For the 2D problem both terms are numbers, so we can divide in rearranging to give:

 ` x = aTb / aTa p = xa = (aTb / aTa) a`

In the example we have b = (3,4,4) and a = (2,2,1) so aTb is just the dot product of a • b and

 ` x = (2,2,1) • (3,4,4) / (2,2,1) • (2,2,1) = 18/9 = 2 p = xa = 2 (2,2,1) = (4,4,2) e = b - p = (3,4,4) - (4,4,2) = (-1,0,2)`

and

 ` e • a = (-1,0,2) • (2,2,1) = 0 `

as required.

Now we add a second vector to the problem:

 ` a* = (1,0,0). `

Then we have the plane formed by

 ` a = (2,2,1) and a* = (1,0,0) `

and still, b (3,4,4) is not in the plane. We seek the projection of b onto this plane. Note that the error which is

 ` e = b - p`

as before, is perpendicular to every vector in the plane. (It is some z n where n is the normal vector).

So what Strang does is say that we have a matrix A:

 ` A = [ a a* ] = [ 2 1 ] [ 2 0 ] [ 1 0 ]`

Putting the x on the right side of A, we have:

 ` p = Ax e = b - p`

p is some combination of the independent columns of A, where x is a vector now.

The critical insight is that we can combine the individual dot products

 ` a • e = 0 a* • e = 0`

into a matrix multiplication:

 ` ATe = 0 = ATb - ATp = ATb - ATAx`

Continuing:

 ` ATb = ATAx`

We can't just divide like before, but instead we use the inverse:

 ` x = (ATA)-1 ATb`

Now, since we have x on the right of A, to get p we do:

 ` p = Ax = A (ATA)-1 AT b`

We can re-group the terms as `A (ATA)-1 AT` and call this a projection matrix P that multiplies b to give p:

 ` A (ATA)-1 AT `

There is a temptation to say:

 ` (ATA)-1 = A-1 (AT)-1`

and then everything in the previous equation cancels. But here's the thing!

 ` A = [ a a* ] = [ 2 1 ] [ 2 0 ] [ 1 0 ]`

A is not invertible! So we can't find A-1. And supposing A were invertible, and we could do that multiplication and then get the identity matrix. The projection of b into the column space of A would be equal to b. Multiplying by `I` would be exactly right!

So, this is the crazy formula that we have to compute:

 ` p = Ax = A (ATA)-1 AT b`

The first part is easy:

 `AT A = [ 2 1 ] [ 2 0 ] [ 1 0 ] [ 2 2 1 ] [ 9 2 ] [ 1 0 0 ] [ 2 1 ]`

I like to put the second matrix above the result. Then it's easy to remember the rows and columns.

Use the 2 x 2 trick to get the inverse (switch a and d, multiply b and c by -1, and divide everything by the determinant); then continuing with A (ATA)-1

 ` [ 1/5 -2/5 ] [ -2/5 9/5 ] [ 2 1 ] [ 0 1 ] [ 2 0 ] [ 2/5 -4/5 ] [ 1 0 ] [ 1/5 -2/5 ]`

And finally, multiplying by AT to get P

 ` [ 2 2 1 ] [ 1 0 0 ] [ 0 1 ] [ 1 0 0 ] [ 2/5 -4/5 ] [ 0 4/5 2/5 ] [ 1/5 -2/5 ] [ 0 2/5 1/5 ]`

Confirm this is the same answer we got from numpy (here).

Notice two key properties of P:

 `PT = P2 = P`

 ` [ 1 0 0 ] [ 0 4/5 2/5 ] [ 0 2/5 1/5 ] [ 1 0 0 ] [ 1 0 0 ] [ 0 4/5 2/5 ] [ 0 4/5 2/5 ] [ 0 2/5 1/5 ] [ 0 2/5 1/5 ]`

Now, multiply P times b (3,4,4) to get p:

 ` [ 3 ] [ 4 ] [ 4 ] [ 1 0 0 ] [ 3 ] [ 0 4/5 2/5 ] [ 24/5 ] [ 0 2/5 1/5 ] [ 12/5 ]`

And:

 ` e = b - p = (3,4,4) - (3,24/5,12/5) = (0,-4/5,8/5)`

Confirm:

 ` e • a = (0,-4/5,8/5) • (2,2,1) = 0 e • a* = (0,-4/5,8/5) • (1,0,0) = 0`

Another way of handling this does the steps in a different order:

 ` p = Ax = A (ATA)-1 AT b`

We first get (ATA)-1 as before:

 `AT A = [ 9 2 ] [ 2 1 ](ATA)-1 = 1/5 [ 1 -2 ] [ -2 9 ]`

Then we do:

 `AT b = [ 3 ] [ 4 ] [ 4 ] [ 2 2 1 ] [ 18 ] [ 1 0 0 ] [ 3 ]`

So

 ` x = (ATA)-1 AT b [ 18 ] [ 3 ] 1/5 [ 1 -2 ] [ 12/5 ] [ -2 9 ] [ -9/5 ] p = Ax [ 12/5 ] [ -9/5 ] [ 2 1 ] [ 3 ] [ 2 0 ] [ 24/5 ] [ 1 0 ] [ 12/5 ] e = b - p = (3,4,4) - (3,24/5,12/5) = (0,-4/5,8/5)`

It's reassuring to get the same result.

Check out lecture 15 on MIT's ocw (here). It's a work of genius.