Saturday, November 27, 2010

Phylogenetic tree surgery 1

One of the biggest problems in phylogenetic analysis is the very large number of possible trees. I don't think I've posted about that yet, but you probably appreciate that for N species the number of trees is approximately 10N. (Speaking very roughly). For 25 species we're talking more than a mole of trees. That's a lot!

One consequence is that we can never evaluate all the possible trees for any problem of significant size. Furthermore, if we have a tree that we like (it's the one with the maximum likelihood that we've seen so far), it becomes important to look at its near neighbors in "tree space" and see if they are any better. Felsenstein has a great discussion of this in Chapter 4.

I want to study a bit about the basic methods of tree surgery. My hunch is that representing trees in the way that was introduced previously (here), as lists of connections to each internal node, will make this pretty easy. My goal in this post is simply to present one example of each of the methods that I know about, and show the reorganization of the list that results. The examples are shown in the figures before, which were constructed in Keynote. We'll explore these in future posts, as time permits. The plan is to develop rules to emulate the types of list reorganization observed here, and see if the structure of the resulting trees can be explained by that type of surgery.

In each example, the list entries that change are highlighted.

In NNI we pick a branch, and rearrange the quartet of clades to form a new tree. There are two possible rearrangements. Here is one of them for the 1-3 branch.

In SPR we imagine cutting in the middle of one branch, and then connecting one of the pieces to the middle of a second branch in the other piece. I had to relabel the internal nodes for this one, I did it in a way that makes the changes to the list easiest.

In TBR we cut a branch and dissolve the stubs, then pick one branch in each of the smaller pieces, and connect there.