In Joe Felsenstein's book, Inferring Phylogenies, he uses some data from Vincent Sarich to demonstrate UPGMA. Although I was tempted to "declare victory and go home", I decided to test my version of UPGMA with that data. And that turned up bugs in both the main upgma and the plotting code! So I guess the moral is: test, test, and test again.
The top figure is a scan from the book. Here is what my program plotted:
It looks pretty good to me.
The data are immunological data from eight species in this order: dog, bear, raccoon, weasel, seal, sea lion, cat, monkey. The data are in this distance matrix:
If you grab the zipped files (here) and run them, you'll see a lot of diagnostic output when debug == True, as well as all the branch lengths.
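For anyone who hasn't seen UPGMA spelled out, here is a minimal sketch of the clustering loop. It is not the code from the zipped files, and the four-taxon distances are made up for illustration (they are not Sarich's data). The function name upgma_sketch and the dict-of-frozensets layout are assumptions of this sketch. The idea: repeatedly join the two closest clusters, place the new node at half their distance, and average distances to everything else weighted by cluster size.

```python
from itertools import combinations


def upgma_sketch(names, dist):
    """Cluster leaves by UPGMA.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance between leaves a and b.
    Returns a list of merges (left, right, left_branch, right_branch).
    """
    # Each cluster label maps to (number of leaves, height of its node).
    clusters = {name: (1, 0.0) for name in names}
    d = dict(dist)  # working copy; gains entries for merged clusters
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (na, ha), (nb, hb) = clusters[a], clusters[b]
        # The new node sits at half the distance between the two clusters.
        height = d[frozenset((a, b))] / 2.0
        new = "(%s,%s)" % (a, b)
        # Branch length = new node height minus the child's own height.
        merges.append((a, b, height - ha, height - hb))
        del clusters[a], clusters[b]
        # Size-weighted average distance from the new cluster to the rest.
        for c in clusters:
            d[frozenset((new, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[new] = (na + nb, height)
    return merges


# Hypothetical toy distances (NOT the Sarich data).
toy_names = ["A", "B", "C", "D"]
toy_dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}

merges = upgma_sketch(toy_names, toy_dist)
for left, right, left_len, right_len in merges:
    print(left, right, left_len, right_len)
```

Each printed row is one join with the two branch lengths leading down from the new node, which is the kind of thing worth checking by hand against a worked example like the one in the book.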