## Thursday, July 23, 2009

### Phylogenetic Trees: making the plot

We're on the last leg of the journey now. Having calculated all the relevant values we are ready to make the plot. I used R to do this so you can follow along if you want.

I find it is easier for me to write R code from Python to be executed later than it is to write R code as text that needs to sort or manipulate values to do various things. And I really doubt that you care exactly how I achieved this. (Let me know if you want a copy of the Python script). Here is a small taste of it.

Some code I write as text and load into Python to combine with the other parts:

 `code = open('src.Rcode.txt').read()code = code.strip().split('\n\n')FH.write(code[0] + '\n')`

In this part, we go through each eNode, recover its x-position and the x-position of its parent node, and write the instruction that will draw a line connecting it those two points.

 `# coordinates for the lines and textfor k in eL: # horizontal p = rT[k] # parent x1 = str(round(xD[k],5)) x0 = str(round(xD[p],5)) # parent's x y = str(round(yD[k],2)) FH.write('lines(c(' + x0 + ',') FH.write(x1 + '),c(') FH.write(y + ',' + y + '))\n')`

The ta-ta-ta-taa (Melvin video at 2:51) figure is here:

You can compare it with the original plot by the APE package:

And here is the listing for the R code:

 `ex=c(0.08401,0.08853,0.0595,0.08516,0.0871,0.09976)ey=c(1.0,2.0,3.0,4.0,5.0,6.0)ix=c(0.00827,0.07219,0.03863,0.0)iy=c(1.5,4.5,5.25,3.0)par(cex=1.5,yaxt='n')X = max(ex) + 0.15Y = max(c(max(ey),max(iy))) + 2plot(ex,ey,xlim=c(0,X), ylim=c(0,Y),type="n", xlab='',ylab='')lines(c(0.00827,0.08401),c(1.0,1.0))text(0.08401,1.0,label="Stenotrophomonas maltophilia",pos=4,font=3)lines(c(0.00827,0.08853),c(2.0,2.0))text(0.08853,2.0,label="Kingella oralis",pos=4,font=3)lines(c(0.0,0.0595),c(3.0,3.0))text(0.0595,3.0,label="Pseudomonas aeruginosa",pos=4,font=3)lines(c(0.07219,0.08516),c(4.0,4.0))text(0.08516,4.0,label="Salmonella typhi",pos=4,font=3)lines(c(0.07219,0.0871),c(5.0,5.0))text(0.0871,5.0,label="Escherichia coli",pos=4,font=3)lines(c(0.03863,0.09976),c(6.0,6.0))text(0.09976,6.0,label="Haemophilus parainfluenzae",pos=4,font=3)lines(c(0.0,0.00827),c(1.5,1.5))lines(c(0.03863,0.07219),c(4.5,4.5))lines(c(0.0,0.03863),c(5.25,5.25))lines(c(0.00827,0.00827),c(1.0,2.0))lines(c(0.07219,0.07219),c(4.0,5.0))lines(c(0.03863,0.03863),c(4.5,6.0))lines(c(0.0,0.0),c(1.5,5.25))points(ex,ey,pch=16,col="red")points(ix,iy,pch=16,col="blue")`