Friday, November 19, 2010

RPy and ape working together

I couldn't resist doing a little more with RPy and the R ape package. I used our standard tree file. The code listing is below. I am quite pleased with the results. With this in hand, it will be nice to look more at the ape book, which I already have.

In the first section we do our imports, and get the R object 'T' (True), because I couldn't get True or 'True' to work as a direct argument to the plot function. We also read the tree file. Next, I demonstrate the form of the ape function drop_tip, which is very useful, but I don't show results or a plot here.

In the third part, I define a function to return a list of colors based on the names of the external nodes, and then later convert that to an appropriate R object. This is what had me so frustrated when I first tried ape.

In the last part, we do the plot to a pdf file. The x_lim argument is important, that's how we stretch out the plot along the x-axis to make enough room for the names. Two separate calls after plot show the internal node labels, and the x-axis values.

I agree this is really powerful. The best part is that for someone like me, when you have trouble doing something in R, you can drop into Python without a problem. It's easy when you know how!

One more thing, Laurent Gautier is big in RPy, and one of his interests is Bioconductor. That project is the reason I got into R in the first place. It'll be fun to explore that in the future.

from rpy2 import robjects
from rpy2.robjects.packages import importr
grdevices = importr('grDevices')
T = robjects.r['T']

ape = importr('ape')
fn = '/Users/telliott_admin/Desktop/tree.txt'
tr = ape.read_tree(fn)
print tr
# the ape function drop_tip
tip_list = robjects.IntVector([1,6])
#tr = ape.drop_tip(tr,tip_list)
# define some colors:
# this works but isn't very flexible
# col_list = robjects.r("rep(c('red','blue'),3)")

def get_color(n):
D = { 'Steno':'maroon','Kingella':'maroon',
'coli':'magenta','typhi':'magenta' }
for k in D:
if k in n: return D[k]
return 'darkgreen'

tip_labels = tr[2]
color_list = [get_color(n) for n in tip_labels]
color_list = robjects.StrVector(color_list)
ofn = '/Users/telliott_admin/Desktop/plot.pdf'