Friday, December 4, 2009

PyCogent: baby steps

Rob Knight (at the University of Colorado) has been on my radar for a while now because of his involvement in a number of metagenomics projects, including work with Jeff Gordon. Besides a solid statistics and phylogenetics background, he brings an enthusiasm for Python to the mix, which I find encouraging, since I can at least hope to figure out what he's up to. I have already used UniFrac for a bacterial census project I'm involved in, and I mean to post about that sometime later, once we have actually written the paper!

Rob is also a developer of PyCogent (PMID 17708774), billed as a "toolkit for making sense from sequence." I read the paper a year ago, but didn't get into it at the time because it seemed like a pretty complex undertaking.

Now my interest has become more pressing because of a new paper coming out from Rob on PyNAST (PMID 19914921), which integrates NAST with the PyCogent framework. I want to learn how to use template alignments.

I followed the instructions for "Quick installation" of PyCogent here. I ran into a problem with MySQL that I'm still working on. There were also other apparent errors in the pip-log file, which led me to go back and re-install each component one at a time. This seemed to work (except for MySQL, which needs mysql_config), but when I looked for the tests, they were not there. It seems that they don't end up in the site-packages directories. Since I wasn't confident that I'd actually installed PyCogent correctly, I used subversion to grab the source and do it again:

svn co PyCogent

and followed the instructions in the README. That seems to have worked, since this:

sh run_tests

gave:

Can't find blastall executable: skipping test
... [snip] ...
--- [snip] ---
Ran 3403 tests in 185.362s


Obviously, I will have to let PyCogent know where external executables such as blastall live, but otherwise that looks reassuring. However, the data file for the very first tutorial is not present in what I have on my machine. It seems there are a few rough spots to work through. I'll keep you posted.
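
In the meantime, here is a minimal sketch of how one might check which external programs are visible before re-running the tests. It only looks at the PATH environment variable, using the standard library's shutil.which (Python 3.3 or later). Whether PyCogent itself finds its executables this way is an assumption on my part, and the helper names find_executables and missing_executables are made up for illustration.

```python
import shutil


def find_executables(names):
    """Map each program name to its full path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def missing_executables(names):
    """Return the program names that cannot be found on PATH, in input order."""
    found = find_executables(names)
    return [name for name in names if found[name] is None]


if __name__ == "__main__":
    # "blastall" is the program named in the skipped-test message above;
    # add other program names to the list as the test output reports them.
    for name, path in find_executables(["blastall"]).items():
        print(name, "->", path or "not found on PATH")
```

If a program shows up as not found, adding its directory to PATH in the shell that runs the tests would be the first thing I'd try.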