This is the continuation of a project using the Gene Ontology (first post here). For this part, you'll need to get the annotations associated with the yeast genome, or at least that's what I used (here). In the project files (link below) you'll find a short script that loads the data from this file. It expects to find the file in the db folder.

Another short script, useGO.py, just exercises things a bit. We load the GO data and the yeast annotations. Given a target list (in this case ['pheromone']), we look for all the yeast genes containing that word in the description field (at index 9 of the original yeast db file). We recover the GO ids for these genes and print all the applicable GO terms, obtained using the recursive code from the first post.

Sample output shows a single one of the genes found:
I think you can see what GO is supposed to be about. We gradually progress to more and more general categories as we work our way up the tree.
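To make the lookup concrete, here is a minimal sketch of the same idea on toy data. It is not the code from the project files. The function names (find_genes, ancestors), the GO ids and the two annotation rows are all made up for illustration, and the column positions for the gene symbol (index 2) and GO id (index 4) are my assumption about the file layout. Only the description at index 9 comes from the text above.

```python
# Toy annotation rows, tab-separated. Hypothetical data: assumed layout is
# index 2 = gene symbol, index 4 = GO id, index 9 = description.
ANNOTATIONS = (
    "!comment line, skipped\n"
    "SGD\tS01\tGENE1\t\tGO:C\tref\tIDA\t\tP\tpheromone response protein\n"
    "SGD\tS02\tGENE2\t\tGO:B\tref\tIDA\t\tP\tunrelated protein\n"
)

# Toy ontology: each GO id maps to its direct parents (hypothetical ids).
PARENTS = {"GO:C": ["GO:B"], "GO:B": ["GO:A"], "GO:A": []}


def find_genes(text, targets):
    """Map gene symbol -> set of GO ids for rows whose description
    (index 9) contains any of the target words."""
    hits = {}
    for line in text.splitlines():
        if not line or line.startswith("!"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # row too short to have a description
        description = fields[9].lower()
        if any(t.lower() in description for t in targets):
            hits.setdefault(fields[2], set()).add(fields[4])
    return hits


def ancestors(go_id, parents, seen=None):
    """Recursively collect every term above go_id, nearest first."""
    if seen is None:
        seen = []
    for parent in parents.get(go_id, []):
        if parent not in seen:
            seen.append(parent)
            ancestors(parent, parents, seen)
    return seen


genes = find_genes(ANNOTATIONS, ["pheromone"])
for symbol, go_ids in sorted(genes.items()):
    for go_id in sorted(go_ids):
        print(symbol, go_id, ancestors(go_id, PARENTS))
```

The recursion in ancestors is what produces the walk toward more general categories: each call adds the direct parents and then climbs from each of them in turn.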
What's not obvious in the approach I used so far is that these chains of terms end with one of three different major categories. These are: biological_process, molecular_function, and cellular_component.
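One way to surface this is to follow a term's parents until a root is reached and report which root it was. The sketch below assumes a parent table like the one built from the GO data. The function name root_categories and the toy ids GO:X1 and GO:X2 are hypothetical. The three root ids are the real GO root terms.

```python
# The three GO root terms.
ROOTS = {
    "GO:0008150": "biological_process",
    "GO:0003674": "molecular_function",
    "GO:0005575": "cellular_component",
}

# Hypothetical toy parent table: GO:X2 -> GO:X1 -> biological_process.
TOY_PARENTS = {
    "GO:X2": ["GO:X1"],
    "GO:X1": ["GO:0008150"],
    "GO:0008150": [],
}


def root_categories(go_id, parents):
    """Return the names of the root categories reachable from go_id."""
    if go_id in ROOTS:
        return {ROOTS[go_id]}
    found = set()
    for parent in parents.get(go_id, []):
        found |= root_categories(parent, parents)
    return found


print(root_categories("GO:X2", TOY_PARENTS))
```

The function returns a set rather than a single name so that it still behaves sensibly if a parent table happens to connect a term to more than one root.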
The other thing is that I've obscured the branching, but I have a modification to the code that gives this information. And I have a graph that plots it. More in a later post. Zipped project files here.