Saturday, December 5, 2009

PyCogent 6: NCBI

More on PyCogent here. The three examples below come from the cookbook:

• EFetch to obtain a nucleotide sequence from its ID
• EUtils to obtain a MEDLINE report from PubMed
• Using a parser to process a GenBank file from disk


import sys
from cogent.db.ncbi import EFetch
from cogent.db.ncbi import EUtils
from cogent.parse.genbank import parse

def test1():
    # fetch one nucleotide record as FASTA text
    e = EFetch(db='nucleotide',
               rettype='fasta',
               id='154102')
    result = e.read()
    L = result.split('\n')
    # show the first 40 characters of the header and of the sequence
    print L[0][:40]
    print L[1][:40]

test1()


>gi|154102|gb|J04243.1|STYHEMAPRF S.typh
GGATCCACTGCCGCAGGCTGTTTAACGGAATCGGCATCCC
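The FASTA text that comes back is just a header line followed by sequence lines, so it is easy to take apart by hand. Here is a minimal sketch in plain Python, with no PyCogent and no network access. The helper names split_fasta and parse_gi_header are my own (hypothetical, not part of PyCogent), the sample string is the truncated output shown above, and the field layout gi|number|db|accession|description is assumed from that one header.

```python
def split_fasta(text):
    """Split single-record FASTA text into (header, sequence)."""
    lines = [ln.strip() for ln in text.strip().split('\n') if ln.strip()]
    if not lines or not lines[0].startswith('>'):
        raise ValueError('not a FASTA record')
    header = lines[0][1:]            # drop the leading '>'
    sequence = ''.join(lines[1:])    # join the wrapped sequence lines
    return header, sequence

def parse_gi_header(header):
    """Split a 'gi|...|db|accession|description' header into a dict.

    Assumes the five-field layout seen in the output above.
    """
    fields = header.split('|', 4)
    if len(fields) != 5 or fields[0] != 'gi':
        raise ValueError('unexpected header layout')
    return {'gi': fields[1],
            'db': fields[2],
            'accession': fields[3],
            'description': fields[4].strip()}

# the two (truncated) lines printed by test1()
sample = ('>gi|154102|gb|J04243.1|STYHEMAPRF S.typh\n'
          'GGATCCACTGCCGCAGGCTGTTTAACGGAATCGGCATCCC\n')

header, seq = split_fasta(sample)
info = parse_gi_header(header)
```

With a full record, the same two calls should give the whole sequence as one string and the accession under info['accession'].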



def test2():
    e = EUtils(db='pubmed',
               rettype='medline')
    item = e['2544564']  # PMID
    result = item.read()
    # first line of the MEDLINE report
    print result.split('\n')[0]

test2()


PMID- 2544564
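A MEDLINE report is a series of lines of the form "TAG - value", with the tag padded to four characters and continuation lines indented by six spaces. A rough sketch of how one might turn such text into a dictionary follows. The function parse_medline is my own helper (hypothetical, not a PyCogent call), and apart from the PMID line the sample record is a hand-written placeholder, not the real report.

```python
def parse_medline(text):
    """Collect MEDLINE 'TAG - value' lines into {tag: [values]}."""
    record = {}
    tag = None
    for line in text.split('\n'):
        if not line.strip():
            continue
        if line[:4].strip() and line[4:6] == '- ':
            # a new field: 4-character tag, then '- ', then the value
            tag = line[:4].strip()
            record.setdefault(tag, []).append(line[6:].strip())
        elif tag is not None and line.startswith('      '):
            # continuation of the previous field's last value
            record[tag][-1] += ' ' + line.strip()
    return record

# placeholder record: only the PMID line is taken from the output above
sample = ('PMID- 2544564\n'
          'TI  - Placeholder title that wraps\n'
          '      onto a second line.\n'
          'AU  - Placeholder A\n'
          'AU  - Placeholder B\n')

rec = parse_medline(sample)
```

Repeated tags such as AU end up as a list, which is why every value is stored in a list even when there is only one.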


The third example reads a GenBank file from disk. It requires the file ls_orchid.gbk, a BioPython file available from here.

def test3():
    gb = parse(open('ls_orchid.gbk'))
    print gb.Name
    #print gb.Info
    L = gb.Info['features']
    # keep only the features annotated as genes
    L = [e for e in L if e['type'] == 'gene']
    print L[0]['gene'][0]

test3()


Z78533
5.8S rRNA

Uncomment the print gb.Info line to see everything that is available.
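The filtering step in test3 works on a list of dictionaries, each with a 'type' key and, for genes, a 'gene' key holding a list of names. A small sketch of the same idea as a reusable helper is below. The function gene_names is hypothetical, and the features list is a hand-made stand-in that only imitates the structure used in test3 (the '5.8S rRNA' name is taken from the output above).

```python
def gene_names(features):
    """Return all gene names from a list of feature dictionaries."""
    names = []
    for feat in features:
        if feat.get('type') == 'gene':
            # .get avoids a KeyError if a gene feature has no 'gene' entry
            names.extend(feat.get('gene', []))
    return names

# stand-in for gb.Info['features']; not read from the real file
features = [
    {'type': 'source'},
    {'type': 'gene', 'gene': ['5.8S rRNA']},
    {'type': 'rRNA'},
]

names = gene_names(features)
```

Using .get rather than indexing means features without the expected keys are skipped instead of raising an error.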