How to do it...

Let's take a look at the following steps:

  1. Let's start by retrieving the annotation information for our gene:
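import gzip
# Note: Bio.Alphabet was removed in Biopython 1.78; drop it from this import on newer versions
from Bio import Alphabet, Seq, SeqIO
# db is the annotation database object created earlier (not shown in this excerpt)
gene_id = 'AGAP004707'
gene = db[gene_id]
print(gene)
print(gene.seqid, gene.strand)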

The gene_id was retrieved from VectorBase, an online database for the genomics of disease vectors. For other genes, you will need to look up the relevant ID, which depends on the species and the database. The output will be as follows:

2L VectorBase gene 2358158 2431617 . + . ID=AGAP004707;biotype=protein_coding
2L +

Note that the gene is on the 2L chromosome arm and coded in the positive direction (+ strand).
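To make the fields of the printed record explicit, here is a minimal sketch that splits a GFF3-style feature line into its nine columns using only the standard library. It assumes the columns are tab-separated, as they are in a GFF file (the printed output above shows them separated by plain whitespace). The function name parse_gff_line and the dictionary keys are illustrative choices and are not part of gffutils or Biopython.

```python
def parse_gff_line(line):
    """Split one tab-separated GFF3 feature line into a dict of named fields."""
    columns = line.rstrip('\n').split('\t')
    if len(columns) != 9:
        raise ValueError('expected 9 tab-separated columns, got %d' % len(columns))
    seqid, source, featuretype, start, end, score, strand, phase, attrs = columns

    # The ninth column holds semicolon-separated key=value pairs
    attributes = {}
    for pair in attrs.split(';'):
        if pair:
            key, _, value = pair.partition('=')
            attributes[key] = value

    return {
        'seqid': seqid,              # chromosome arm, for example '2L'
        'source': source,
        'featuretype': featuretype,
        'start': int(start),         # GFF coordinates are 1-based and inclusive
        'end': int(end),
        'strand': strand,            # '+' or '-'
        'attributes': attributes,
    }


# Hypothetical input: the record shown above, rewritten with tab separators
line = '2L\tVectorBase\tgene\t2358158\t2431617\t.\t+\t.\tID=AGAP004707;biotype=protein_coding'
record = parse_gff_line(line)
print(record['seqid'], record['strand'])       # 2L +
print(record['end'] - record['start'] + 1)     # gene span in base pairs: 73460
```

In the recipe itself, the gene object returned by db[gene_id] already exposes these values as attributes such as gene.seqid and gene.strand, so this parsing is only meant to show where they come from.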

  2. Let's hold the sequence for the 2L chromosome arm in memory (it's ...
