Beginning Perl for Bioinformatics by James Tisdall

Separating Sequence and Annotation

In previous chapters you saw how to examine the lines of a file using Perl's array operations. Usually, you do this by saving the data in an array with each line of the file appearing as an element of the array.

Let's look at two methods to extract the annotation and the DNA from a GenBank file. In the first method, you'll slurp the file into an array and look through the lines, as in previous programs. In the second, you'll put the whole GenBank record into a scalar variable and use regular expressions to parse the information. Is one approach better than the other? Not necessarily: it depends on the data. There are advantages and disadvantages to each, but both get the job done.
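To make the contrast concrete, here is a minimal sketch of both approaches applied to a tiny made-up record. It is not the book's code: the toy record and the subroutine names `split_by_lines` and `split_by_regex` are assumptions chosen for illustration, and the sketch treats everything before the ORIGIN line as annotation and everything between ORIGIN and the closing `//` as sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A tiny made-up record in GenBank flat-file layout (illustrative data only).
my $record = <<'END_RECORD';
LOCUS       TOY0001        24 bp    DNA
DEFINITION  Made-up record for illustration.
ORIGIN
        1 agatggcggc gctgaggggt cttg
//
END_RECORD

# Method 1 (hypothetical helper): walk an array of lines, switching from
# annotation to sequence once the ORIGIN line has been seen.
sub split_by_lines {
    my @lines = @_;
    my @annotation;
    my $sequence    = '';
    my $in_sequence = 0;

    foreach my $line (@lines) {
        if ($line =~ m{^//}) {            # end-of-record marker
            last;
        } elsif ($in_sequence) {          # already past ORIGIN
            $sequence .= $line;
        } elsif ($line =~ /^ORIGIN/) {    # sequence starts on the next line
            $in_sequence = 1;
        } else {
            push @annotation, $line;
        }
    }
    $sequence =~ s/[\s0-9]//g;            # strip whitespace and position numbers
    return (join('', @annotation), $sequence);
}

# Method 2 (hypothetical helper): hold the whole record in one scalar and
# capture both parts with a single regular expression.
sub split_by_regex {
    my ($text) = @_;
    my ($annotation, $sequence) =
        $text =~ m{\A(.*?)^ORIGIN[^\n]*\n(.*?)^//}ms;
    return unless defined $sequence;      # record did not have the expected layout
    $sequence =~ s/[\s0-9]//g;
    return ($annotation, $sequence);
}

# split /^/ breaks the scalar into lines and keeps each trailing newline.
my @lines = split /^/, $record;
my ($ann1, $seq1) = split_by_lines(@lines);
my ($ann2, $seq2) = split_by_regex($record);

print "Array method: $seq1\n";
print "Regex method: $seq2\n";
```

On this toy record the two helpers return the same annotation and the same sequence. The line-by-line version keeps a small amount of state (`$in_sequence`), while the regular-expression version puts the whole structure of the record into one pattern; which reads better depends on how regular the data is.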

I've put five GenBank records in a file called library.gb. As before, you can download the file from this book's web site. In the next few examples you'll use library.gb along with record.gb, a file that contains just one GenBank record.

Using Arrays

Example 10-1 shows the first method, which operates on an array containing the lines of the GenBank record. The main program is followed by a subroutine that does the real work.

Example 10-1. Extract annotation and sequence from GenBank file

#!/usr/bin/perl
# Extract annotation and sequence from GenBank file

use strict;
use warnings;
use BeginPerlBioinfo;     # see Chapter 6 about this module

# declare and initialize variables
my @annotation = ( );
my $sequence = '';
my $filename = 'record.gb';

parse1(\@annotation, \$sequence, $filename);

...
