There's more...

The purpose of this recipe is to get you up to speed with the PyVCF module. At this stage, you should be comfortable with the API. We will not spend too much time on usage details because this will be the main purpose of the next recipe: using the VCF module to study the quality of your variant calls.

It will probably not be a shocking revelation that PyVCF is not the fastest module on earth. The file format (highly text-based) makes processing a time-consuming task. There are two main strategies for dealing with this problem. One strategy is parallel processing, which we will discuss in the last chapter, Chapter 9, Python for Big Genomics Datasets. The second strategy is to convert to a more efficient format; we will provide ...

Get Bioinformatics with Python Cookbook - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.