Chapter 9. Python for Big Genomics Datasets

In this chapter, we will cover the following recipes:

  • Setting the stage for high-performance computing
  • Designing a poor human concurrent executor
  • Performing parallel computing with IPython
  • Computing the median in a large dataset
  • Optimizing code with Cython and Numba
  • Programming with laziness
  • Thinking with generators

Introduction

In this final chapter, we discuss high-performance computing techniques for large computational biology datasets: code parallelization, running software on clusters, code optimization, and advanced functional programming techniques. We will try to avoid tying any solution to a specific proprietary technology (for example, Amazon EC2) and design code that can be ...
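
As a rough illustration of the lazy, generator-based style named in the recipe list, the sketch below streams sequences and summarizes them one chunk at a time, so the full dataset never has to sit in memory. It is not taken from the chapter's recipes: the function names (`read_sequences`, `chunked`, `gc_fraction`), the toy input, and the simplifying assumption of one sequence per line are all hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Lazily yield sequences, skipping blank lines and '>' header lines.

    Assumption: each sequence fits on a single line (no multi-line records).
    """
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            yield line.upper()


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Lazily group items into lists of at most `size` elements."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def gc_fraction(chunk: List[str]) -> float:
    """Fraction of G and C bases across all sequences in one chunk."""
    total = sum(len(seq) for seq in chunk)
    gc = sum(seq.count("G") + seq.count("C") for seq in chunk)
    return gc / total if total else 0.0


# Hypothetical in-memory input; `toy_lines` could equally be an open file handle,
# because every stage only asks for the next line when it needs one.
toy_lines = [">read1", "ACGT", ">read2", "GGCC", ">read3", "AATT"]
per_chunk = [gc_fraction(c) for c in chunked(read_sequences(toy_lines), size=2)]
print(per_chunk)
```

Because each chunk is summarized independently, the same `gc_fraction` function could later be handed to a pool of workers without changing the reading code, which is one reason to keep the lazy reading and the per-chunk computation separate.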

