Chapter 9. Python for Big Genomics Datasets

In this chapter, we will cover the following recipes:

Setting the stage for high-performance computing
Designing a poor human concurrent executor
Performing parallel computing with IPython
Computing the median in a large dataset
Optimizing code with Cython and Numba
Programming with laziness
Thinking with generators

Introduction

In this final chapter, we will discuss high-performance computing techniques for large computational biology datasets. We will cover code parallelization, running software on clusters, code optimization, and advanced functional programming techniques. We will try to avoid tying any solution to a specific proprietary technology (for example, Amazon EC2) and design code that can be ...
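
To give a flavor of the lazy, generator-based style mentioned above, here is a minimal sketch using only the Python standard library. It is not taken from the book's recipes: the function names `read_fasta` and `gc_fraction` and the in-memory demo data are hypothetical, chosen purely for illustration. The point is that records are produced one at a time, so memory use stays bounded regardless of input size.

```python
import io
from itertools import islice


def read_fasta(handle):
    """Lazily yield (name, sequence) pairs from a FASTA-formatted text handle.

    Hypothetical helper for illustration; only one record is held in memory
    at a time.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


def gc_fraction(records):
    """Lazily map (name, sequence) pairs to (name, GC fraction) pairs."""
    for name, seq in records:
        seq = seq.upper()
        gc = seq.count("G") + seq.count("C")
        yield name, (gc / len(seq) if seq else 0.0)


# Made-up demo data standing in for a large file on disk.
demo = io.StringIO(">seq1\nGGCC\nAATT\n>seq2\nATAT\n>seq3\nGCGC\n")

# Nothing is read until islice pulls items; seq3 is never parsed.
first_two = list(islice(gc_fraction(read_fasta(demo)), 2))
print(first_two)  # [('seq1', 0.5), ('seq2', 0.0)]
```

Because both stages are generators, the same pipeline could in principle be pointed at an open file handle instead of the `io.StringIO` stand-in without loading the whole file.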

Bioinformatics with Python Cookbook by Tiago Antao
