Introduction

If you work with next-generation sequencing (NGS) data, you know that quality analysis and processing are two of the great time-sinks in getting results.

In this chapter, we will delve deeper into NGS analysis by using a dataset that includes information about relatives; in our case, a mother, a father, and around 20 offspring. This is a common technique for performing quality analysis, because pedigree information lets us estimate the number of errors that our filtering rules might produce. We will work with HDF5 representations of VCF files. We also introduce a bit more NumPy and pandas in this chapter.
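To make the pedigree idea concrete, here is a minimal sketch of a Mendelian-consistency check at a single site. It is an illustration only, not the code used later in the chapter: the function names and the toy genotypes are hypothetical, and it assumes diploid genotypes encoded as pairs of allele codes with no missing calls. Real data would also need to handle missing genotypes and de novo mutations.

```python
import numpy as np


def is_mendelian_consistent(mother, father, child):
    """Return True if the child genotype can be inherited from the parents.

    Each genotype is a pair of allele codes, e.g. (0, 1) for a heterozygote.
    (Hypothetical helper, assuming diploid calls with no missing data.)
    """
    a, b = child
    # One allele must come from each parent, in either order.
    return (a in mother and b in father) or (b in mother and a in father)


def mendelian_error_rate(mother, father, offspring):
    """Fraction of offspring genotypes that violate Mendelian inheritance."""
    offspring = np.asarray(offspring)
    errors = np.array(
        [not is_mendelian_consistent(mother, father, tuple(g)) for g in offspring]
    )
    return float(errors.mean())


# Toy example (made-up genotypes): the mother is homozygous reference and
# the father is heterozygous, so a (1, 1) child cannot be explained.
mother = (0, 0)
father = (0, 1)
offspring = [(0, 0), (0, 1), (1, 1), (0, 1)]
rate = mendelian_error_rate(mother, father, offspring)
```

In this toy trio, one of the four offspring genotypes is inconsistent with the parents, so `rate` is 0.25. Comparing such a rate before and after applying a filter gives a rough indication of how many erroneous calls the filter removes or lets through.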

