How to do it…

After downloading the data, follow these steps:

  1. First, start with a few imports:
import pickle
import gzip
import random
import numpy as np
import h5py
import pandas as pd
  2. Let's get the sample metadata:
samples = pd.read_csv('samples.tsv', sep='\t')
print(len(samples))
print(samples['cross'].unique())
print(samples[samples['cross'] == 'cross-29-2'][['id', 'function']])
print(len(samples[samples['cross'] == 'cross-29-2']))
print(samples[samples['function'] == 'parent'])

We also print some basic information about the cross we are going to use, and all the parents.
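The filtering used in this step can be tried without the real file. The following is only a minimal sketch: the four-row table is made up for illustration and simply reuses the `id`, `cross`, and `function` column names from the code above. The actual `samples.tsv` has more rows and may have other columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for samples.tsv; these rows are invented for illustration.
tsv_text = (
    "id\tcross\tfunction\n"
    "S1\tcross-29-2\tparent\n"
    "S2\tcross-29-2\tparent\n"
    "S3\tcross-29-2\tprogeny\n"
    "S4\tcross-36-9\tparent\n"
)
samples = pd.read_csv(io.StringIO(tsv_text), sep='\t')

# A boolean mask selects the rows belonging to one cross.
in_cross = samples['cross'] == 'cross-29-2'
cross_samples = samples[in_cross]

# The same idea selects every parent, regardless of cross.
parents = samples[samples['function'] == 'parent']

# Masks combine with & (note the parentheses around each comparison).
cross_parents = samples[in_cross & (samples['function'] == 'parent')]

print(len(cross_samples))
print(parents[['id', 'cross']])
print(cross_parents['id'].tolist())
```

Storing the mask in a variable such as `in_cross` avoids repeating the comparison each time the cross is selected.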

  3. We prepare to deal with chromosome arm 3L based on its HDF5 file:
h5_3L = h5py.File('ag1000g.crosses.phase1.ar3sites.3L.h5', 'r')
samples_hdf5 = list(map(lambda sample: sample.decode('utf-8'), ...
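The `decode('utf-8')` call is there because string datasets read from an HDF5 file typically arrive as bytes rather than Python `str`. The sketch below illustrates only that decoding step, using a plain NumPy array of made-up sample names in place of the real dataset. The names and the position lookup at the end are assumptions for illustration, not part of the recipe.

```python
import numpy as np

# Made-up sample names stored as bytes, mimicking a fixed-length HDF5 string dataset.
raw_ids = np.array([b'sample-A', b'sample-B', b'sample-C'])

# Decode each bytes value into a regular Python string.
decoded_ids = [raw.decode('utf-8') for raw in raw_ids]
print(decoded_ids)

# Illustrative follow-up (an assumption): map each name to its position in the file,
# so that named samples can later be matched to columns of per-sample arrays.
position_of = {name: index for index, name in enumerate(decoded_ids)}
wanted = ['sample-C', 'sample-A']
positions = [position_of[name] for name in wanted]
print(positions)
```

Once the names are ordinary strings, they can be compared directly with the `id` values read from the metadata table.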
