There's more...

This is a simple recipe that exercises several concepts that have been presented in this and the previous chapter, Chapter 2, Next Generation Sequencing. While it's conceptually trivial, it's unfortunately full of booby traps.

When using different databases, be sure that the genome assembly versions are synchronized. It would be a serious and potentially silent bug to use different versions. Remember that different versions (at least on the major version number) have different coordinates. For example, position 1,234 on chromosome 3 on build 36 of the human genome will probably refer to a different SNP than 1,234 on build 38. With human data, you will probably find a lot of chips on build 36, and plenty of whole genome sequences ...

Get Bioinformatics with Python Cookbook - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.