Bioinformatics with Python Cookbook

Book Description

Learn how to use modern Python bioinformatics libraries and applications to do cutting-edge research in computational biology

In Detail

If you are either a computational biologist or a Python programmer, you will probably relate to the expression "explosive growth, exciting times". Python is arguably the main programming language for big data, and the deluge of data in biology, mostly from genomics and proteomics, makes bioinformatics one of the most exciting fields in data science.

Using the hands-on recipes in this book, you'll be able to do practical research and analysis in computational biology with Python. We cover modern, next-generation sequencing libraries and explore real-world examples of how to handle real data. The main focus of the book is the practical application of bioinformatics, but we also cover modern programming techniques and frameworks to deal with the ever-increasing deluge of bioinformatics data.

What You Will Learn

  • Gain a deep understanding of Python's fundamental bioinformatics libraries and be exposed to the most important data science tools in Python

  • Process genome-wide data with Biopython

  • Analyze and perform quality control on next-generation sequencing datasets using libraries such as PyVCF or PySAM

  • Use DendroPy and Biopython for phylogenetic analysis

  • Perform population genetics analysis on large datasets

  • Simulate complex demographies and genomic features with simuPOP

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

    1. Bioinformatics with Python Cookbook
      1. Table of Contents
      2. Bioinformatics with Python Cookbook
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why Subscribe?
          2. Free Access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Sections
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        5. Conventions
        6. Reader feedback
        7. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book
          3. Errata
          4. Piracy
          5. Questions
      8. 1. Python and the Surrounding Software Ecology
        1. Introduction
        2. Installing the required software with Anaconda
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        3. Installing the required software with Docker
          1. Getting ready
          2. How to do it…
          3. See also
        4. Interfacing with R via rpy2
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        5. Performing R magic with IPython
          1. Getting ready
          2. How to do it…
          3. See also
      9. 2. Next-generation Sequencing
        1. Introduction
        2. Accessing GenBank and moving around NCBI databases
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        3. Performing basic sequence analysis
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        4. Working with modern sequence formats
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        5. Working with alignment data
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        6. Analyzing data in the variant call format
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
        7. Studying genome accessibility and filtering SNP data
          1. Getting ready
          2. How to do it…
          3. There's more…
          4. See also
      10. 3. Working with Genomes
        1. Introduction
        2. Working with high-quality reference genomes
          1. Getting ready
          2. How to do it...
          3. There's more...
          4. See also
        3. Dealing with low-quality genome references
          1. Getting ready
          2. How to do it...
          3. There's more...
          4. See also
        4. Traversing genome annotations
          1. Getting ready
          2. How to do it...
          3. There's more...
          4. See also
        5. Extracting genes from a reference using annotations
          1. Getting ready
          2. How to do it…
          3. There's more...
          4. See also
        6. Finding orthologues with the Ensembl REST API
          1. Getting ready
          2. How to do it...
          3. There's more...
        7. Retrieving gene ontology information from Ensembl
          1. Getting ready
          2. How to do it...
          3. There's more...
          4. See also
      11. 4. Population Genetics
        1. Introduction
        2. Managing datasets with PLINK
          1. Getting ready
          2. How to do it…
          3. There's more...
          4. See also
        3. Introducing the Genepop format
          1. Getting ready
          2. How to do it…
          3. See also
        4. Exploring a dataset with Bio.PopGen
          1. Getting ready
          2. How to do it…
          3. There's more...
          4. See also
        5. Computing F-statistics
          1. Getting ready
          2. How to do it…
          3. See also
        6. Performing Principal Components Analysis
          1. Getting ready
          2. How to do it…
          3. There's more...
          4. See also
        7. Investigating population structure with Admixture
          1. Getting ready
          2. How to do it…
          3. There's more...
      12. 5. Population Genetics Simulation
        1. Introduction
        2. Introducing forward-time simulations
          1. Getting ready
          2. How to do it...
          3. There's more...
        3. Simulating selection
          1. Getting ready
          2. How to do it…
          3. There's more...
        4. Simulating population structure using island and stepping-stone models
          1. Getting ready
          2. How to do it…
        5. Modeling complex demographic scenarios
          1. Getting ready
          2. How to do it…
        6. Simulating the coalescent with Biopython and fastsimcoal
          1. Getting ready
          2. How to do it…
          3. There's more...
          4. See also
      13. 6. Phylogenetics
        1. Introduction
        2. Preparing the Ebola dataset
          1. Getting ready
          2. How to do it...
          3. There's more...
          4. See also
        3. Aligning genetic and genomic data
          1. Getting ready
          2. How to do it...
        4. Comparing sequences
          1. Getting ready
          2. How to do it...
          3. There's more…
        5. Reconstructing phylogenetic trees
          1. Getting ready
          2. How to do it...
          3. There's more...
        6. Playing recursively with trees
          1. Getting ready
          2. How to do it...
          3. There's more...
        7. Visualizing phylogenetic data
          1. Getting ready
          2. How to do it...
          3. There's more...
      14. 7. Using the Protein Data Bank
        1. Introduction
        2. Finding a protein in multiple databases
          1. Getting ready
          2. How to do it...
          3. There's more...
        3. Introducing Bio.PDB
          1. Getting ready
          2. How to do it...
          3. There's more...
        4. Extracting more information from a PDB file
          1. Getting ready
          2. How to do it...
        5. Computing molecular distances on a PDB file
          1. Getting ready
          2. How to do it...
        6. Performing geometric operations
          1. Getting ready
          2. How to do it...
          3. There's more...
        7. Implementing a basic PDB parser
          1. Getting ready
          2. How to do it...
          3. There's more...
        8. Animating with PyMol
          1. Getting ready
          2. How to do it...
          3. There's more...
        9. Parsing mmCIF files using Biopython
          1. Getting ready
          2. How to do it...
          3. There's more...
      15. 8. Other Topics in Bioinformatics
        1. Introduction
        2. Accessing the Global Biodiversity Information Facility
          1. How to do it...
          2. There's more...
        3. Geo-referencing GBIF datasets
          1. Getting ready
          2. How to do it...
          3. There's more...
        4. Accessing molecular-interaction databases with PSICQUIC
          1. How to do it...
        5. Plotting protein interactions with Cytoscape the hard way
          1. Getting ready
          2. How to do it...
          3. There's more...
      16. 9. Python for Big Genomics Datasets
        1. Introduction
        2. Setting the stage for high-performance computing
          1. Getting ready
          2. How to do it...
        3. Designing a poor human concurrent executor
          1. Getting ready
          2. How to do it...
          3. There's more...
        4. Performing parallel computing with IPython
          1. Getting ready
          2. How to do it...
          3. There's more...
        5. Computing the median in a large dataset
          1. Getting ready
          2. How to do it...
          3. There's more...
        6. Optimizing code with Cython and Numba
          1. Getting ready
          2. How to do it...
          3. There's more...
        7. Programming with laziness
          1. Getting ready
          2. How to do it...
          3. There's more...
        8. Thinking with generators
          1. Getting ready
          2. How to do it...
          3. See also
      17. Index