This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Preface
growing orbit. So if you’ve recently become interested in bioinformatics, understanding
BLAST is a great place to start. And if you’re already a bioinformatics student or
professional, this book can help you get more out of BLAST.
Structure of This Book
This book is divided into six parts: An Introduction to BLAST, Theory, Practice,
Industrial-Strength BLAST, Reference, and the Appendixes. The quick start guide in
Chapter 1 is the best place to begin if you’ve never run BLAST before. You won’t
need sophisticated hardware or software, just a web browser connected to the Internet.
In Part II, we begin by exploring the molecular biology, computer science, and
statistics that form the foundation of BLAST searches. We then describe the BLAST
algorithm in detail. You will find that a sound theoretical understanding is essential
when you put BLAST into practice. In Part III, we present practical advice to help
you design and interpret BLAST experiments intelligently and efficiently. Whether
you’re a complete novice or a seasoned pro, you’ll find the tutorials and protocols a
valuable resource. Part IV discusses using BLAST in a high-throughput setting where
the goal is to get as much BLAST as possible for your buck. Here, we integrate the
information usually found scattered among systems administrators, database administrators,
and advanced BLAST users into a few sensible chapters. Part V contains
reference chapters for NCBI-BLAST and WU-BLAST with detailed descriptions of
each parameter.
Here’s a chapter-by-chapter breakdown:
Part I, Introduction
Chapter 1, Hello BLAST, gives a quick introduction to BLAST by exploring Internet
search pages.
Part II, Theory
Chapter 2, Biological Sequences, gives some background molecular and evolutionary
biology to help you understand why biological sequences are similar to one another.
Chapter 3, Sequence Alignment, explains how global and local sequence alignment
works and describes common algorithms for aligning sequences of letters.
Chapter 4, Sequence Similarity, explains how scores are used to determine the best
alignment and discusses the statistical significance of sequence similarity in a database
search.
Part III, Practice
Chapter 5, BLAST, discusses BLAST itself. Understanding the theoretical framework
of the BLAST suite of programs will help you design and interpret BLAST experiments
and give you a foundation for troubleshooting when your search produces
unexpected results.
Chapter 6, Anatomy of a BLAST Report, explores the standard format of the BLAST
report.
Chapter 7, A BLAST Statistics Tutorial, shows how to calculate the numbers in a
BLAST report and use this knowledge to better understand the results of a BLAST
search.
Chapter 8, 20 Tips to Improve Your BLAST Searches, distills the previous seven
chapters and the authors’ own experience into advice designed to help you get the
most from your BLAST searches.
Chapter 9, BLAST Protocols, contains “recipes” for the most common BLAST
searches; it describes what to do and why to do it.
Part IV, Industrial-Strength BLAST
Chapter 10, Installation and Command-Line Tutorial, shows how to install NCBI-BLAST
and WU-BLAST software on your own computer. This is necessary if you
want to use BLAST in a high-throughput setting or develop specialized applications.
Chapter 11, BLAST Databases, shows how to create and maintain BLAST databases,
one of the most neglected yet important aspects of using BLAST.
Chapter 12, Hardware and Software Optimizations, explores how to optimize BLAST
searches for maximum throughput and will help you get the most out of your current
and future hardware and software.
Part V, BLAST Reference
Chapter 13, NCBI-BLAST Reference, describes the parameters and options for the
NCBI suite of BLAST programs.
Chapter 14, WU-BLAST Reference, describes the parameters and options for the
WU-BLAST program.
Part VI, Appendixes
Appendix A, NCBI Display Formats, gives a brief description of each NCBI-BLAST
sequence alignment display option, followed by a detailed explanation and example.
Appendix B, Nucleotide Scoring Schemes, shows the target frequencies and simple
gap costs for pairs of sequences of length 100, 500, and 1,000.
Appendix C, NCBI-BLAST Scoring Schemes, shows the default values for several
combinations of NCBI-BLAST matrices and gap costs.
Appendix D, blast-imager.pl, is a Perl script that creates a graphical summary of a
BLAST report using Thomas Boutell’s GD graphics library, which has been ported to
Perl by Lincoln Stein.
Appendix E, blast2table.pl, is a Perl script that converts standard WU-BLAST or
NCBI-BLAST output to the NCBI tabular format (-m 8) described in Appendix A.
There is also a Glossary of BLAST terms.
