Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Foreword
Reading a book such as this brings home how much BLAST—now in its teenage
years—has grown, and provides an occasion for fond reflection. BLAST was born in the
first months of 1989 at the National Center for Biotechnology Information (NCBI). The
Center had been created at the National Institutes of Health in November 1988, by an
act of the U.S. Congress, to foster the development of a field that then had no widely
accepted name, but which has since come to be known as “Bioinformatics.” In early
1989, David Lipman, my post-doctoral advisor, who at the time was perhaps best
known as a codeveloper of the FASTA program, was appointed director of NCBI. On
the first of March we moved into new offices at the National Library of Medicine. The
NCBI was small, but had large ambitions, and already a number of friends. Several of
these well-wishers made it a point to drop by for a visit. Gene Myers, a computer scientist
then at Arizona, arrived during a week in which Science was hyping a special-purpose
computer chip for sequence comparison. He and David, software partisans both,
were unimpressed and over dinner resolved to do better. Their original idea was to find
not subtle sequence similarities, but fairly obvious ones, and to do it in a flash. Gene
pursued a rigorous approach at first, but David, with a fine Darwinian wisdom, was
willing to settle for imperfection. If one were to gamble, what kind of match could one
expect a strong alignment to contain? Detailed algorithmic and code development on
BLAST by Webb Miller—later to be joined by Warren Gish—had hardly begun before
Sam Karlin, a Stanford mathematician, came calling. I had approached him a few
months earlier with a conjecture concerning the asymptotic behavior of optimal
ungapped local sequence alignments. Since then, he had spun this conjecture into a
beautiful theory. Now, for the first time, rigorous statistics were available for alignment
scoring systems of more than academic interest, and the essential nature of amino acid
substitution matrices also began to come into clear focus. This theory dovetailed perfectly
with the work that had just started on BLAST: both informing the selection of its
algorithmic parameters, and yielding units for the alignment scores produced.
Although David chose BLAST’s name as a bit of a pun on “FASTA” (it was only later
that I realized “BLAST” to be an acronym), the new program was never intended to vie
with the earlier one. Rather, the idea was to turn the “threshold parameter” way up, to
find undoubted homologies before you take more than one sip of coffee. It surprised
