Developing Bioinformatics Computer Skills

Book description

Bioinformatics, the application of computational and analytical methods to biological problems, is a rapidly evolving scientific discipline. Genome sequencing projects are producing vast amounts of biological data for many different organisms, and, increasingly, storing these data in public databases. Such biological databases are growing exponentially, along with the biological literature. It's impossible for even the most zealous researcher to stay on top of necessary information in the field without the aid of computer-based tools. Bioinformatics is all about building these tools.

Developing Bioinformatics Computer Skills is for scientists and students who are learning computational approaches to biology for the first time, as well as for experienced biology researchers who are just starting to use computers to handle their data. The book covers the Unix file system, building tools and databases for bioinformatics, computational approaches to biological problems, an introduction to Perl for bioinformatics, data mining, and data visualization. Written in a clear, engaging style, Developing Bioinformatics Computer Skills will help biologists develop a structured approach to biological data as well as the tools they'll need to analyze the data.

Table of Contents

  1. Copyright
  2. Preface
    1. Audience for This Book
    2. Structure of This Book
    3. Our Approach to Bioinformatics
    4. URLs Referenced in This Book
    5. Conventions Used in This Book
    6. Comments and Questions
    7. Acknowledgments
  3. I. Introduction
    1. 1. Biology in the Computer Age
      1. 1.1. How Is Computing Changing Biology?
        1. 1.1.1. The Eye of the Fly
        2. 1.1.2. Labels in Gene Sequences
        3. 1.1.3. Comparing eyeless and aniridia with BLAST
      2. 1.2. Isn't Bioinformatics Just About Building Databases?
        1. 1.2.1. The First Information Age in Biology
      3. 1.3. What Does Informatics Mean to Biologists?
      4. 1.4. What Challenges Does Biology Offer Computer Scientists?
      5. 1.5. What Skills Should a Bioinformatician Have?
      6. 1.6. Why Should Biologists Use Computers?
        1. 1.6.1. A New Approach to Data Collection
      7. 1.7. How Can I Configure a PC to Do Bioinformatics Research?
        1. 1.7.1. Why Use Unix or Linux?
      8. 1.8. What Information and Software Are Available?
        1. 1.8.1. Why Do I Need to Install a Program from the Web?
      9. 1.9. Can I Learn a Programming Language Without Classes?
      10. 1.10. How Can I Use Web Information?
      11. 1.11. How Do I Understand Sequence Alignment Data?
      12. 1.12. How Do I Write a Program to Align Two Biological Sequences?
      13. 1.13. How Do I Predict Protein Structure from Sequence?
      14. 1.14. What Questions Can Bioinformatics Answer?
    2. 2. Computational Approaches to Biological Questions
      1. 2.1. Molecular Biology's Central Dogma
        1. 2.1.1. Replication of DNA
        2. 2.1.2. Genomes and Genes
        3. 2.1.3. Transcription of DNA
        4. 2.1.4. Translation of mRNA
        5. 2.1.5. Molecular Evolution
      2. 2.2. What Biologists Model
        1. 2.2.1. Accessing 3D Molecules Through a 1D Representation
        2. 2.2.2. Abstractions for Modeling Protein Structure
        3. 2.2.3. Mathematical Modeling of Biochemical Systems
      3. 2.3. Why Biologists Model
      4. 2.4. Computational Methods Covered in This Book
      5. 2.5. A Computational Biology Experiment
        1. 2.5.1. Identifying the Problem
        2. 2.5.2. Separating the Problem into Simpler Components
        3. 2.5.3. Evaluating Your Needs
        4. 2.5.4. Selecting the Appropriate Data Set
        5. 2.5.5. Identifying the Criteria for Success
        6. 2.5.6. Performing and Documenting a Computational Experiment
          1. 2.5.6.1. Documentation issues in computational biology
          2. 2.5.6.2. Electronic notebooks
  4. II. The Bioinformatics Workstation
    1. 3. Setting Up Your Workstation
      1. 3.1. Working on a Unix System
        1. 3.1.1. What Does an Operating System Do?
        2. 3.1.2. Why Use Unix?
        3. 3.1.3. Different Flavors of Unix
          1. 3.1.3.1. Linux
            1. 3.1.3.1.1. Will Linux run on your computer?
          2. 3.1.3.2. Other common flavors
        4. 3.1.4. Graphical Interfaces for Unix
      2. 3.2. Setting Up a Linux Workstation
        1. 3.2.1. Installing Linux
          1. 3.2.1.1. System requirements
          2. 3.2.1.2. Partitioning your disk
          3. 3.2.1.3. Selecting major package groupings
          4. 3.2.1.4. Other useful packages to add
      3. 3.3. How to Get Software Working
        1. 3.3.1. Unix tar Archives
        2. 3.3.2. Binary Distributions
        3. 3.3.3. RPM Archives
          1. 3.3.3.1. GnoRPM
        4. 3.3.4. Source Distributions
        5. 3.3.5. Perl Scripts
        6. 3.3.6. Putting It in Your Path
        7. 3.3.7. Sharing Software Among Multiple Users
      4. 3.4. What Software Is Needed?
    2. 4. Files and Directories in Unix
      1. 4.1. Filesystem Basics
        1. 4.1.1. Moving Around the Directory Hierarchy
        2. 4.1.2. Paths to Files and Directories
        3. 4.1.3. Using a Process-Based File Hierarchy
        4. 4.1.4. Establishing File-Naming Conventions for Your Work
        5. 4.1.5. Structuring a Project: An Example
      2. 4.2. Commands for Working with Directories and Files
        1. 4.2.1. Moving Around the Filesystem
          1. 4.2.1.1. You are here: pwd
          2. 4.2.1.2. Changing directories with cd
        2. 4.2.2. Finding Files and Directories
          1. 4.2.2.1. Listing files with ls
          2. 4.2.2.2. Interpreting ls output
          3. 4.2.2.3. Finding files with find
          4. 4.2.2.4. Finding an executable file with which
          5. 4.2.2.5. Finding an executable file with whereis
        3. 4.2.3. Manipulating Files and Directories
          1. 4.2.3.1. Copying files and directories with cp
          2. 4.2.3.2. Moving and renaming files and directories with mv
          3. 4.2.3.3. Creating new links to files and directories with ln
          4. 4.2.3.4. Creating and removing directories with mkdir and rmdir
          5. 4.2.3.5. Removing files with rm
      3. 4.3. Working in a Multiuser Environment
        1. 4.3.1. Users and Groups
        2. 4.3.2. User Directories
        3. 4.3.3. File Permissions and Statistics
          1. 4.3.3.1. Viewing file attributes with stat
          2. 4.3.3.2. Changing file ownership and permissions with chmod
          3. 4.3.3.3. Changing file and directory ownership with chown and chgrp
        4. 4.3.4. System Administration
        5. 4.3.5. Conventions for Organizing Files
        6. 4.3.6. Locating Files in System Directories
    3. 5. Working on a Unix System
      1. 5.1. The Unix Shell
        1. 5.1.1. What Flavors of Shell Are There?
      2. 5.2. Issuing Commands on a Unix System
        1. 5.2.1. The Command-Line Format
        2. 5.2.2. Unix Information Commands
        3. 5.2.3. Standard Input and Output
        4. 5.2.4. Redirection of Command Input and Output
        5. 5.2.5. Operators
        6. 5.2.6. Wildcard Characters
        7. 5.2.7. Running X Commands
      3. 5.3. Viewing and Editing Files
        1. 5.3.1. Viewing and Combining Files with cat
        2. 5.3.2. more: A Step in the Right Direction
        3. 5.3.3. less: The Gold Standard
        4. 5.3.4. Editing Files with vi and vim
        5. 5.3.5. The GNU Emacs Editor
        6. 5.3.6. Viewing Binary Files with strings
        7. 5.3.7. od and Binary Data
      4. 5.4. Transformations and Filters
        1. 5.4.1. Extracting the Beginning of a File with head
        2. 5.4.2. Extracting the End of a File with tail
        3. 5.4.3. Splitting Files with split and csplit
        4. 5.4.4. Separating File Components with cut
        5. 5.4.5. Combining Files with paste
        6. 5.4.6. Merging Datafiles with join
        7. 5.4.7. Sorting Files with sort
          1. 5.4.7.1. Specifying sort keys
      5. 5.5. File Statistics and Comparisons
        1. 5.5.1. Comparing Files with cmp and diff
        2. 5.5.2. Counting Words with wc
      6. 5.6. The Language of Regular Expressions
        1. 5.6.1. Searching for Patterns with grep
      7. 5.7. Unix Shell Scripts
      8. 5.8. Communicating with Other Computers
        1. 5.8.1. The Web
        2. 5.8.2. IP Addresses and Hostnames
        3. 5.8.3. telnet
        4. 5.8.4. ftp
        5. 5.8.5. Displaying from a Remote Terminal
        6. 5.8.6. Communication and File Sharing
        7. 5.8.7. Media Compatibility
        8. 5.8.8. Accessing Devices as Unix Filesystems
        9. 5.8.9. Accessing Devices as DOS Disks
      9. 5.9. Playing Nicely with Others in a Shared Environment
        1. 5.9.1. Processes and Process Management
          1. 5.9.1.1. Checking the load average
          2. 5.9.1.2. Listing processes with ps
          3. 5.9.1.3. top
          4. 5.9.1.4. Signaling processes with kill
          5. 5.9.1.5. Setting process priorities with nice and renice
        2. 5.9.2. Scheduling Recurring Activities with cron
          1. 5.9.2.1. Submitting jobs to cron using crontab
          2. 5.9.2.2. Using cron to schedule a recurrent database search
          3. 5.9.2.3. Scheduling processes with batch and at
        3. 5.9.3. Monitoring Space Usage and File Sizes
          1. 5.9.3.1. Checking disk usage with du
          2. 5.9.3.2. Checking for free disk space with df
          3. 5.9.3.3. Checking your compliance with system quotas with quota
        4. 5.9.4. Creating Archives of Your Data
          1. 5.9.4.1. tar: Hold the feathers
          2. 5.9.4.2. compress
          3. 5.9.4.3. gzip
  5. III. Tools for Bioinformatics
    1. 6. Biological Research on the Web
      1. 6.1. Using Search Engines
        1. 6.1.1. Boolean Searching
        2. 6.1.2. Search Engine Algorithms
      2. 6.2. Finding Scientific Articles
        1. 6.2.1. Using PubMed Effectively
      3. 6.3. The Public Biological Databases
        1. 6.3.1. Data Annotation and Data Formats
        2. 6.3.2. 3D Molecular Structure Data
        3. 6.3.3. DNA, RNA, and Protein Sequence Data
        4. 6.3.4. Genomic Data
        5. 6.3.5. Biochemical Pathway Data
        6. 6.3.6. Gene Expression Data
      4. 6.4. Searching Biological Databases
        1. 6.4.1. GenBank
          1. 6.4.1.1. Saving search results
          2. 6.4.1.2. Saving large result sets
        2. 6.4.2. PDB
      5. 6.5. Depositing Data into the Public Databases
        1. 6.5.1. GenBank Deposition
        2. 6.5.2. PDB Deposition
      6. 6.6. Finding Software
      7. 6.7. Judging the Quality of Information
        1. 6.7.1. Authority
        2. 6.7.2. Transparency
        3. 6.7.3. Timeliness
    2. 7. Sequence Analysis, Pairwise Alignment, and Database Searching
      1. 7.1. Chemical Composition of Biomolecules
      2. 7.2. Composition of DNA and RNA
      3. 7.3. Watson and Crick Solve the Structure of DNA
      4. 7.4. Development of DNA Sequencing Methods
        1. 7.4.1. The Chemical Composition of Proteins
        2. 7.4.2. Mechanisms of Molecular Evolution
      5. 7.5. Genefinders and Feature Detection in DNA
        1. 7.5.1. Predicting Gene Locations
        2. 7.5.2. Feature Detection
      6. 7.6. DNA Translation
      7. 7.7. Pairwise Sequence Comparison
        1. 7.7.1. Scoring Matrices
        2. 7.7.2. Gap Penalties
        3. 7.7.3. Dynamic Programming
        4. 7.7.4. Global Alignment
          1. 7.7.4.1. Using ALIGN to produce a global sequence alignment
        5. 7.7.5. Local Alignment
          1. 7.7.5.1. Tools for local alignment
      8. 7.8. Sequence Queries Against Biological Databases
        1. 7.8.1. Local Alignment-Based Searching Using BLAST
          1. 7.8.1.1. The BLAST algorithm
          2. 7.8.1.2. NCBI BLAST and WU-BLAST
          3. 7.8.1.3. What do the various BLAST programs do?
          4. 7.8.1.4. Building a local database with formatdb
          5. 7.8.1.5. Evaluating BLAST results
        2. 7.8.2. Local Alignment Using FASTA
          1. 7.8.2.1. The FASTA algorithm
          2. 7.8.2.2. The FASTA programs
      9. 7.9. Multifunctional Tools for Sequence Analysis
        1. 7.9.1. NCBI SEALS
        2. 7.9.2. The Biology Workbench
        3. 7.9.3. DoubleTwist
    3. 8. Multiple Sequence Alignments, Trees, and Profiles
      1. 8.1. The Morphological to the Molecular
      2. 8.2. Multiple Sequence Alignment
        1. 8.2.1. Progressive Strategies for Multiple Alignment
        2. 8.2.2. Multiple Alignment with ClustalW
        3. 8.2.3. Viewing and Editing Alignments with Jalview
        4. 8.2.4. Sequence Logos
      3. 8.3. Phylogenetic Analysis
        1. 8.3.1. Phylogenetic Trees Based on Pairwise Distances
        2. 8.3.2. Phylogenetic Trees Based on Neighbor Joining
        3. 8.3.3. Phylogenetic Trees Based on Maximum Parsimony
        4. 8.3.4. Phylogenetic Trees Based on Maximum Likelihood Estimation
        5. 8.3.5. Software for Phylogenetic Analysis
          1. 8.3.5.1. PHYLIP
            1. 8.3.5.1.1. The PHYLIP input format
          2. 8.3.5.2. Generating input for PHYLIP with ClustalX
      4. 8.4. Profiles and Motifs
        1. 8.4.1. Motif Databases
          1. 8.4.1.1. Blocks
          2. 8.4.1.2. PROSITE
          3. 8.4.1.3. Pfam
          4. 8.4.1.4. PRINTS
          5. 8.4.1.5. COG
          6. 8.4.1.6. Accessing multiple databases
        2. 8.4.2. Constructing and Using Your Own Profiles
          1. 8.4.2.1. Finding new motifs with MEME
          2. 8.4.2.2. Searching for motifs with MAST and MetaMEME
          3. 8.4.2.3. Motif discovery with other programs
          4. 8.4.2.4. HMMer
        3. 8.4.3. Incorporating Motif Information into Pairwise Alignment
    4. 9. Visualizing Protein Structures and Computing Structural Properties
      1. 9.1. A Word About Protein Structure Data
      2. 9.2. The Chemistry of Proteins
        1. 9.2.1. From 1D to 3D
        2. 9.2.2. Interatomic Forces and Protein Structure
          1. 9.2.2.1. Covalent interactions
          2. 9.2.2.2. Hydrogen bonds
          3. 9.2.2.3. Hydrophobic and hydrophilic interactions
          4. 9.2.2.4. Charge-charge, charge-dipole, and dipole-dipole interactions
          5. 9.2.2.5. Van der Waals forces
          6. 9.2.2.6. Repulsive forces
          7. 9.2.2.7. Relative strength of interatomic forces
      3. 9.3. Web-Based Protein Structure Tools
      4. 9.4. Structure Visualization
        1. 9.4.1. Molecular Structure Viewers for Your Web Browser
          1. 9.4.1.1. RasMol
          2. 9.4.1.2. Cn3D
          3. 9.4.1.3. SWISS-PDBViewer
        2. 9.4.2. Standalone Modeling Packages
          1. 9.4.2.1. MolMol
          2. 9.4.2.2. MidasPlus
          3. 9.4.2.3. VMD
        3. 9.4.3. Creating High-Quality Graphics with MolScript
        4. 9.4.4. Active Site Visualization with LIGPLOT
        5. 9.4.5. dimplot
      5. 9.5. Structure Classification
        1. 9.5.1. Secondary Structure from Coordinates
          1. 9.5.1.1. STRIDE
        2. 9.5.2. Topology Cartoons
          1. 9.5.2.1. TOPS
        3. 9.5.3. Classification Databases
          1. 9.5.3.1. SCOP
          2. 9.5.3.2. CATH
          3. 9.5.3.3. Unique protein structure data sets
      6. 9.6. Structural Alignment
        1. 9.6.1. Comparing Two Protein Structures
          1. 9.6.1.1. ProFit
        2. 9.6.2. DALI Domain Dictionary
        3. 9.6.3. CE and CL
        4. 9.6.4. VAST
      7. 9.7. Structure Analysis
        1. 9.7.1. Analyzing Structure Quality
          1. 9.7.1.1. PROCHECK
          2. 9.7.1.2. WHAT IF/WHAT CHECK
        2. 9.7.2. Intramolecular Interactions
          1. 9.7.2.1. Computing contacts with HBPLUS
      8. 9.8. Solvent Accessibility and Interactions
        1. 9.8.1. Computing Solvent Accessibility with naccess
        2. 9.8.2. Solvent Accessibility with Alpha Shapes
      9. 9.9. Computing Physicochemical Properties
        1. 9.9.1. Macromolecular Electrostatics
        2. 9.9.2. Visualization of Molecular Surfaces with Mapped Properties
          1. 9.9.2.1. GRASP/GRASS
      10. 9.10. Structure Optimization
        1. 9.10.1. Informatics Plays a Role in Optimization
        2. 9.10.2. Rotamer Libraries
        3. 9.10.3. PDFs
      11. 9.11. Protein Resource Databases
        1. 9.11.1. GeneCensus
        2. 9.11.2. PRESAGE
        3. 9.11.3. BIND
      12. 9.12. Putting It All Together
    5. 10. Predicting Protein Structure and Function from Sequence
      1. 10.1. Determining the Structures of Proteins
        1. 10.1.1. Solving Protein Structures by X-ray Crystallography
        2. 10.1.2. Solving Structures by NMR Spectroscopy
      2. 10.2. Predicting the Structures of Proteins
        1. 10.2.1. CASP: The Search for the Holy Grail
      3. 10.3. From 3D to 1D
      4. 10.4. Feature Detection in Protein Sequences
      5. 10.5. Secondary Structure Prediction
        1. 10.5.1. Alignment-Based and Hybrid Methods
        2. 10.5.2. Single Sequence Prediction Methods
        3. 10.5.3. Measuring Prediction Accuracy
        4. 10.5.4. Putting Predictions to Use
        5. 10.5.5. Predicting Transmembrane Helices
        6. 10.5.6. Threading
      6. 10.6. Predicting 3D Structure
        1. 10.6.1. Homology Modeling
          1. 10.6.1.1. Modeller
          2. 10.6.1.2. How Modeller builds a model
          3. 10.6.1.3. ModBase: a database of automatically generated models
          4. 10.6.1.4. The SWISS-MODEL server
        2. 10.6.2. Tools for Ab-Initio Prediction
      7. 10.7. Putting It All Together: A Protein Modeling Project
        1. 10.7.1. Finding Homologous Structures
        2. 10.7.2. Looking for Distant Homologies
        3. 10.7.3. Predicting Secondary Structure from Sequence
        4. 10.7.4. Using Threading Methods to Find Potential Folds
        5. 10.7.5. Using Profile Methods to Align Distantly Related Sequences
        6. 10.7.6. Building a Homology Model
      8. 10.8. Summary
    6. 11. Tools for Genomics and Proteomics
      1. 11.1. From Sequencing Genes to Sequencing Genomes
        1. 11.1.1. Analysis of Raw Sequence Data: Basecalling
        2. 11.1.2. Sequencing an Entire Genome
          1. 11.1.2.1. The shotgun approach
          2. 11.1.2.2. The clone contig approach
          3. 11.1.2.3. LIMS: Tracking all those minisequences
      2. 11.2. Sequence Assembly
      3. 11.3. Accessing Genome Information on the Web
        1. 11.3.1. NCBI Genome Resources
        2. 11.3.2. TIGR Genome Resources
        3. 11.3.3. EnsEMBL
        4. 11.3.4. Other Sequencing Centers
        5. 11.3.5. Organism-Specific Resources
      4. 11.4. Annotating and Analyzing Whole Genome Sequences
        1. 11.4.1. Genome Annotation
          1. 11.4.1.1. MAGPIE
        2. 11.4.2. Genome Comparison
          1. 11.4.2.1. PipMaker
          2. 11.4.2.2. MUMmer
      5. 11.5. Functional Genomics: New Data Analysis Challenges
        1. 11.5.1. Sequence-Based Approaches for Analyzing Gene Expression
        2. 11.5.2. DNA Microarrays: Emerging Technologies in Functional Genomics
        3. 11.5.3. Bioinformatics Challenges in Microarray Design and Analysis
          1. 11.5.3.1. Planning array experiments
          2. 11.5.3.2. Analyzing scanned microarray images with CrazyQuant
          3. 11.5.3.3. Visualizing high-dimensional data
          4. 11.5.3.4. Clustering expression profiles
          5. 11.5.3.5. A note on commercial software for expression analysis
      6. 11.6. Proteomics
        1. 11.6.1. Experimental Approaches in Proteomics
        2. 11.6.2. Informatics Challenges in 2D-PAGE Analysis
        3. 11.6.3. Tools for Proteomics Analysis
        4. 11.6.4. Generalizing the Array Approach
      7. 11.7. Biochemical Pathway Databases
        1. 11.7.1. Illustration of a Complex Metabolic Pathway
        2. 11.7.2. EC Nomenclature
        3. 11.7.3. WIT and KEGG
        4. 11.7.4. PathDB
      8. 11.8. Modeling Kinetics and Physiology
        1. 11.8.1. Modeling Kinetics with Gepasi
        2. 11.8.2. XPP
        3. 11.8.3. Using the Virtual Cell Portal
      9. 11.9. Summary
  6. IV. Databases and Visualization
    1. 12. Automating Data Analysis with Perl
      1. 12.1. Why Perl?
        1. 12.1.1. Where Do I Get Perl?
      2. 12.2. Perl Basics
        1. 12.2.1. Hello World
        2. 12.2.2. A Bioinformatics Example
        3. 12.2.3. Variables
          1. 12.2.3.1. Scalars
          2. 12.2.3.2. Arrays
          3. 12.2.3.3. Hashes
        4. 12.2.4. Loops
        5. 12.2.5. Subroutines
      3. 12.3. Pattern Matching and Regular Expressions
      4. 12.4. Parsing BLAST Output Using Perl
      5. 12.5. Applying Perl to Bioinformatics
        1. 12.5.1. Bioperl
        2. 12.5.2. CGI.pm
        3. 12.5.3. LWP
        4. 12.5.4. PDL
        5. 12.5.5. DBI
        6. 12.5.6. GD
    2. 13. Building Biological Databases
      1. 13.1. Types of Databases
        1. 13.1.1. Flat File Databases
          1. 13.1.1.1. Flat file databases in biology
        2. 13.1.2. Relational Databases
          1. 13.1.2.1. How tables are organized
          2. 13.1.2.2. The database schema
        3. 13.1.3. Object-Oriented Databases
      2. 13.2. Database Software
        1. 13.2.1. Sequence Retrieval System
        2. 13.2.2. Oracle
        3. 13.2.3. PostgreSQL
        4. 13.2.4. Open Source Object DBMS
        5. 13.2.5. MySQL
      3. 13.3. Introduction to SQL
        1. 13.3.1. SQL Datatypes
        2. 13.3.2. SQL Commands
          1. 13.3.2.1. Adding a new table to a database
          2. 13.3.2.2. Changing an existing table
          3. 13.3.2.3. Adding data to an existing table
          4. 13.3.2.4. Altering existing data in a table
        3. 13.3.3. Accessing Your Database with the SQL SELECT Command
          1. 13.3.3.1. Choosing fields to select
          2. 13.3.3.2. Using a WHERE clause to specify selection conditions
          3. 13.3.3.3. Joining output from multiple tables
      4. 13.4. Installing the MySQL DBMS
        1. 13.4.1. Setting Up MySQL
          1. 13.4.1.1. Using the mysql client program
          2. 13.4.1.2. Using the mysqladmin client program to set up MySQL
          3. 13.4.1.3. Restarting the MySQL server
        2. 13.4.2. Securing Your MySQL Server
        3. 13.4.3. Setting Up the Data Directory
        4. 13.4.4. Creating a New Database
      5. 13.5. Database Design
        1. 13.5.1. On Entities and Attributes
        2. 13.5.2. Creating a Database from Your Data Model
        3. 13.5.3. Creating Relationships Between Tables
      6. 13.6. Developing Web-Based Software That Interacts with Databases
        1. 13.6.1. CGI
        2. 13.6.2. XML
          1. 13.6.2.1. XML applications
        3. 13.6.3. PHP
          1. 13.6.3.1. Accessing MySQL databases with PHP
          2. 13.6.3.2. Collecting information from a form with PHP
    3. 14. Visualization and Data Mining
      1. 14.1. Preparing Your Data
      2. 14.2. Viewing Graphics
        1. 14.2.1. xzgv
        2. 14.2.2. Ghostview and gv
        3. 14.2.3. The GIMP
      3. 14.3. Sequence Data Visualization
        1. 14.3.1. Making Publication-Quality Alignments with TEXshade
        2. 14.3.2. Viewing Sequence Distances Geometrically
      4. 14.4. Networks and Pathway Visualization
      5. 14.5. Working with Numerical Data
        1. 14.5.1. gnuplot and xgfe
        2. 14.5.2. Grace: The Pocketknife of Data Visualization
        3. 14.5.3. Multidimensional Analysis: XGobi and XGvis
        4. 14.5.4. Programming for Data Analysis
          1. 14.5.4.1. R and S-plus
          2. 14.5.4.2. Online resources for R
          3. 14.5.4.3. Matlab and Octave
      6. 14.6. Visualization: Summary
      7. 14.7. Data Mining and Biological Information
        1. 14.7.1. Problems in Data Mining and Machine Learning
          1. 14.7.1.1. Supervised and unsupervised learning
        2. 14.7.2. A Collection of Data Mining Techniques
          1. 14.7.2.1. Decision trees
          2. 14.7.2.2. Neural networks
          3. 14.7.2.3. Genetic algorithms
          4. 14.7.2.4. Support vector machines
  7. Bibliography
    1. Unix
    2. SysAdmin
    3. Perl
    4. General Reference
    5. Bioinformatics Reference
    6. Molecular Biology/Biology Reference
    7. Protein Structure and Biophysics
    8. Genomics
    9. Biotechnology
    10. Databases
    11. Visualization
    12. Data Mining
  8. Colophon