Cover image for High Performance MySQL

Book description

As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book. In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you'll learn the supporting techniques you need to carry out these tasks, including advanced configuration, benchmarking, and investigating logs. Topics include:

  • A review of configuration and setup options

  • Storage engines and table types

  • Benchmarking

  • Indexes

  • Query Optimization

  • Application Design

  • Server Performance

  • Replication

  • Load-balancing

  • Backup and Recovery

  • Security

Table of Contents

  1. High Performance MySQL
    1. SPECIAL OFFER: Upgrade this ebook with O’Reilly
    2. Preface
      1. The Basic Layout of This Book
        1. Back to Basics
        2. Things to Reference as You Read the Rest of the Book
        3. Places to Tune Your Application
        4. Scaling Upward After Making Changes
        5. Make Sure All That Work Isn't for Naught
        6. The Miscellany
      2. Software Versions and Availability
      3. Conventions Used in This Book
      4. Using Code Examples
      5. How to Contact Us
      6. Acknowledgments
        1. From Jeremy
        2. From Derek
    3. 1. Back To Basics
      1. 1.1. Binary Versus Compiled-From-Source Installations
        1. 1.1.1. MySQL.com Binary Versus Distribution Binary
      2. 1.2. Configuration Files
        1. 1.2.1. File Locations
        2. 1.2.2. File Format
        3. 1.2.3. Sample Files
        4. 1.2.4. Reconfiguration
      3. 1.3. The SHOW Commands
        1. 1.3.1. SHOW VARIABLES
        2. 1.3.2. SHOW PROCESSLIST
        3. 1.3.3. SHOW STATUS
        4. 1.3.4. SHOW INNODB STATUS
    4. 2. Storage Engines (Table Types)
      1. 2.1. MySQL Architecture
      2. 2.2. Locking and Concurrency
        1. 2.2.1. Read/Write Locks
        2. 2.2.2. Lock Granularity
          1. 2.2.2.1. Table locks
          2. 2.2.2.2. Page locks
          3. 2.2.2.3. Row locks
        3. 2.2.3. Multi-Version Concurrency Control
      3. 2.3. Transactions
        1. 2.3.1. Benefits and Drawbacks
        2. 2.3.2. Isolation Levels
          1. 2.3.2.1. Read uncommitted
          2. 2.3.2.2. Read committed
          3. 2.3.2.3. Repeatable read
          4. 2.3.2.4. Serializable
        3. 2.3.3. Deadlocks
        4. 2.3.4. Transaction Logging
        5. 2.3.5. Transactions in MySQL
          1. 2.3.5.1. AUTOCOMMIT
          2. 2.3.5.2. Implicit commits
          3. 2.3.5.3. Isolation levels
          4. 2.3.5.4. Mixing storage engines in transactions
          5. 2.3.5.5. Simulating transactions
      4. 2.4. Selecting the Right Engine
        1. 2.4.1. Considerations
          1. 2.4.1.1. Transactions and concurrency
          2. 2.4.1.2. Backups
          3. 2.4.1.3. Special features
        2. 2.4.2. Practical Examples
          1. 2.4.2.1. Logging
          2. 2.4.2.2. Read-only or read-mostly tables
          3. 2.4.2.3. Order processing
          4. 2.4.2.4. Stock quotes
          5. 2.4.2.5. Bulletin boards and threaded discussion forums
          6. 2.4.2.6. CD-ROM applications
        3. 2.4.3. Table Conversions
          1. 2.4.3.1. ALTER TABLE
          2. 2.4.3.2. Dump and reimport
          3. 2.4.3.3. CREATE and SELECT
      5. 2.5. The Storage Engines
        1. 2.5.1. MyISAM Tables
          1. 2.5.1.1. Storage
          2. 2.5.1.2. Other stuff
        2. 2.5.2. Compressed MyISAM Tables
        3. 2.5.3. RAID MyISAM Tables
        4. 2.5.4. MyISAM Merge Tables
        5. 2.5.5. InnoDB Tables
          1. 2.5.5.1. Storage
          2. 2.5.5.2. Locking and concurrency
          3. 2.5.5.3. Special features
        6. 2.5.6. Heap (In-Memory) Tables
          1. 2.5.6.1. Limitations
        7. 2.5.7. Berkeley DB (BDB) Tables
    5. 3. Benchmarking
      1. 3.1. The Importance of Benchmarking
      2. 3.2. Benchmarking Strategies
      3. 3.3. Benchmarking Tools
        1. 3.3.1. The MySQL Benchmark Suite
        2. 3.3.2. MySQL super-smack
          1. 3.3.2.1. Preparing test data
          2. 3.3.2.2. Configuration
        3. 3.3.3. MyBench: A Home-Grown Solution
    6. 4. Indexes
      1. 4.1. Indexing Basics
        1. 4.1.1. Index Concepts
          1. 4.1.1.1. Partial indexes
          2. 4.1.1.2. Multicolumn indexes
          3. 4.1.1.3. Index order
          4. 4.1.1.4. Indexes as constraints
          5. 4.1.1.5. Clustered and secondary indexes
          6. 4.1.1.6. Unique indexes versus primary keys
          7. 4.1.1.7. Indexing NULLs
      2. 4.2. Index Structures
        1. 4.2.1. B-Tree Indexes
        2. 4.2.2. Hash Indexes
        3. 4.2.3. R-Tree Indexes
      3. 4.3. Indexes and Table Types
        1. 4.3.1. MyISAM Tables
          1. 4.3.1.1. Delayed key writes
        2. 4.3.2. Heap Tables
        3. 4.3.3. BDB Tables
        4. 4.3.4. InnoDB Tables
        5. 4.3.5. Full-Text Indexes
        6. 4.3.6. Index Limitations
          1. 4.3.6.1. Wildcard matches
          2. 4.3.6.2. Regular expressions
          3. 4.3.6.3. Poor statistics or corruption
          4. 4.3.6.4. Too many matching rows
      4. 4.4. Index Maintenance
        1. 4.4.1. Obtaining Index Information
        2. 4.4.2. Refreshing Index Statistics
    7. 5. Query Performance
      1. 5.1. Query Processing Basics
        1. 5.1.1. Query Cache
        2. 5.1.2. Parsing, Analysis, and Optimization
        3. 5.1.3. Using EXPLAIN
          1. 5.1.3.1. Joins
        4. 5.1.4. Execution
      2. 5.2. Optimizer Features and Oddities
        1. 5.2.1. Too Little Diversity
        2. 5.2.2. Index-Based Ordering
        3. 5.2.3. Impossible Queries
        4. 5.2.4. Full-Text Instead of LIKE
      3. 5.3. Identifying Slow Queries
      4. 5.4. Influencing MySQL with Hints
        1. 5.4.1. Join Order
        2. 5.4.2. Index Usage
        3. 5.4.3. Result Sizes
        4. 5.4.4. Query Cache
      5. 5.5. Stupid Query Tricks
        1. 5.5.1. Two Is Better Than One
        2. 5.5.2. Unions Instead of ORs
    8. 6. Server Performance Tuning
      1. 6.1. Performance-Limiting Factors
        1. 6.1.1. Disks
        2. 6.1.2. Memory
          1. 6.1.2.1. MySQL's buffers and caches
        3. 6.1.3. Network
      2. 6.2. RAID
        1. 6.2.1. Mix and Match
          1. 6.2.1.1. Sample configuration
        2. 6.2.2. Hardware Versus Software
        3. 6.2.3. IDE or SCSI?
        4. 6.2.4. RAID on Slaves
      3. 6.3. Operating System
        1. 6.3.1. Filesystems
          1. 6.3.1.1. Journaling
          2. 6.3.1.2. Other features and tweaks
          3. 6.3.1.3. Choosing a filesystem
          4. 6.3.1.4. FreeBSD
          5. 6.3.1.5. Do you need a filesystem at all?
        2. 6.3.2. Swap
        3. 6.3.3. Threading
      4. 6.4. Techniques
        1. 6.4.1. Solving I/O Bottlenecks
          1. 6.4.1.1. Wrong index
          2. 6.4.1.2. Temporary tables
          3. 6.4.1.3. Caching
          4. 6.4.1.4. Spread the load
        2. 6.4.2. Solving CPU Bottlenecks
        3. 6.4.3. Solving Memory Bottlenecks
        4. 6.4.4. Solving Kernel Bottlenecks
    9. 7. Replication
      1. 7.1. Replication Overview
        1. 7.1.1. Problems Solved with Replication
          1. 7.1.1.1. Data distribution
          2. 7.1.1.2. Load balancing
          3. 7.1.1.3. Backup and recovery
          4. 7.1.1.4. High availability and failover
        2. 7.1.2. Problems Not Solved with Replication
          1. 7.1.2.1. Real-time data transmission
          2. 7.1.2.2. Online ordering
        3. 7.1.3. Replication Performance
      2. 7.2. Configuring Replication
        1. 7.2.1. On a New Server
          1. 7.2.1.1. Account creation
          2. 7.2.1.2. Configuration file entries
          3. 7.2.1.3. Restart master
          4. 7.2.1.4. Restart slave
        2. 7.2.2. On an Existing Server
          1. 7.2.2.1. What needs to happen
          2. 7.2.2.2. Snapshot or backup, then copy
          3. 7.2.2.3. Online table copies
          4. 7.2.2.4. Online copy and synchronize (MySQL 4.x only)
      3. 7.3. Under the Hood
        1. 7.3.1. Replication in 3.23
        2. 7.3.2. Replication in 4.0
        3. 7.3.3. Files and Settings Related to Replication
          1. 7.3.3.1. Log files
          2. 7.3.3.2. Log index files
          3. 7.3.3.3. Status files
          4. 7.3.3.4. Filtering
      4. 7.4. Replication Architectures
        1. 7.4.1. The Replication Rules
        2. 7.4.2. Sample Configurations
          1. 7.4.2.1. Master with slaves
          2. 7.4.2.2. Slave with two masters
          3. 7.4.2.3. Dual master
          4. 7.4.2.4. Replication ring (multi-master)
          5. 7.4.2.5. Pyramid
          6. 7.4.2.6. Design your own
      5. 7.5. Administration and Maintenance
        1. 7.5.1. Monitoring
          1. 7.5.1.1. Master status
          2. 7.5.1.2. Slave status
          3. 7.5.1.3. Replication heartbeat
        2. 7.5.2. Log Rotation
        3. 7.5.3. Changing Masters
          1. 7.5.3.1. Using the right values
        4. 7.5.4. Tools
          1. 7.5.4.1. mysqlbinlog: Viewing data in logs
          2. 7.5.4.2. check_repl: Ensuring that replication takes place
          3. 7.5.4.3. fix_repl: Skipping a bad query to continue replication
          4. 7.5.4.4. purge_binary_logs: Reclaiming space used by binary logs
          5. 7.5.4.5. mysqldiff: Replication sanity checks
          6. 7.5.4.6. write_heartbeat: Generating a periodic health check heartbeat
          7. 7.5.4.7. read_heartbeat: Measuring replication log using heartbeat
      6. 7.6. Common Problems
        1. 7.6.1. Slave Data Changes
        2. 7.6.2. Nonunique Server IDs
        3. 7.6.3. Log Corruption or Partial Log Record
        4. 7.6.4. Bulk-Loading Data
        5. 7.6.5. Nonreplicated Dependencies
        6. 7.6.6. Missing Temporary Tables
        7. 7.6.7. Binary Log Out of Sync with Transaction Log
        8. 7.6.8. Slave Wants to Connect to the Wrong Master
      7. 7.7. The Future of Replication
        1. 7.7.1. Eliminating the Snapshot
        2. 7.7.2. Fail-Safe Replication
        3. 7.7.3. Safe Multi-Master Replication
          1. 7.7.3.1. Multipart auto-increment unique keys
          2. 7.7.3.2. Partitioned auto-increment fields
    10. 8. Load Balancing and High Availability
      1. 8.1. Load Balancing Basics
        1. 8.1.1. Differences Between MySQL and HTTP Load Balancing
          1. 8.1.1.1. Requests
          2. 8.1.1.2. Partitioning
          3. 8.1.1.3. Connection pooling
      2. 8.2. Configuration Issues
        1. 8.2.1. Health Checks
          1. 8.2.1.1. Determining health
          2. 8.2.1.2. Connection limits
        2. 8.2.2. Next-Connection Algorithms
          1. 8.2.2.1. The consequences of poor algorithm choice
      3. 8.3. Cluster Partitioning
        1. 8.3.1. Role-Based Partitioning
        2. 8.3.2. Data-Based Partitioning
        3. 8.3.3. Filtering and Multicluster Partitioning
          1. 8.3.3.1. Filtering
          2. 8.3.3.2. Separate clusters
      4. 8.4. High Availability
        1. 8.4.1. Dual-Master Replication
        2. 8.4.2. Shared Storage with Standby
        3. 8.4.3. Commercial Solutions
          1. 8.4.3.1. Veritas cluster a gent
          2. 8.4.3.2. EMIC Networks
    11. 9. Backup and Recovery
      1. 9.1. Why Backups?
        1. 9.1.1. Disaster Recovery
        2. 9.1.2. Auditing
        3. 9.1.3. Testing
      2. 9.2. Considerations and Tradeoffs
        1. 9.2.1. Dump or Raw Backup?
        2. 9.2.2. Online or Offline?
        3. 9.2.3. Table Types and Consistency
        4. 9.2.4. Storage Requirements
        5. 9.2.5. Replication
      3. 9.3. Tools and Techniques
        1. 9.3.1. mysqldump
          1. 9.3.1.1. Restoring
        2. 9.3.2. mysqlhotcopy
          1. 9.3.2.1. Restoring
        3. 9.3.3. mysqlsnapshot
          1. 9.3.3.1. Restoring
        4. 9.3.4. InnoDB Hot Backup
        5. 9.3.5. Offline Backups
          1. 9.3.5.1. Restoring
        6. 9.3.6. Filesystem Snapshots
      4. 9.4. Rolling Your Own Backup Script
    12. 10. Security
      1. 10.1. Account Basics
        1. 10.1.1. Privileges
          1. 10.1.1.1. Global privileges
      2. 10.2. The Grant Tables
        1. 10.2.1. Privilege Checks
        2. 10.2.2. The user Table
          1. 10.2.2.1. Host matching
        3. 10.2.3. The host Table
        4. 10.2.4. The db Table
        5. 10.2.5. The tables_priv Table
        6. 10.2.6. The columns_priv Table
      3. 10.3. Grant and Revoke
        1. 10.3.1. Grant Mechanics
          1. 10.3.1.1. System administrator account
          2. 10.3.1.2. Database administrator account
          3. 10.3.1.3. Average employee account
          4. 10.3.1.4. Logging, write-only access
          5. 10.3.1.5. Operations and monitoring
        2. 10.3.2. Common Problems and Limitations
          1. 10.3.2.1. Can't revoke specific privileges
          2. 10.3.2.2. Host and database matching can't exclude matches
          3. 10.3.2.3. Privileges don't vanish when objects do
      4. 10.4. Operating System Security
        1. 10.4.1. Guidelines
      5. 10.5. Network Security
        1. 10.5.1. Localhost-Only Connections
        2. 10.5.2. Firewalling
          1. 10.5.2.1. No default route
        3. 10.5.3. MySQL in a DMZ
        4. 10.5.4. Connection Encryption and Tunneling
          1. 10.5.4.1. Virtual private networks
          2. 10.5.4.2. SSL in MySQL
          3. 10.5.4.3. SSH tunneling
        5. 10.5.5. TCP Wrappers
        6. 10.5.6. Automatic Host Blocking
      6. 10.6. Data Encryption
        1. 10.6.1. Hashing Passwords
        2. 10.6.2. Encrypted Filesystems
        3. 10.6.3. Application-Level Encryption
          1. 10.6.3.1. Design issues
        4. 10.6.4. Source Code Modification
      7. 10.7. MySQL in a chrooted Environment
    13. A. The SHOW STATUS and SHOW INNODB STATUS Commands
      1. A.1. SHOW STATUS
        1. A.1.1. Thread and Connection Statistics
        2. A.1.2. Command Counters
        3. A.1.3. Temporary Files and Tables
        4. A.1.4. Data Access Patterns
        5. A.1.5. MyISAM Key Buffer
        6. A.1.6. File Descriptors
        7. A.1.7. Query Cache
        8. A.1.8. SELECTs
        9. A.1.9. Sorts
        10. A.1.10. Table Locking
      2. A.2. SHOW INNODB STATUS
    14. B. mytop
      1. B.1. Overview
        1. B.1.1. Thread View
        2. B.1.2. Command View
        3. B.1.3. Status View
      2. B.2. Getting mytop
        1. B.2.1. Requirements
        2. B.2.2. Installation
      3. B.3. Configuration and Usage
      4. B.4. Common Tasks
    15. C. phpMyAdmin
      1. C.1. The Basics
      2. C.2. Practical Examples
        1. C.2.1. User Maintenance
        2. C.2.2. Simple SQL Commands
        3. C.2.3. Exporting and Downloading Data
    16. About the Authors
    17. Colophon
    18. SPECIAL OFFER: Upgrade this ebook with O’Reilly