You are previewing High Performance MySQL, 3rd Edition.

High Performance MySQL, 3rd Edition

Cover of High Performance MySQL, 3rd Edition by Baron Schwartz... Published by O'Reilly Media, Inc.
  1. High Performance MySQL
  2. Foreword
  3. Preface
    1. How This Book Is Organized
      1. A Broad Overview
      2. Building a Solid Foundation
      3. Configuring Your Application
      4. MySQL as an Infrastructure Component
      5. Miscellaneous Useful Topics
    2. Software Versions and Availability
    3. Conventions Used in This Book
    4. Using Code Examples
    5. Safari® Books Online
    6. How to Contact Us
    7. Acknowledgments for the Third Edition
    8. Acknowledgments for the Second Edition
      1. From Baron
      2. From Peter
      3. From Vadim
      4. From Arjen
    9. Acknowledgments for the First Edition
      1. From Jeremy
      2. From Derek
  4. 1. MySQL Architecture and History
    1. MySQL’s Logical Architecture
      1. Connection Management and Security
      2. Optimization and Execution
    2. Concurrency Control
      1. Read/Write Locks
      2. Lock Granularity
    3. Transactions
      1. Isolation Levels
      2. Deadlocks
      3. Transaction Logging
      4. Transactions in MySQL
    4. Multiversion Concurrency Control
    5. MySQL’s Storage Engines
      1. The InnoDB Engine
      2. The MyISAM Engine
      3. Other Built-in MySQL Engines
      4. Third-Party Storage Engines
      5. Selecting the Right Engine
      6. Table Conversions
    6. A MySQL Timeline
    7. MySQL’s Development Model
    8. Summary
  5. 2. Benchmarking MySQL
    1. Why Benchmark?
    2. Benchmarking Strategies
      1. What to Measure
    3. Benchmarking Tactics
      1. Designing and Planning a Benchmark
      2. How Long Should the Benchmark Last?
      3. Capturing System Performance and Status
      4. Getting Accurate Results
      5. Running the Benchmark and Analyzing Results
      6. The Importance of Plotting
    4. Benchmarking Tools
      1. Full-Stack Tools
      2. Single-Component Tools
    5. Benchmarking Examples
      1. http_load
      2. MySQL Benchmark Suite
      3. sysbench
      4. dbt2 TPC-C on the Database Test Suite
      5. Percona’s TPCC-MySQL Tool
    6. Summary
  6. 3. Profiling Server Performance
    1. Introduction to Performance Optimization
      1. Optimization Through Profiling
      2. Interpreting the Profile
    2. Profiling Your Application
      1. Instrumenting PHP Applications
    3. Profiling MySQL Queries
      1. Profiling a Server’s Workload
      2. Profiling a Single Query
      3. Using the Profile for Optimization
    4. Diagnosing Intermittent Problems
      1. Single-Query Versus Server-Wide Problems
      2. Capturing Diagnostic Data
      3. A Case Study in Diagnostics
    5. Other Profiling Tools
      1. Using the USER_STATISTICS Tables
      2. Using strace
    6. Summary
  7. 4. Optimizing Schema and Data Types
    1. Choosing Optimal Data Types
      1. Whole Numbers
      2. Real Numbers
      3. String Types
      4. Date and Time Types
      5. Bit-Packed Data Types
      6. Choosing Identifiers
      7. Special Types of Data
    2. Schema Design Gotchas in MySQL
    3. Normalization and Denormalization
      1. Pros and Cons of a Normalized Schema
      2. Pros and Cons of a Denormalized Schema
      3. A Mixture of Normalized and Denormalized
    4. Cache and Summary Tables
      1. Materialized Views
      2. Counter Tables
    5. Speeding Up ALTER TABLE
      1. Modifying Only the .frm File
      2. Building MyISAM Indexes Quickly
    6. Summary
  8. 5. Indexing for High Performance
    1. Indexing Basics
      1. Types of Indexes
    2. Benefits of Indexes
    3. Indexing Strategies for High Performance
      1. Isolating the Column
      2. Prefix Indexes and Index Selectivity
      3. Multicolumn Indexes
      4. Choosing a Good Column Order
      5. Clustered Indexes
      6. Covering Indexes
      7. Using Index Scans for Sorts
      8. Packed (Prefix-Compressed) Indexes
      9. Redundant and Duplicate Indexes
      10. Unused Indexes
      11. Indexes and Locking
    4. An Indexing Case Study
      1. Supporting Many Kinds of Filtering
      2. Avoiding Multiple Range Conditions
      3. Optimizing Sorts
    5. Index and Table Maintenance
      1. Finding and Repairing Table Corruption
      2. Updating Index Statistics
      3. Reducing Index and Data Fragmentation
    6. Summary
  9. 6. Query Performance Optimization
    1. Why Are Queries Slow?
    2. Slow Query Basics: Optimize Data Access
      1. Are You Asking the Database for Data You Don’t Need?
      2. Is MySQL Examining Too Much Data?
    3. Ways to Restructure Queries
      1. Complex Queries Versus Many Queries
      2. Chopping Up a Query
      3. Join Decomposition
    4. Query Execution Basics
      1. The MySQL Client/Server Protocol
      2. The Query Cache
      3. The Query Optimization Process
      4. The Query Execution Engine
      5. Returning Results to the Client
    5. Limitations of the MySQL Query Optimizer
      1. Correlated Subqueries
      2. UNION Limitations
      3. Index Merge Optimizations
      4. Equality Propagation
      5. Parallel Execution
      6. Hash Joins
      7. Loose Index Scans
      8. MIN() and MAX()
      9. SELECT and UPDATE on the Same Table
    6. Query Optimizer Hints
    7. Optimizing Specific Types of Queries
      1. Optimizing COUNT() Queries
      2. Optimizing JOIN Queries
      3. Optimizing Subqueries
      4. Optimizing GROUP BY and DISTINCT
      5. Optimizing LIMIT and OFFSET
      6. Optimizing SQL_CALC_FOUND_ROWS
      7. Optimizing UNION
      8. Static Query Analysis
      9. Using User-Defined Variables
    8. Case Studies
      1. Building a Queue Table in MySQL
      2. Computing the Distance Between Points
      3. Using User-Defined Functions
    9. Summary
  10. 7. Advanced MySQL Features
    1. Partitioned Tables
      1. How Partitioning Works
      2. Types of Partitioning
      3. How to Use Partitioning
      4. What Can Go Wrong
      5. Optimizing Queries
      6. Merge Tables
    2. Views
      1. Updatable Views
      2. Performance Implications of Views
      3. Limitations of Views
    3. Foreign Key Constraints
    4. Storing Code Inside MySQL
      1. Stored Procedures and Functions
      2. Triggers
      3. Events
      4. Preserving Comments in Stored Code
    5. Cursors
    6. Prepared Statements
      1. Prepared Statement Optimization
      2. The SQL Interface to Prepared Statements
      3. Limitations of Prepared Statements
    7. User-Defined Functions
    8. Plugins
    9. Character Sets and Collations
      1. How MySQL Uses Character Sets
      2. Choosing a Character Set and Collation
      3. How Character Sets and Collations Affect Queries
    10. Full-Text Searching
      1. Natural-Language Full-Text Searches
      2. Boolean Full-Text Searches
      3. Full-Text Changes in MySQL 5.1
      4. Full-Text Tradeoffs and Workarounds
      5. Full-Text Configuration and Optimization
    11. Distributed (XA) Transactions
      1. Internal XA Transactions
      2. External XA Transactions
    12. The MySQL Query Cache
      1. How MySQL Checks for a Cache Hit
      2. How the Cache Uses Memory
      3. When the Query Cache Is Helpful
      4. How to Configure and Maintain the Query Cache
      5. InnoDB and the Query Cache
      6. General Query Cache Optimizations
      7. Alternatives to the Query Cache
    13. Summary
  11. 8. Optimizing Server Settings
    1. How MySQL’s Configuration Works
      1. Syntax, Scope, and Dynamism
      2. Side Effects of Setting Variables
      3. Getting Started
      4. Iterative Optimization by Benchmarking
    2. What Not to Do
    3. Creating a MySQL Configuration File
      1. Inspecting MySQL Server Status Variables
    4. Configuring Memory Usage
      1. How Much Memory Can MySQL Use?
      2. Per-Connection Memory Needs
      3. Reserving Memory for the Operating System
      4. Allocating Memory for Caches
      5. The InnoDB Buffer Pool
      6. The MyISAM Key Caches
      7. The Thread Cache
      8. The Table Cache
      9. The InnoDB Data Dictionary
    5. Configuring MySQL’s I/O Behavior
      1. InnoDB I/O Configuration
      2. MyISAM I/O Configuration
    6. Configuring MySQL Concurrency
      1. InnoDB Concurrency Configuration
      2. MyISAM Concurrency Configuration
    7. Workload-Based Configuration
      1. Optimizing for BLOB and TEXT Workloads
      2. Optimizing for Filesorts
    8. Completing the Basic Configuration
    9. Safety and Sanity Settings
    10. Advanced InnoDB Settings
    11. Summary
  12. 9. Operating System and Hardware Optimization
    1. What Limits MySQL’s Performance?
    2. How to Select CPUs for MySQL
      1. Which Is Better: Fast CPUs or Many CPUs?
      2. CPU Architecture
      3. Scaling to Many CPUs and Cores
    3. Balancing Memory and Disk Resources
      1. Random Versus Sequential I/O
      2. Caching, Reads, and Writes
      3. What’s Your Working Set?
      4. Finding an Effective Memory-to-Disk Ratio
      5. Choosing Hard Disks
    4. Solid-State Storage
      1. An Overview of Flash Memory
      2. Flash Technologies
      3. Benchmarking Flash Storage
      4. Solid-State Drives (SSDs)
      5. PCIe Storage Devices
      6. Other Types of Solid-State Storage
      7. When Should You Use Flash?
      8. Using Flashcache
      9. Optimizing MySQL for Solid-State Storage
    5. Choosing Hardware for a Replica
    6. RAID Performance Optimization
      1. RAID Failure, Recovery, and Monitoring
      2. Balancing Hardware RAID and Software RAID
      3. RAID Configuration and Caching
    7. Storage Area Networks and Network-Attached Storage
      1. SAN Benchmarks
      2. Using a SAN over NFS or SMB
      3. MySQL Performance on a SAN
      4. Should You Use a SAN?
    8. Using Multiple Disk Volumes
    9. Network Configuration
    10. Choosing an Operating System
    11. Choosing a Filesystem
    12. Choosing a Disk Queue Scheduler
    13. Threading
    14. Swapping
    15. Operating System Status
      1. How to Read vmstat Output
      2. How to Read iostat Output
      3. Other Helpful Tools
      4. A CPU-Bound Machine
      5. An I/O-Bound Machine
      6. A Swapping Machine
      7. An Idle Machine
    16. Summary
  13. 10. Replication
    1. Replication Overview
      1. Problems Solved by Replication
      2. How Replication Works
    2. Setting Up Replication
      1. Creating Replication Accounts
      2. Configuring the Master and Replica
      3. Starting the Replica
      4. Initializing a Replica from Another Server
      5. Recommended Replication Configuration
    3. Replication Under the Hood
      1. Statement-Based Replication
      2. Row-Based Replication
      3. Statement-Based or Row-Based: Which Is Better?
      4. Replication Files
      5. Sending Replication Events to Other Replicas
      6. Replication Filters
    4. Replication Topologies
      1. Master and Multiple Replicas
      2. Master-Master in Active-Active Mode
      3. Master-Master in Active-Passive Mode
      4. Master-Master with Replicas
      5. Ring Replication
      6. Master, Distribution Master, and Replicas
      7. Tree or Pyramid
      8. Custom Replication Solutions
    5. Replication and Capacity Planning
      1. Why Replication Doesn’t Help Scale Writes
      2. When Will Replicas Begin to Lag?
      3. Plan to Underutilize
    6. Replication Administration and Maintenance
      1. Monitoring Replication
      2. Measuring Replication Lag
      3. Determining Whether Replicas Are Consistent with the Master
      4. Resyncing a Replica from the Master
      5. Changing Masters
      6. Switching Roles in a Master-Master Configuration
    7. Replication Problems and Solutions
      1. Errors Caused by Data Corruption or Loss
      2. Using Nontransactional Tables
      3. Mixing Transactional and Nontransactional Tables
      4. Nondeterministic Statements
      5. Different Storage Engines on the Master and Replica
      6. Data Changes on the Replica
      7. Nonunique Server IDs
      8. Undefined Server IDs
      9. Dependencies on Nonreplicated Data
      10. Missing Temporary Tables
      11. Not Replicating All Updates
      12. Lock Contention Caused by InnoDB Locking Selects
      13. Writing to Both Masters in Master-Master Replication
      14. Excessive Replication Lag
      15. Oversized Packets from the Master
      16. Limited Replication Bandwidth
      17. No Disk Space
      18. Replication Limitations
    8. How Fast Is Replication?
    9. Advanced Features in MySQL Replication
    10. Other Replication Technologies
    11. Summary
  14. 11. Scaling MySQL
    1. What Is Scalability?
      1. A Formal Definition
    2. Scaling MySQL
      1. Planning for Scalability
      2. Buying Time Before Scaling
      3. Scaling Up
      4. Scaling Out
      5. Scaling by Consolidation
      6. Scaling by Clustering
      7. Scaling Back
    3. Load Balancing
      1. Connecting Directly
      2. Introducing a Middleman
      3. Load Balancing with a Master and Multiple Replicas
    4. Summary
  15. 12. High Availability
    1. What Is High Availability?
    2. What Causes Downtime?
    3. Achieving High Availability
      1. Improving Mean Time Between Failures
      2. Improving Mean Time to Recovery
    4. Avoiding Single Points of Failure
      1. Shared Storage or Replicated Disk
      2. Synchronous MySQL Replication
      3. Replication-Based Redundancy
    5. Failover and Failback
      1. Promoting a Replica or Switching Roles
      2. Virtual IP Addresses or IP Takeover
      3. Middleman Solutions
      4. Handling Failover in the Application
    6. Summary
  16. 13. MySQL in the Cloud
    1. Benefits, Drawbacks, and Myths of the Cloud
    2. The Economics of MySQL in the Cloud
    3. MySQL Scaling and HA in the Cloud
    4. The Four Fundamental Resources
    5. MySQL Performance in Cloud Hosting
      1. Benchmarks for MySQL in the Cloud
    6. MySQL Database as a Service (DBaaS)
      1. Amazon RDS
      2. Other DBaaS Solutions
    7. Summary
  17. 14. Application-Level Optimization
    1. Common Problems
    2. Web Server Issues
      1. Finding the Optimal Concurrency
    3. Caching
      1. Caching Below the Application
      2. Application-Level Caching
      3. Cache Control Policies
      4. Cache Object Hierarchies
      5. Pregenerating Content
      6. The Cache as an Infrastructure Component
      7. Using HandlerSocket and memcached Access
    4. Extending MySQL
    5. Alternatives to MySQL
    6. Summary
  18. 15. Backup and Recovery
    1. Why Backups?
    2. Defining Recovery Requirements
    3. Designing a MySQL Backup Solution
      1. Online or Offline Backups?
      2. Logical or Raw Backups?
      3. What to Back Up
      4. Storage Engines and Consistency
      5. Replication
    4. Managing and Backing Up Binary Logs
      1. The Binary Log Format
      2. Purging Old Binary Logs Safely
    5. Backing Up Data
      1. Making a Logical Backup
      2. Filesystem Snapshots
    6. Recovering from a Backup
      1. Restoring Raw Files
      2. Restoring Logical Backups
      3. Point-in-Time Recovery
      4. More Advanced Recovery Techniques
      5. InnoDB Crash Recovery
    7. Backup and Recovery Tools
      1. MySQL Enterprise Backup
      2. Percona XtraBackup
      3. mylvmbackup
      4. Zmanda Recovery Manager
      5. mydumper
      6. mysqldump
    8. Scripting Backups
    9. Summary
  19. 16. Tools for MySQL Users
    1. Interface Tools
    2. Command-Line Utilities
    3. SQL Utilities
    4. Monitoring Tools
      1. Open Source Monitoring Tools
      2. Commercial Monitoring Systems
      3. Command-Line Monitoring with Innotop
    5. Summary
  20. A. Forks and Variants of MySQL
    1. Percona Server
    2. MariaDB
    3. Drizzle
    4. Other MySQL Variants
    5. Summary
  21. B. MySQL Server Status
    1. System Variables
    2. SHOW STATUS
      1. Thread and Connection Statistics
      2. Binary Logging Status
      3. Command Counters
      4. Temporary Files and Tables
      5. Handler Operations
      6. MyISAM Key Buffer
      7. File Descriptors
      8. Query Cache
      9. SELECT Types
      10. Sorts
      11. Table Locking
      12. InnoDB-Specific
      13. Plugin-Specific
    3. SHOW ENGINE INNODB STATUS
      1. Header
      2. SEMAPHORES
      3. LATEST FOREIGN KEY ERROR
      4. LATEST DETECTED DEADLOCK
      5. TRANSACTIONS
      6. FILE I/O
      7. INSERT BUFFER AND ADAPTIVE HASH INDEX
      8. LOG
      9. BUFFER POOL AND MEMORY
      10. ROW OPERATIONS
    4. SHOW PROCESSLIST
    5. SHOW ENGINE INNODB MUTEX
    6. Replication Status
    7. The INFORMATION_SCHEMA
      1. InnoDB Tables
      2. Tables in Percona Server
    8. The Performance Schema
    9. Summary
  22. C. Transferring Large Files
    1. Copying Files
      1. A Naïve Example
      2. A One-Step Method
      3. Avoiding Encryption Overhead
      4. Other Options
    2. File Copy Benchmarks
  23. D. Using EXPLAIN
    1. Invoking EXPLAIN
      1. Rewriting Non-SELECT Queries
    2. The Columns in EXPLAIN
      1. The id Column
      2. The select_type Column
      3. The table Column
      4. The type Column
      5. The possible_keys Column
      6. The key Column
      7. The key_len Column
      8. The ref Column
      9. The rows Column
      10. The filtered Column
      11. The Extra Column
    3. Tree-Formatted Output
    4. Improvements in MySQL 5.6
  24. E. Debugging Locks
    1. Lock Waits at the Server Level
      1. Table Locks
      2. The Global Read Lock
      3. Name Locks
      4. User Locks
    2. Lock Waits in InnoDB
      1. Using the INFORMATION_SCHEMA Tables
  25. F. Using Sphinx with MySQL
    1. A Typical Sphinx Search
    2. Why Use Sphinx?
      1. Efficient and Scalable Full-Text Searching
      2. Applying WHERE Clauses Efficiently
      3. Finding the Top Results in Order
      4. Optimizing GROUP BY Queries
      5. Generating Parallel Result Sets
      6. Scaling
      7. Aggregating Sharded Data
    3. Architectural Overview
      1. Installation Overview
      2. Typical Partition Use
    4. Special Features
      1. Phrase Proximity Ranking
      2. Support for Attributes
      3. Filtering
      4. The SphinxSE Pluggable Storage Engine
      5. Advanced Performance Control
    5. Practical Implementation Examples
      1. Full-Text Searching on Mininova.org
      2. Full-Text Searching on BoardReader.com
      3. Optimizing Selects on Sahibinden.com
      4. Optimizing GROUP BY on BoardReader.com
      5. Optimizing Sharded JOIN Queries on Grouply.com
    6. Summary
  26. Index
  27. About the Authors
  28. Colophon
  29. Copyright
O'Reilly logo

Chapter 9. Operating System and Hardware Optimization

Your MySQL server can perform only as well as its weakest link, and the operating system and the hardware on which it runs are often limiting factors. The disk size, the available memory and CPU resources, the network, and the components that link them all limit the system’s ultimate capacity. Thus, you need to choose your hardware carefully, and configure the hardware and operating system appropriately. For example, if your workload is I/O-bound, one approach is to design your application to minimize MySQL’s I/O workload. However, it’s often smarter to upgrade the I/O subsystem, install more memory, or reconfigure existing disks.

Hardware changes very rapidly, so anything we write about particular products or components in this chapter will become outdated quickly. As usual, our goal is to help improve your understanding so that you can apply your knowledge in situations we don’t cover directly. However, we will use currently available hardware to illustrate our points.

What Limits MySQL’s Performance?

Many different hardware components can affect MySQL’s performance, but the two most frequent bottlenecks we see are CPU and I/O saturation. CPU saturation happens when MySQL works with data that either fits in memory or can be read from disk as fast as needed. A lot of datasets fit completely in memory with the large amounts of RAM available these days.

I/O saturation, on the other hand, generally happens when you need to work with ...

The best content for your career. Discover unlimited learning on demand for around $1/day.