Optimizing Linux® Performance: A Hands-On Guide to Linux® Performance Tools

Book Description

  • The first comprehensive, expert guide for end-to-end Linux application optimization

  • Learn to choose the right tools—and use them together to solve real problems in real production environments

Superior application performance is more crucial than ever—and in today's complex production environments, it's tougher to ensure, too. If you use Linux, you have extraordinary advantages: complete source code access, plus an exceptional array of optimization tools. But the tools are scattered across the Internet. Many are poorly documented. And few experts know how to use them together to solve real problems. Now, one of those experts has written the definitive Linux tuning primer for every professional: Optimizing Linux® Performance.

Renowned Linux benchmarking specialist Phillip Ezolt introduces each of today's most important Linux optimization tools, showing how they fit into a proven methodology for perfecting overall application performance. Using realistic examples, Ezolt shows developers how to pinpoint exact lines of source code that are impacting performance. He teaches sysadmins and application developers how to rapidly drill down to specific bottlenecks, so they can implement solutions more quickly. You'll discover how to:

  • Identify bottlenecks even if you're not familiar with the underlying system

  • Find and choose the right performance tools for any problem

  • Recognize the meaning of the events you're measuring

  • Optimize system CPU, user CPU, memory, network I/O, and disk I/O—and understand their interrelationships

  • Fix CPU-bound, latency-sensitive, and I/O-bound applications, through case studies you can easily adapt to your own environment

  • Install and use oprofile, the advanced systemwide profiler for Linux systems

If you're new to tuning, Ezolt gives you a clear and practical introduction to all the principles and strategies you'll need. If you're migrating to Linux, you'll quickly master Linux equivalents to the tools and techniques you already know. Whatever your background or environment, this book can help you improve the performance of all your Linux applications, increasing business value and user satisfaction at the same time.

© Copyright Pearson Education. All rights reserved.

Table of Contents

  1. Copyright
    1. Dedication
  2. Hewlett-Packard® Professional Books
  3. Preface
    1. Why Is Performance Important?
    2. Linux: Strengths and Weakness
    3. How Can This Book Help You?
    4. Why Learn How to Use Performance Tools?
    5. Can I Tune for Performance?
    6. Who Should Read This Book?
    7. How Is This Book Organized?
  4. Acknowledgments
  5. About the Author
  6. 1. Performance Hunting Tips
    1. 1.1. General Tips
      1. 1.1.1. Take Copious Notes (Save Everything)
      2. 1.1.2. Automate Redundant Tasks
      3. 1.1.3. Choose Low-Overhead Tools If Possible
      4. 1.1.4. Use Multiple Tools to Understand the Problem
      5. 1.1.5. Trust Your Tools
      6. 1.1.6. Use the Experience of Others (Cautiously)
    2. 1.2. Outline of a Performance Investigation
      1. 1.2.1. Finding a Metric, Baseline, and Target
        1. 1.2.1.1. Establish a Metric
        2. 1.2.1.2. Establish a Baseline
        3. 1.2.1.3. Establish a Target
      2. 1.2.2. Track Down the Approximate Problem
      3. 1.2.3. See Whether the Problem Has Already Been Solved
      4. 1.2.4. The Case Begins (Start to Investigate)
      5. 1.2.5. Document, Document, Document
    3. 1.3. Chapter Summary
  7. 2. Performance Tools: System CPU
    1. 2.1. CPU Performance Statistics
      1. 2.1.1. Run Queue Statistics
      2. 2.1.2. Context Switches
      3. 2.1.3. Interrupts
      4. 2.1.4. CPU Utilization
    2. 2.2. Linux Performance Tools: CPU
      1. 2.2.1. vmstat (Virtual Memory Statistics)
        1. 2.2.1.1. CPU Performance-Related Options
        2. 2.2.1.2. Example Usage
      2. 2.2.2. top (v. 2.0.x)
        1. 2.2.2.1. CPU Performance-Related Options
        2. 2.2.2.2. Example Usage
      3. 2.2.3. top (v. 3.x.x)
        1. 2.2.3.1. CPU Performance-Related Options
        2. 2.2.3.2. Example Usage
      4. 2.2.4. procinfo (Display Info from the /proc File System)
        1. 2.2.4.1. CPU Performance-Related Options
        2. 2.2.4.2. Example Usage
      5. 2.2.5. gnome-system-monitor
        1. 2.2.5.1. CPU Performance-Related Options
        2. 2.2.5.2. Example Usage
      6. 2.2.6. mpstat (Multiprocessor Stat)
        1. 2.2.6.1. CPU Performance-Related Options
        2. 2.2.6.2. Example Usage
      7. 2.2.7. sar (System Activity Reporter)
        1. 2.2.7.1. CPU Performance-Related Options
        2. 2.2.7.2. Example Usage
      8. 2.2.8. oprofile
        1. 2.2.8.1. CPU Performance-Related Options
        2. 2.2.8.2. Example Usage
    3. 2.3. Chapter Summary
  8. 3. Performance Tools: System Memory
    1. 3.1. Memory Performance Statistics
      1. 3.1.1. Memory Subsystem and Performance
      2. 3.1.2. Memory Subsystem (Virtual Memory)
        1. 3.1.2.1. Swap (Not Enough Physical Memory)
        2. 3.1.2.2. Buffers and Cache (Too Much Physical Memory)
        3. 3.1.2.3. Active Versus Inactive Memory
        4. 3.1.2.4. High Versus Low Memory
        5. 3.1.2.5. Kernel Usage of Memory (Slabs)
    2. 3.2. Linux Performance Tools: CPU and Memory
      1. 3.2.1. vmstat (Virtual Memory Statistics) II
        1. 3.2.1.1. System-Wide Memory-Related Options
        2. 3.2.1.2. Example Usage
      2. 3.2.2. top (2.x and 3.x)
        1. 3.2.2.1. Memory Performance-Related Options
        2. 3.2.2.2. Example Usage
      3. 3.2.3. procinfo II
        1. 3.2.3.1. Memory Performance-Related Options
        2. 3.2.3.2. Example Usage
      4. 3.2.4. gnome-system-monitor (II)
        1. 3.2.4.1. Memory Performance-Related Options
        2. 3.2.4.2. Example Usage
      5. 3.2.5. free
        1. 3.2.5.1. Memory Performance-Related Options
        2. 3.2.5.2. Example Usage
      6. 3.2.6. slabtop
        1. 3.2.6.1. Memory Performance-Related Options
        2. 3.2.6.2. Example Usage
      7. 3.2.7. sar (II)
        1. 3.2.7.1. Memory Performance-Related Options
        2. 3.2.7.2. Example Usage
      8. 3.2.8. /proc/meminfo
        1. 3.2.8.1. Memory Performance-Related Options
        2. 3.2.8.2. Example Usage
    3. 3.3. Chapter Summary
  9. 4. Performance Tools: Process-Specific CPU
    1. 4.1. Process Performance Statistics
      1. 4.1.1. Kernel Time Versus User Time
      2. 4.1.2. Library Time Versus Application Time
      3. 4.1.3. Subdividing Application Time
    2. 4.2. The Tools
      1. 4.2.1. time
        1. 4.2.1.1. CPU Performance-Related Options
        2. 4.2.1.2. Example Usage
      2. 4.2.2. strace
        1. 4.2.2.1. CPU Performance-Related Options
        2. 4.2.2.2. Example Usage
      3. 4.2.3. ltrace
        1. 4.2.3.1. CPU Performance-Related Options
        2. 4.2.3.2. Example Usage
      4. 4.2.4. ps (Process Status)
        1. 4.2.4.1. CPU Performance-Related Options
        2. 4.2.4.2. Example Usage
      5. 4.2.5. ld.so (Dynamic Loader)
        1. 4.2.5.1. CPU Performance-Related Options
        2. 4.2.5.2. Example Usage
      6. 4.2.6. gprof
        1. 4.2.6.1. CPU Performance-Related Options
        2. 4.2.6.2. Example Usage
      7. 4.2.7. oprofile (II)
        1. 4.2.7.1. CPU Performance-Related Options
        2. 4.2.7.2. Example Usage
      8. 4.2.8. Languages: Static (C and C++) Versus Dynamic (Java and Mono)
    3. 4.3. Chapter Summary
  10. 5. Performance Tools: Process-Specific Memory
    1. 5.1. Linux Memory Subsystem
    2. 5.2. Memory Performance Tools
      1. 5.2.1. ps
        1. 5.2.1.1. Memory Performance-Related Options
        2. 5.2.1.2. Example Usage
      2. 5.2.2. /proc/<PID>
        1. 5.2.2.1. Memory Performance-Related Options
        2. 5.2.2.2. Example Usage
      3. 5.2.3. memprof
        1. 5.2.3.1. Memory Performance-Related Options
        2. 5.2.3.2. Example Usage
      4. 5.2.4. valgrind (cachegrind)
        1. 5.2.4.1. Memory Performance-Related Options
        2. 5.2.4.2. Example Usage
      5. 5.2.5. kcachegrind
        1. 5.2.5.1. Memory Performance-Related Options
        2. 5.2.5.2. Example Usage
      6. 5.2.6. oprofile (III)
        1. 5.2.6.1. Memory Performance-Related Options
        2. 5.2.6.2. Example Usage
      7. 5.2.7. ipcs
        1. 5.2.7.1. Memory Performance-Related Options
        2. 5.2.7.2. Example Usage
      8. 5.2.8. Dynamic Languages (Java, Mono)
    3. 5.3. Chapter Summary
  11. 6. Performance Tools: Disk I/O
    1. 6.1. Introduction to Disk I/O
    2. 6.2. Disk I/O Performance Tools
      1. 6.2.1. vmstat (ii)
        1. 6.2.1.1. Disk I/O Performance-Related Options and Outputs
        2. 6.2.1.2. Example Usage
      2. 6.2.2. iostat
        1. 6.2.2.1. Disk I/O Performance-Related Options and Outputs
        2. 6.2.2.2. Example Usage
      3. 6.2.3. sar
        1. 6.2.3.1. Disk I/O Performance-Related Options and Outputs
        2. 6.2.3.2. Example Usage
      4. 6.2.4. lsof (List Open Files)
        1. 6.2.4.1. Disk I/O Performance-Related Options and Outputs
        2. 6.2.4.2. Example Usage
    3. 6.3. What’s Missing?
    4. 6.4. Chapter Summary
  12. 7. Performance Tools: Network
    1. 7.1. Introduction to Network I/O
      1. 7.1.1. Network Traffic in the Link Layer
      2. 7.1.2. Protocol-Level Network Traffic
    2. 7.2. Network Performance Tools
      1. 7.2.1. mii-tool (Media-Independent Interface Tool)
        1. 7.2.1.1. Network I/O Performance-Related Options
        2. 7.2.1.2. Example Usage
      2. 7.2.2. ethtool
        1. 7.2.2.1. Network I/O Performance-Related Options
        2. 7.2.2.2. Example Usage
      3. 7.2.3. ifconfig (Interface Configure)
        1. 7.2.3.1. Network I/O Performance-Related Options
        2. 7.2.3.2. Example Usage
      4. 7.2.4. ip
        1. 7.2.4.1. Network I/O Performance-Related Options
        2. 7.2.4.2. Example Usage
      5. 7.2.5. sar
        1. 7.2.5.1. Network I/O Performance-Related Options
        2. 7.2.5.2. Example Usage
      6. 7.2.6. gkrellm
        1. 7.2.6.1. Network I/O Performance-Related Options
        2. 7.2.6.2. Example Usage
      7. 7.2.7. iptraf
        1. 7.2.7.1. Network I/O Performance-Related Options
        2. 7.2.7.2. Example Usage
      8. 7.2.8. netstat
        1. 7.2.8.1. Network I/O Performance-Related Options
        2. 7.2.8.2. Example Usage
      9. 7.2.9. etherape
        1. 7.2.9.1. Network I/O Performance-Related Options
        2. 7.2.9.2. Example Usage
    3. 7.3. Chapter Summary
  13. 8. Utility Tools: Performance Tool Helpers
    1. 8.1. Performance Tool Helpers
      1. 8.1.1. Automating and Recording Commands
      2. 8.1.2. Graphing and Analyzing Performance Statistics
      3. 8.1.3. Investigating the Libraries That an Application Uses
      4. 8.1.4. Creating and Debugging Applications
    2. 8.2. Tools
      1. 8.2.1. bash
        1. 8.2.1.1. Performance-Related Options
        2. 8.2.1.2. Example Usage
      2. 8.2.2. tee
        1. 8.2.2.1. Performance-Related Options
        2. 8.2.2.2. Example Usage
      3. 8.2.3. script
        1. 8.2.3.1. Performance-Related Options
        2. 8.2.3.2. Example Usage
      4. 8.2.4. watch
        1. 8.2.4.1. Performance-Related Options
        2. 8.2.4.2. Example Usage
      5. 8.2.5. gnumeric
        1. 8.2.5.1. Performance-Related Options
        2. 8.2.5.2. Example Usage
      6. 8.2.6. ldd
        1. 8.2.6.1. Performance-Related Options
        2. 8.2.6.2. Example Usage
      7. 8.2.7. objdump
        1. 8.2.7.1. Performance-Related Options
        2. 8.2.7.2. Example Usage
      8. 8.2.8. GNU Debugger (gdb)
        1. 8.2.8.1. Performance-Related Options
        2. 8.2.8.2. Example Usage
      9. 8.2.9. gcc (GNU Compiler Collection)
        1. 8.2.9.1. Performance-Related Options
        2. 8.2.9.2. Example Usage
    3. 8.3. Chapter Summary
  14. 9. Using Performance Tools to Find Problems
    1. 9.1. Not Always a Silver Bullet
    2. 9.2. Starting the Hunt
    3. 9.3. Optimizing an Application
      1. 9.3.1. Is Memory Usage a Problem?
      2. 9.3.2. Is Startup Time a Problem?
      3. 9.3.3. Is the Loader Introducing a Delay?
      4. 9.3.4. Is CPU Usage (or Length of Time to Complete) a Problem?
      5. 9.3.5. Is the Application’s Disk Usage a Problem?
      6. 9.3.6. Is the Application’s Network Usage a Problem?
    4. 9.4. Optimizing a System
      1. 9.4.1. Is the System CPU-Bound?
      2. 9.4.2. Is a Single Processor CPU-Bound?
      3. 9.4.3. Are One or More Processes Using Most of the System CPU?
      4. 9.4.4. Are One or More Processes Using Most of an Individual CPU?
      5. 9.4.5. Is the Kernel Servicing Many Interrupts?
      6. 9.4.6. Where Is Time Spent in the Kernel?
      7. 9.4.7. Is the Amount of Swap Space Being Used Increasing?
      8. 9.4.8. Is the System I/O-Bound?
      9. 9.4.9. Is the System Using Disk I/O?
      10. 9.4.10. Is the System Using Network I/O?
    5. 9.5. Optimizing Process CPU Usage
      1. 9.5.1. Is the Process Spending Time in User or Kernel Space?
      2. 9.5.2. Which System Calls Is the Process Making, and How Long Do They Take to Complete?
      3. 9.5.3. In Which Functions Does the Process Spend Time?
      4. 9.5.4. What Is the Call Tree to the Hot Functions?
      5. 9.5.5. Do Cache Misses Correspond to the Hot Functions or Source Lines?
    6. 9.6. Optimizing Memory Usage
      1. 9.6.1. Is the Kernel Memory Usage Increasing?
      2. 9.6.2. What Type of Memory Is the Kernel Using?
      3. 9.6.3. Is a Particular Process’s Resident Set Size Increasing?
      4. 9.6.4. Is Shared Memory Usage Increasing?
      5. 9.6.5. Which Processes Are Using the Shared Memory?
      6. 9.6.6. What Type of Memory Is the Process Using?
      7. 9.6.7. What Functions Are Using All of the Stack?
      8. 9.6.8. What Functions Have the Biggest Text Size?
      9. 9.6.9. How Big Are the Libraries That the Process Uses?
      10. 9.6.10. What Functions Are Allocating Heap Memory?
    7. 9.7. Optimizing Disk I/O Usage
      1. 9.7.1. Is the System Stressing a Particular Disk?
      2. 9.7.2. Which Application Is Accessing the Disk?
      3. 9.7.3. Which Files Are Accessed by the Application?
    8. 9.8. Optimizing Network I/O Usage
      1. 9.8.1. Is Any Network Device Sending/Receiving Near the Theoretical Limit?
      2. 9.8.2. Is Any Network Device Generating a Large Number of Errors?
      3. 9.8.3. What Type of Traffic Is Running on That Device?
      4. 9.8.4. Is a Particular Process Responsible for That Traffic?
      5. 9.8.5. What Remote System Is Sending the Traffic?
      6. 9.8.6. Which Application Socket Is Responsible for the Traffic?
    9. 9.9. The End
    10. 9.10. Chapter Summary
  15. 10. Performance Hunt 1: A CPU-Bound Application (GIMP)
    1. 10.1. CPU-Bound Application
    2. 10.2. Identify a Problem
    3. 10.3. Find a Baseline/Set a Goal
    4. 10.4. Configure the Application for the Performance Hunt
    5. 10.5. Install and Configure Performance Tools
    6. 10.6. Run Application and Performance Tools
    7. 10.7. Analyze the Results
    8. 10.8. Jump to the Web
    9. 10.9. Increase the Image Cache
    10. 10.10. Hitting a (Tiled) Wall
    11. 10.11. Solving the Problem
    12. 10.12. Verify Correctness?
    13. 10.13. Next Steps
    14. 10.14. Chapter Summary
  16. 11. Performance Hunt 2: A Latency-Sensitive Application (nautilus)
    1. 11.1. A Latency-Sensitive Application
    2. 11.2. Identify a Problem
    3. 11.3. Find a Baseline/Set a Goal
    4. 11.4. Configure the Application for the Performance Hunt
    5. 11.5. Install and Configure Performance Tools
    6. 11.6. Run Application and Performance Tools
    7. 11.7. Compile and Examine the Source
    8. 11.8. Using gdb to Generate Call Traces
    9. 11.9. Finding the Time Differences
    10. 11.10. Trying a Possible Solution
    11. 11.11. Chapter Summary
  17. 12. Performance Hunt 3: The System-Wide Slowdown (prelink)
    1. 12.1. Investigating a System-Wide Slowdown
    2. 12.2. Identify a Problem
    3. 12.3. Find a Baseline/Set a Goal
    4. 12.4. Configure the Application for the Performance Hunt
    5. 12.5. Install and Configure Performance Tools
    6. 12.6. Run Application and Performance Tools
    7. 12.7. Simulating a Solution
    8. 12.8. Reporting the Problem
    9. 12.9. Testing the Solution
    10. 12.10. Chapter Summary
  18. 13. Performance Tools: What’s Next?
    1. 13.1. The State of Linux Tools
    2. 13.2. What Tools Does Linux Still Need?
      1. 13.2.1. Hole 1: Performance Statistics Are Scattered
      2. 13.2.2. Hole 2: No Reliable and Complete Call Tree
      3. 13.2.3. Hole 3: I/O Attribution
    3. 13.3. Performance Tuning on Linux
      1. 13.3.1. Available Source
      2. 13.3.2. Easy Access to Developers
      3. 13.3.3. Linux Is Still Young
    4. 13.4. Chapter Summary
  19. A. Performance Tool Locations
  20. B. Installing oprofile
    1. B.1 Fedora Core 2 (FC2)
    2. B.2 Enterprise Linux 3 (EL3)
    3. B.3 SUSE 9.1