Systems Performance: Enterprise and the Cloud

Book description

The Complete Guide to Optimizing Systems Performance

Written by the winner of the 2013 LISA Award for Outstanding Achievement in System Administration

Large-scale enterprise, cloud, and virtualized computing systems have introduced serious performance challenges. Now, internationally renowned performance expert Brendan Gregg has brought together proven methodologies, tools, and metrics for analyzing and tuning even the most complex environments. Systems Performance: Enterprise and the Cloud focuses on Linux® and Unix® performance, while illuminating performance issues that are relevant to all operating systems. You’ll gain deep insight into how systems work and perform, and learn methodologies for analyzing and improving system and application performance. Gregg presents examples from bare-metal systems and virtualized cloud tenants running Linux-based Ubuntu®, Fedora®, CentOS, and the illumos-based Joyent® SmartOS™ and OmniTI OmniOS®. He systematically covers modern systems performance, including the “traditional” analysis of CPUs, memory, disks, and networks, and new areas including cloud computing and dynamic tracing. This book also helps you identify and fix the “unknown unknowns” of complex performance: bottlenecks that emerge from elements and interactions you were not aware of. The text concludes with a detailed case study, showing how a real cloud customer issue was analyzed from start to finish.

Coverage includes

• Modern performance analysis and tuning: terminology, concepts, models, methods, and techniques

• Dynamic tracing techniques and tools, including examples of DTrace, SystemTap, and perf

• Kernel internals: uncovering what the OS is doing

• Using system observability tools, interfaces, and frameworks

• Understanding and monitoring application performance

• Optimizing CPUs: processors, cores, hardware threads, caches, interconnects, and kernel scheduling

• Memory optimization: virtual memory, paging, swapping, memory architectures, buses, address spaces, and allocators

• File system I/O, including caching

• Storage devices/controllers, disk I/O workloads, RAID, and kernel I/O

• Network-related performance issues: protocols, sockets, interfaces, and physical connections

• Performance implications of OS and hardware-based virtualization, and new issues encountered with cloud computing

• Benchmarking: getting accurate results and avoiding common mistakes

This guide is indispensable for anyone who operates enterprise or cloud environments: system, network, database, and web admins; developers; and other professionals. For students and others new to optimization, it also provides exercises reflecting Gregg’s extensive instructional experience.

Table of contents

  1. Cover Page
  2. About This eBook
  3. Title Page
  4. Copyright Page
  5. Contents
  6. Preface
    1. About This Book
    2. Operating System Coverage
    3. Other Content
    4. What Isn’t Covered
    5. How This Book Is Structured
    6. As a Future Reference
    7. Tracing Examples
    8. Intended Audience
    9. Typographic Conventions
    10. Supplemental Material and References
  7. Acknowledgments
  8. About the Author
  9. 1. Introduction
    1. 1.1. Systems Performance
    2. 1.2. Roles
    3. 1.3. Activities
    4. 1.4. Perspectives
    5. 1.5. Performance Is Challenging
    6. 1.6. Latency
    7. 1.7. Dynamic Tracing
    8. 1.8. Cloud Computing
    9. 1.9. Case Studies
  10. 2. Methodology
    1. 2.1. Terminology
    2. 2.2. Models
    3. 2.3. Concepts
    4. 2.4. Perspectives
    5. 2.5. Methodology
    6. 2.6. Modeling
    7. 2.7. Capacity Planning
    8. 2.8. Statistics
    9. 2.9. Monitoring
    10. 2.10. Visualizations
    11. 2.11. Exercises
    12. 2.12. References
  11. 3. Operating Systems
    1. 3.1. Terminology
    2. 3.2. Background
    3. 3.3. Kernels
    4. 3.4. Exercises
    5. 3.5. References
  12. 4. Observability Tools
    1. 4.1. Tool Types
    2. 4.2. Observability Sources
    3. 4.3. DTrace
    4. 4.4. SystemTap
    5. 4.5. perf
    6. 4.6. Observing Observability
    7. 4.7. Exercises
    8. 4.8. References
  13. 5. Applications
    1. 5.1. Application Basics
    2. 5.2. Application Performance Techniques
    3. 5.3. Programming Languages
    4. 5.4. Methodology and Analysis
    5. 5.5. Exercises
    6. 5.6. References
  14. 6. CPUs
    1. 6.1. Terminology
    2. 6.2. Models
    3. 6.3. Concepts
    4. 6.4. Architecture
    5. 6.5. Methodology
    6. 6.6. Analysis
    7. 6.7. Experimentation
    8. 6.8. Tuning
    9. 6.9. Exercises
    10. 6.10. References
  15. 7. Memory
    1. 7.1. Terminology
    2. 7.2. Concepts
    3. 7.3. Architecture
    4. 7.4. Methodology
    5. 7.5. Analysis
    6. 7.6. Tuning
    7. 7.7. Exercises
    8. 7.8. References
  16. 8. File Systems
    1. 8.1. Terminology
    2. 8.2. Models
    3. 8.3. Concepts
    4. 8.4. Architecture
    5. 8.5. Methodology
    6. 8.6. Analysis
    7. 8.7. Experimentation
    8. 8.8. Tuning
    9. 8.9. Exercises
    10. 8.10. References
  17. 9. Disks
    1. 9.1. Terminology
    2. 9.2. Models
    3. 9.3. Concepts
    4. 9.4. Architecture
    5. 9.5. Methodology
    6. 9.6. Analysis
    7. 9.7. Experimentation
    8. 9.8. Tuning
    9. 9.9. Exercises
    10. 9.10. References
  18. 10. Network
    1. 10.1. Terminology
    2. 10.2. Models
    3. 10.3. Concepts
    4. 10.4. Architecture
    5. 10.5. Methodology
    6. 10.6. Analysis
    7. 10.7. Experimentation
    8. 10.8. Tuning
    9. 10.9. Exercises
    10. 10.10. References
  19. 11. Cloud Computing
    1. 11.1. Background
    2. 11.2. OS Virtualization
    3. 11.3. Hardware Virtualization
    4. 11.4. Comparisons
    5. 11.5. Exercises
    6. 11.6. References
  20. 12. Benchmarking
    1. 12.1. Background
    2. 12.2. Benchmarking Types
    3. 12.3. Methodology
    4. 12.4. Benchmark Questions
    5. 12.5. Exercises
    6. 12.6. References
  21. 13. Case Study
    1. 13.1. Case Study: The Red Whale
    2. 13.2. Comments
    3. 13.3. Additional Information
    4. 13.4. References
  22. Appendix A. USE Method: Linux
    1. Physical Resources
    2. Software Resources
    3. Reference
  23. Appendix B. USE Method: Solaris
    1. Physical Resources
    2. Software Resources
    3. References
  24. Appendix C. sar Summary
    1. Linux
    2. Solaris
  25. Appendix D. DTrace One-Liners
    1. syscall Provider
    2. proc Provider
    3. profile Provider
    4. sched Provider
    5. fbt Provider
    6. pid Provider
    7. io Provider
    8. sysinfo Provider
    9. vminfo Provider
    10. ip Provider
    11. tcp Provider
    12. udp Provider
  26. Appendix E. DTrace to SystemTap
    1. Functionality
    2. Terminology
    3. Probes
    4. Built-in Variables
    5. Functions
    6. Example 1: Listing syscall Entry Probes
    7. Example 2: Summarize read() Returned Size
    8. Example 3: Count syscalls by Process Name
    9. Example 4: Count syscalls by syscall Name, for Process ID 123
    10. Example 5: Count syscalls by syscall Name, for “httpd” Processes
    11. Example 6: Trace File open()s with Process Name and Path Name
    12. Example 7: Summarize read() Latency for “mysqld” Processes
    13. Example 8: Trace New Processes with Process Name and Arguments
    14. Example 9: Sample Kernel Stacks at 100 Hz
    15. References
  27. Appendix F. Solutions to Selected Exercises
    1. Chapter 2—Methodology
    2. Chapter 3—Operating Systems
    3. Chapter 6—CPUs
    4. Chapter 7—Memory
    5. Chapter 8—File Systems
    6. Chapter 9—Disks
    7. Chapter 11—Cloud Computing
  28. Appendix G. Systems Performance Who’s Who
  29. Glossary
  30. Bibliography
  31. Index

Product information

  • Title: Systems Performance: Enterprise and the Cloud
  • Author(s): Brendan Gregg
  • Release date: October 2013
  • Publisher(s): Addison-Wesley Professional
  • ISBN: 9780133390124