IBM System Blue Gene Solution Blue Gene/Q Application Development

Book Description

This IBM® Redbooks® publication is one in a series of IBM books written specifically for the IBM System Blue Gene® supercomputer, Blue Gene/Q®, the third generation of massively parallel supercomputers from IBM in the Blue Gene series. This document provides an overview of the application development environment for the Blue Gene/Q system and describes the requirements for developing applications on this high-performance supercomputer.

This book explains the unique Blue Gene/Q programming environment. It does not provide detailed descriptions of technologies that are commonly used in the supercomputing industry, such as Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). Instead, it provides references to more detailed information about those programming models and technologies.

This document assumes that readers have a strong background in high-performance computing (HPC) programming. The high-level programming languages that are used throughout this book are C/C++ and Fortran 95. For more information about the Blue Gene/Q system, see "IBM Redbooks" under Related publications.

Table of Contents

  1. Front cover
  2. Notices
    1. Trademarks
  3. Preface
    1. Author
    2. Now you can become a published author, too!
    3. Comments welcome
    4. Stay connected to IBM Redbooks
  4. Summary of changes
    1. June 2013, Second Edition
  5. Chapter 1. System overview
    1. 1.1 Blue Gene/Q environment overview
    2. 1.2 Blue Gene/Q hardware overview
    3. 1.3 Blue Gene/Q software overview
      1. 1.3.1 System administration and management
      2. 1.3.2 Compute Node Kernel and services
      3. 1.3.3 I/O node kernel and services
      4. 1.3.4 Message Passing Interface
      5. 1.3.5 Compilers
      6. 1.3.6 Application development and debugging
  6. Chapter 2. Kernel functionality
    1. 2.1 Compute Node Kernel
      1. 2.1.1 Stateless compute nodes
      2. 2.1.2 Firmware
    2. 2.2 Role of the I/O node kernel
  7. Chapter 3. Processes
    1. 3.1 Importance of process count
    2. 3.2 Process creation
    3. 3.3 Processes per node
    4. 3.4 Determining how many processes per node to use
    5. 3.5 Specifying process count
    6. 3.6 Support for 64-bit applications
    7. 3.7 Object identifiers
      1. 3.7.1 Process identifier
      2. 3.7.2 Thread identifier
      3. 3.7.3 Thread group identifier
      4. 3.7.4 T coordinate
    8. 3.8 Sub-node jobs
    9. 3.9 Threading overview
      1. 3.9.1 Hardware thread over-commitment
    10. 3.10 Thread scheduler
      1. 3.10.1 Thread preemption
      2. 3.10.2 Thread yield
      3. 3.10.3 Round-robin dispatch
    11. 3.11 Thread affinity
      1. 3.11.1 Breadth-first assignment
      2. 3.11.2 Depth-first assignment
      3. 3.11.3 Thread affinity control
      4. 3.11.4 Setting affinity with the pthread attribute
      5. 3.11.5 Setting affinity with the system call
      6. 3.11.6 Extended thread affinity control
    12. 3.12 Thread priority
      1. 3.12.1 Setting priority through the pthread attribute
      2. 3.12.2 Explicit setting of priority
      3. 3.12.3 Hardware thread priority
  8. Chapter 4. Memory
    1. 4.1 Memory system overview
      1. 4.1.1 L1 prefetch cache overview
      2. 4.1.2 L2 cache functional overview
      3. 4.1.3 Boot eDRAM overview
    2. 4.2 Memory management
    3. 4.3 Memory protection
    4. 4.4 Shared memory
    5. 4.5 Persistent memory
    6. 4.6 Compute node ramdisk
    7. 4.7 Support for the /proc file system
    8. 4.8 L1P prefetcher
      1. 4.8.1 Linear stream prefetcher overview
      2. 4.8.2 Perfect prefetcher overview
      3. 4.8.3 L1P prefetcher API descriptions
      4. 4.8.4 Performance considerations
    9. 4.9 L2 atomic operations
    10. 4.10 Speculative execution
    11. 4.11 Support for dynamic linking
    12. 4.12 Transactional memory
  9. Chapter 5. Compute Node Kernel interfaces
    1. 5.1 Lightweight principles
    2. 5.2 Kernel access
      1. 5.2.1 Application programming interfaces
      2. 5.2.2 System programming interface
    3. 5.3 System calls
  10. Chapter 6. Parallel paradigms
    1. 6.1 Programming model
    2. 6.2 Blue Gene/Q MPI implementation
      1. 6.2.1 High-performance network for efficient parallel execution
      2. 6.2.2 Forcing MPI to allocate too much memory
      3. 6.2.3 Not waiting for the MPI_Test function
      4. 6.2.4 Flooding the network with messages
      5. 6.2.5 Deadlocking the system
      6. 6.2.6 Violating MPI buffer ownership rules
      7. 6.2.7 Buffer alignment sensitivity
    3. 6.3 Blue Gene/Q MPI extensions
      1. 6.3.1 Changing class-route usage at run time
      2. 6.3.2 Determining hardware properties
    4. 6.4 MPI functions
    5. 6.5 Compiling MPI programs on the Blue Gene/Q system
    6. 6.6 OpenMP
      1. 6.6.1 OpenMP implementation for Blue Gene/Q
    7. 6.7 Multiple Program, Multiple Data
  11. Chapter 7. Developing applications with Blue Gene/Q compilers
    1. 7.1 Programming environment overview
    2. 7.2 Compilers for the Blue Gene/Q system
      1. 7.2.1 IBM XL compilers
      2. 7.2.2 GNU Compiler Collection
      3. 7.2.3 Python interpreter
      4. 7.2.4 Toolchain tools
    3. 7.3 Compiling and linking applications on the Blue Gene/Q system
    4. 7.4 Compiler options specific to the Blue Gene/Q system
      1. 7.4.1 Options for the Blue Gene/Q system
      2. 7.4.2 Unsupported compiler options
    5. 7.5 Support for pthreads and OpenMP
      1. 7.5.1 Thread stack size for the Blue Gene/Q system
    6. 7.6 Creating libraries on the Blue Gene/Q system
    7. 7.7 Running dynamically linked applications on the Blue Gene/Q system
      1. 7.7.1 Creating a program
      2. 7.7.2 Creating a shared library
      3. 7.7.3 Running a Blue Gene/Q dynamically linked program on a front end node
      4. 7.7.4 Running a dynamically linked program on the Blue Gene/Q system
      5. 7.7.5 Tools for dynamic linking
    8. 7.8 Mathematical Acceleration Subsystem Libraries
    9. 7.9 Engineering and Scientific Subroutine Libraries
    10. 7.10 Cross-compilation on the Blue Gene/Q system
      1. 7.10.1 Configuring and building on an I/O node used as a front end node
      2. 7.10.2 Using implicit program launching from a front end node
    11. 7.11 Python support
      1. 7.11.1 Using the Python interpreter in a cross-compiled environment
      2. 7.11.2 Running the Python interpreter on the Blue Gene/Q system
    12. 7.12 Using the QPX floating‑point unit
      1. 7.12.1 Using SIMD instructions in applications
  12. Chapter 8. Running and debugging applications
    1. 8.1 Running applications
      1. 8.1.1 IBM LoadLeveler
    2. 8.2 Debugging applications
      1. 8.2.1 General debugging architecture
      2. 8.2.2 GNU Project Debugger
      3. 8.2.3 Coreprocessor debugger
      4. 8.2.4 The addr2line utility
    3. 8.3 What to do when a job fails
    4. 8.4 Debugging jobs
      1. 8.4.1 The snapbug tool
      2. 8.4.2 The Coreprocessor tool
  13. Appendix A. Mapping
    1. Mapping overview
    2. General guidance
  14. Appendix B. Blue Gene/Q personality
    1. Personality of Blue Gene/Q nodes
    2. Examples of retrieving Blue Gene/Q personality information
  15. Appendix C. PAMI and MPI header files and libraries
    1. Blue Gene/Q applications
  16. Appendix D. MPI and CNK environment variables
    1. Message Passing Interface environment variables
    2. Compute Node Kernel environment variables
    3. Setting environment variables
  17. Appendix E. Using GNU profiling
    1. Using the Blue Gene/Q gmon tool
    2. Profiling with the GNU toolchain
  18. Appendix F. Hardware performance counters
    1. Blue Gene Hardware Performance Monitoring API
    2. Performance Application Programming Interface
  19. Appendix G. Requirements for C++ programming in a failover environment
  20. Related publications
    1. IBM Redbooks
    2. Other publications
    3. Online resources
    4. How to get IBM Redbooks
    5. Help from IBM
  21. References
  22. Back cover