High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI

Book description

To the outside world, a "supercomputer" appears to be a single system. In fact, it's a cluster of computers that share a local area network and work together as a team on a single problem. Many businesses used to consider supercomputing beyond the reach of their budgets, but new Linux applications have made high-performance clusters more affordable than ever. These days, the promise of low-cost supercomputing is one of the main reasons many businesses choose Linux over other operating systems.

This new guide covers everything a newcomer to clustering will need to plan, build, and deploy a high-performance Linux cluster. The book focuses on clustering for high-performance computation, although much of its information also applies to clustering for high availability (failover and disaster recovery). It discusses the key tools you'll need to get started, including good practices to use while exploring the tools and growing a system. You'll learn about planning, hardware choices, bulk installation of Linux on multiple systems, and other basic considerations. Then you'll learn about software options that can save you hours, or even weeks, of deployment time.

Since a wide variety of options exist in each area of clustering software, the author discusses the pros and cons of the major free software projects and chooses those that are most likely to be helpful to new cluster administrators and programmers. A few of the projects introduced in the book include:

  • MPI, the most popular programming library for clusters. This book offers simple but realistic introductory examples along with some pointers for advanced use.

  • OSCAR and Rocks, two comprehensive installation and administrative systems

  • openMosix, a set of Linux kernel extensions that migrate processes transparently for load balancing, making it a convenient tool for distributing jobs

  • PVFS, one of the parallel filesystems that make clustering I/O easier

  • C3, a set of commands for administering multiple systems

Ganglia, OpenPBS, and cloning tools (Kickstart, SIS, and g4u) are also covered. The book looks at cluster installation packages (OSCAR and Rocks) and then considers the core packages individually, either for greater depth or for readers who want to do a custom installation. Guidelines for debugging, profiling, performance tuning, and managing jobs from multiple users round out this immensely useful book.

Table of Contents

  1. High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI
  2. A Note Regarding Supplemental Files
  3. Preface
    1. Audience
    2. Organization
    3. Conventions
    4. How to Contact Us
    5. Using Code Examples
    6. Acknowledgments
  4. I. An Introduction to Clusters
    1. 1. Cluster Architecture
      1. 1.1. Modern Computing and the Role of Clusters
        1. 1.1.1. Uniprocessor Computers
        2. 1.1.2. Multiple Processors
          1. 1.1.2.1. Centralized multiprocessors
          2. 1.1.2.2. Multicomputers
          3. 1.1.2.3. Cluster structure
      2. 1.2. Types of Clusters
      3. 1.3. Distributed Computing and Clusters
      4. 1.4. Limitations
        1. 1.4.1. Amdahl's Law
      5. 1.5. My Biases
    2. 2. Cluster Planning
      1. 2.1. Design Steps
      2. 2.2. Determining Your Cluster's Mission
        1. 2.2.1. What Is Your User Base?
        2. 2.2.2. How Heavily Will the Cluster Be Used?
        3. 2.2.3. What Kinds of Software Will You Run on the Cluster?
        4. 2.2.4. How Much Control Do You Need?
        5. 2.2.5. Will This Be a Dedicated or Shared Cluster?
        6. 2.2.6. What Resources Do You Have?
        7. 2.2.7. How Will Cluster Access Be Managed?
        8. 2.2.8. What Is the Extent of Your Cluster?
        9. 2.2.9. What Security Concerns Do You Have?
      3. 2.3. Architecture and Cluster Software
        1. 2.3.1. System Software
        2. 2.3.2. Programming Software
        3. 2.3.3. Control and Management
      4. 2.4. Cluster Kits
      5. 2.5. CD-ROM-Based Clusters
        1. 2.5.1. BCCD
      6. 2.6. Benchmarks
    3. 3. Cluster Hardware
      1. 3.1. Design Decisions
        1. 3.1.1. Node Hardware
          1. 3.1.1.1. CPUs and motherboards
          2. 3.1.1.2. Memory and disks
          3. 3.1.1.3. Monitors, keyboards, and mice
          4. 3.1.1.4. Adapters, power supplies, and cases
        2. 3.1.2. Cluster Head and Servers
        3. 3.1.3. Cluster Network
      2. 3.2. Environment
        1. 3.2.1. Cluster Layout
        2. 3.2.2. Power and Air Conditioning
          1. 3.2.2.1. Power
          2. 3.2.2.2. HVAC
        3. 3.2.3. Physical Security
    4. 4. Linux for Clusters
      1. 4.1. Installing Linux
        1. 4.1.1. Selecting a Distribution
        2. 4.1.2. Downloading Linux
        3. 4.1.3. What to Install?
      2. 4.2. Configuring Services
        1. 4.2.1. DHCP
        2. 4.2.2. NFS
          1. 4.2.2.1. Running NFS
          2. 4.2.2.2. Automount
        3. 4.2.3. Other Cluster File System
        4. 4.2.4. SSH
          1. 4.2.4.1. Using SSH
        5. 4.2.5. Other Services and Configuration Tasks
          1. 4.2.5.1. Apache
          2. 4.2.5.2. Network Time Protocol (NTP)
          3. 4.2.5.3. Virtual Network Computing (VNC)
          4. 4.2.5.4. Multicasting
          5. 4.2.5.5. Hosts file and name services
      3. 4.3. Cluster Security
  5. II. Getting Started Quickly
    1. 5. openMosix
      1. 5.1. What Is openMosix?
      2. 5.2. How openMosix Works
      3. 5.3. Selecting an Installation Approach
      4. 5.4. Installing a Precompiled Kernel
        1. 5.4.1. Downloading
        2. 5.4.2. Installing
        3. 5.4.3. Configuration Changes
      5. 5.5. Using openMosix
        1. 5.5.1. User Tools
          1. 5.5.1.1. mps and mtop
          2. 5.5.1.2. migrate
          3. 5.5.1.3. mosctl
          4. 5.5.1.4. mosmon
          5. 5.5.1.5. mosrun
          6. 5.5.1.6. setpe
        2. 5.5.2. openMosixView
        3. 5.5.3. Testing openMosix
      6. 5.6. Recompiling the Kernel
      7. 5.7. Is openMosix Right for You?
    2. 6. OSCAR
      1. 6.1. Why OSCAR?
      2. 6.2. What's in OSCAR
      3. 6.3. Installing OSCAR
        1. 6.3.1. Prerequisites
        2. 6.3.2. Network Configuration
        3. 6.3.3. Loading Software on Your Server
        4. 6.3.4. A Basic OSCAR Installation
          1. 6.3.4.1. Step 0: Downloading additional packages
          2. 6.3.4.2. Step 1: Package selection
          3. 6.3.4.3. Step 2: Configuring packages
          4. 6.3.4.4. Step 3: Installing server software
          5. 6.3.4.5. Step 4: Building a client image
          6. 6.3.4.6. Step 5: Defining clients
          7. 6.3.4.7. Step 6: Setting up the network
          8. 6.3.4.8. Step 7: Completing the setup
          9. 6.3.4.9. Step 8: Testing
        5. 6.3.5. Custom Installations
        6. 6.3.6. Changes OSCAR Makes
        7. 6.3.7. Making Changes
      4. 6.4. Security and OSCAR
        1. 6.4.1. pfilter
        2. 6.4.2. SSH and OPIUM
      5. 6.5. Using switcher
      6. 6.6. Using LAM/MPI with OSCAR
    3. 7. Rocks
      1. 7.1. Installing Rocks
        1. 7.1.1. Prerequisites
        2. 7.1.2. Downloading Rocks
        3. 7.1.3. Installing the Frontend
        4. 7.1.4. Install Compute Nodes
        5. 7.1.5. Customizing the Frontend
          1. 7.1.5.1. User management with 411
          2. 7.1.5.2. X Window System
        6. 7.1.6. Customizing Compute Nodes
          1. 7.1.6.1. Adding packages
          2. 7.1.6.2. Changing disk partitions
          3. 7.1.6.3. Other changes
      2. 7.2. Managing Rocks
      3. 7.3. Using MPICH with Rocks
  6. III. Building Custom Clusters
    1. 8. Cloning Systems
      1. 8.1. Configuring Systems
        1. 8.1.1. Distributing Files
          1. 8.1.1.1. Pushing files with rsync
      2. 8.2. Automating Installations
        1. 8.2.1. Kickstart
          1. 8.2.1.1. Configuration file
          2. 8.2.1.2. Using Kickstart
        2. 8.2.2. g4u
        3. 8.2.3. SystemImager
          1. 8.2.3.1. Image server setup
          2. 8.2.3.2. Golden client setup
          3. 8.2.3.3. Retrieving the image
          4. 8.2.3.4. Cloning the systems
          5. 8.2.3.5. Other tasks
      3. 8.3. Notes for OSCAR and Rocks Users
    2. 9. Programming Software
      1. 9.1. Programming Languages
      2. 9.2. Selecting a Library
      3. 9.3. LAM/MPI
        1. 9.3.1. Installing LAM/MPI
        2. 9.3.2. User Configuration
        3. 9.3.3. Using LAM/MPI
        4. 9.3.4. Testing the Installation
      4. 9.4. MPICH
        1. 9.4.1. Installing
        2. 9.4.2. User Configuration
        3. 9.4.3. Using MPICH
        4. 9.4.4. Testing the Installation
        5. 9.4.5. MPE
      5. 9.5. Other Programming Software
        1. 9.5.1. Debuggers
        2. 9.5.2. HDF5
        3. 9.5.3. SPRNG
      6. 9.6. Notes for OSCAR Users
        1. 9.6.1. Adding MPE
      7. 9.7. Notes for Rocks Users
    3. 10. Management Software
      1. 10.1. C3
        1. 10.1.1. Installing C3
        2. 10.1.2. Using C3 Commands
          1. 10.1.2.1. cexec
          2. 10.1.2.2. cget
          3. 10.1.2.3. ckill
          4. 10.1.2.4. cpush
          5. 10.1.2.5. crm
          6. 10.1.2.6. cshutdown
          7. 10.1.2.7. clist, cname, and cnum
          8. 10.1.2.8. Further examples and comments
      2. 10.2. Ganglia
        1. 10.2.1. Installing and Using Ganglia
          1. 10.2.1.1. RRDTool
          2. 10.2.1.2. Apache and PHP
          3. 10.2.1.3. Ganglia monitor core
          4. 10.2.1.4. Web frontend
      3. 10.3. Notes for OSCAR and Rocks Users
    4. 11. Scheduling Software
      1. 11.1. OpenPBS
        1. 11.1.1. Architecture
        2. 11.1.2. Installing OpenPBS
        3. 11.1.3. Configuring PBS
        4. 11.1.4. Managing PBS
        5. 11.1.5. Using PBS
        6. 11.1.6. PBS's GUI
        7. 11.1.7. Maui Scheduler
      2. 11.2. Notes for OSCAR and Rocks Users
    5. 12. Parallel Filesystems
      1. 12.1. PVFS
        1. 12.1.1. Installing PVFS on the Head Node
        2. 12.1.2. Configuring the Metadata Server
        3. 12.1.3. I/O Server Setup
        4. 12.1.4. Client Setup
        5. 12.1.5. Running PVFS
          1. 12.1.5.1. Troubleshooting
      2. 12.2. Using PVFS
      3. 12.3. Notes for OSCAR and Rocks Users
  7. IV. Cluster Programming
    1. 13. Getting Started with MPI
      1. 13.1. MPI
        1. 13.1.1. Core MPI
          1. 13.1.1.1. MPI_Init
          2. 13.1.1.2. MPI_Finalize
          3. 13.1.1.3. MPI_Comm_size
          4. 13.1.1.4. MPI_Comm_rank
          5. 13.1.1.5. MPI_Get_processor_name
      2. 13.2. A Simple Problem
        1. 13.2.1. Background
        2. 13.2.2. Single-Processor Program
      3. 13.3. An MPI Solution
        1. 13.3.1. A C Solution
        2. 13.3.2. Transferring Data
          1. 13.3.2.1. MPI_Send
          2. 13.3.2.2. MPI_Recv
        3. 13.3.3. MPI Using FORTRAN
        4. 13.3.4. MPI Using C++
      4. 13.4. I/O with MPI
      5. 13.5. Broadcast Communications
        1. 13.5.1. Broadcast Functions
          1. 13.5.1.1. MPI_Bcast
          2. 13.5.1.2. MPI_Reduce
    2. 14. Additional MPI Features
      1. 14.1. More on Point-to-Point Communication
        1. 14.1.1. Non-Blocking Communication
          1. 14.1.1.1. MPI_Isend and MPI_Irecv
          2. 14.1.1.2. MPI_Wait
          3. 14.1.1.3. MPI_Test
          4. 14.1.1.4. MPI_Iprobe
          5. 14.1.1.5. MPI_Cancel
          6. 14.1.1.6. MPI_Sendrecv and MPI_Sendrecv_replace
      2. 14.2. More on Collective Communication
        1. 14.2.1. Gather and Scatter
          1. 14.2.1.1. MPI_Gather
          2. 14.2.1.2. MPI_Scatter
      3. 14.3. Managing Communicators
        1. 14.3.1. Communicator Commands
          1. 14.3.1.1. MPI_Comm_group
          2. 14.3.1.2. MPI_Group_incl and MPI_Group_excl
          3. 14.3.1.3. MPI_Comm_create
          4. 14.3.1.4. MPI_Comm_free and MPI_Group_free
          5. 14.3.1.5. MPI_Comm_split
      4. 14.4. Packaging Data
        1. 14.4.1. User-Defined Types
          1. 14.4.1.1. MPI_Type_struct
          2. 14.4.1.2. MPI_Type_commit
        2. 14.4.2. Packing Data
          1. 14.4.2.1. MPI_Pack
          2. 14.4.2.2. MPI_Unpack
    3. 15. Designing Parallel Programs
      1. 15.1. Overview
      2. 15.2. Problem Decomposition
        1. 15.2.1. Decomposition Strategies
          1. 15.2.1.1. Data decomposition
          2. 15.2.1.2. Control decomposition
      3. 15.3. Mapping Tasks to Processors
        1. 15.3.1. Communication Overhead
        2. 15.3.2. Load Balancing
      4. 15.4. Other Considerations
        1. 15.4.1. Parallel I/O
        2. 15.4.2. MPI-IO Functions
          1. 15.4.2.1. MPI_File_open
          2. 15.4.2.2. MPI_File_seek
          3. 15.4.2.3. MPI_File_read
          4. 15.4.2.4. MPI_File_close
        3. 15.4.3. Random Numbers
    4. 16. Debugging Parallel Programs
      1. 16.1. Debugging and Parallel Programs
      2. 16.2. Avoiding Problems
      3. 16.3. Programming Tools
      4. 16.4. Rereading Code
      5. 16.5. Tracing with printf
      6. 16.6. Symbolic Debuggers
        1. 16.6.1. gdb
        2. 16.6.2. ddd
      7. 16.7. Using gdb and ddd with MPI
      8. 16.8. Notes for OSCAR and Rocks Users
    5. 17. Profiling Parallel Programs
      1. 17.1. Why Profile?
      2. 17.2. Writing and Optimizing Code
      3. 17.3. Timing Complete Programs
      4. 17.4. Timing C Code Segments
        1. 17.4.1. Manual Timing with MPI
        2. 17.4.2. MPI Functions
          1. 17.4.2.1. MPI_Wtime
          2. 17.4.2.2. MPI_Wtick
          3. 17.4.2.3. MPI_Barrier
        3. 17.4.3. PMPI
      5. 17.5. Profilers
        1. 17.5.1. gprof
        2. 17.5.2. gcov
        3. 17.5.3. Profiling Parallel Programs with gprof and gcov
      6. 17.6. MPE
        1. 17.6.1. Using MPE
      7. 17.7. Customized MPE Logging
      8. 17.8. Notes for OSCAR and Rocks Users
  8. V. Appendix
    1. A. References
      1. A.1. Books
      2. A.2. URLs
        1. A.2.1. General Cluster Information
        2. A.2.2. Linux
        3. A.2.3. Cluster Software
        4. A.2.4. Grid Computing and Tools
        5. A.2.5. Cloning and Management Software
        6. A.2.6. Filesystems
        7. A.2.7. Parallel Benchmarks
        8. A.2.8. Programming Software
        9. A.2.9. Scheduling Software
        10. A.2.10. System Software and Utilities
  9. About the Author
  10. Colophon
  11. Copyright