OpenStack Swift

Book Description

Get up and running with OpenStack Swift, the free, open source solution for deploying high-performance object storage clusters at scale. In this practical guide, Joe Arnold, co-founder and CEO of SwiftStack, brings you up to speed on the basic concepts of object storage and walks you through what you need to know to plan, build, operate, and measure the performance of your own Swift storage system.

Table of Contents

  1. Preface
    1. Why This Book?
    2. Who Should Read This Book?
    3. What’s in This Book?
    4. Conventions Used in This Book
    5. Using Code Examples
    6. Safari® Books Online
    7. How to Contact Us
    8. Acknowledgments
  2. I. Fundamentals and Architecture
    1. 1. The Evolution of Storage
      1. Storage Needs for Today’s Data
        1. The Growth of Data: Exabytes, Hellabytes, and Beyond
        2. Requirements for Storing Unstructured Data
      2. No One-Size-Fits-All Storage System
      3. Object Storage Compared with Other Storage Types
      4. A New Storage Architecture: Software-Defined Storage
      5. Software-Defined Storage Components
        1. Benefits of Software-Defined Storage
      6. Why OpenStack Swift?
      7. Conclusion
    2. 2. Meet Swift
      1. Meet SwiftStack
    3. 3. Swift’s Data Model and Architecture
      1. Swift Data Model
      2. Swift Architecture
      3. Server Processes
      4. Consistency Processes
      5. Locating the Data
        1. Ring Basics: Hash Functions
        2. Ring Basics: Consistent Hashing Ring
        3. The Rings: Modified Consistent Hashing Ring
          1. Partitions
          2. Partition power
          3. Replica count
          4. Replica locks
        4. Distribution of Data
      6. Creating and Updating the Rings
        1. Creating or Updating Builder Files
        2. Rebalancing the Rings
        3. Inside the Rings
      7. Conclusion
    4. 4. Swift Basics
      1. Talking to the Cluster: The Swift API
      2. Sending a Request
        1. Storage URL
        2. Authentication
        3. HTTP Verbs
      3. Authorization and Taking Action
      4. Getting a Response
      5. Communication Tools
        1. Command-Line Interfaces
          1. Using cURL
          2. Using Swift
          3. Swift CLI subcommands
        2. Custom Client Applications
      6. Example Scenarios
      7. Conclusion
  3. II. Application Design with Swift
    1. 5. Overview of the Swift API
      1. What Is an API, Anyway?
      2. The CAP Theorem
      3. Swift’s Sweet Spot: High Availability, Redundancy, and Throughput
      4. Swift API: Background
        1. Review of the Hypertext Transfer Protocol (HTTP)
        2. Representational State Transfer (REST)
        3. Swift, HTTP, and REST
      5. Using the Swift API
        1. About Your Swift Cluster
        2. Authentication
        3. Retrieving Data
        4. Storing Data
        5. Deleting Data
        6. Updating Metadata
      6. Conclusion
    2. 6. Swift Client Libraries
      1. Client Libraries
      2. The Authentication Exchange
      3. Storage Requests: Basic Usage
      4. Client Libraries in Other Languages
        1. Ruby
        2. PHP
        3. Java
      5. Storage Requests: Advanced Usage
      6. Additional Considerations When Using Python
      7. Conclusion
    3. 7. Advanced API Features
      1. Large Objects
      2. Object Versioning
      3. Object Expiration
      4. Temporary URL Middleware (TempURL)
      5. Form Post Middleware
      6. Custom Metadata
      7. PUTting and POSTing Metadata
      8. Cross-Origin Resource Sharing (CORS)
      9. Swift Cluster Info
      10. Range Requests
      11. Domain Remap Middleware
      12. Static Web Hosting
      13. Content-Type Header
      14. Bulk Operations Middleware
      15. Code Samples
        1. Static Large Objects
        2. Dynamic Large Objects
        3. Object Versioning
        4. TempURL (Time-Limited URLs)
        5. Form Post
        6. Cross-Origin Resource Sharing
        7. Custom Metadata
        8. Swift Cluster Info
        9. Range Requests
        10. Domain Remapping
        11. Static Web Hosting
        12. Content-Type
        13. Bulk Upload
        14. Bulk Delete
      16. Conclusion
    4. 8. Developing Swift Middleware
      1. Introduction to WSGI
      2. Programming WSGI
      3. Streaming and Making Modifications to Data
      4. Configuring Middleware Through Paste
      5. How to Write Swift Middleware
      6. Inside Out
      7. Some Simple Examples
      8. Doing More in Middleware
      9. A Look Back and a Look Forward
      10. Conclusion
  4. III. Installing Swift
    1. 9. Installing OpenStack Swift from Source
      1. Downloading OpenStack Swift
        1. Dependencies
        2. Installing the Swift CLI (python-swiftclient)
        3. Installing Swift
        4. Copying in Swift Configuration Files
      2. Configuring Swift
        1. Adding Drives to Swift
          1. Finding drives
          2. Labeling drives
          3. Mounting drives
          4. Swift user
          5. Creating scripts to mount the devices on boot
        2. Storage Policies
          1. Creating storage policies
        3. Creating the Ring Builder Files
          1. The create command
          2. Partition power
          3. Replicas
          4. Minimum part hours
          5. Running the create command
        4. Adding Devices to the Builder Files
          1. Region
          2. Zones
          3. IP
          4. Port
          5. Weight
        5. Adding Drives
        6. Building the Rings
      3. Configuring Swift Logging
        1. Creating the Log Configuration File
        2. Restarting Rsyslog to Begin Swift Logging
      4. Configuring a Proxy Server
        1. Setting the Hash Path Prefix and Suffix
        2. Starting the Proxy Server
      5. Setting up TempAuth Authentication and Authorization with Swift
        1. Starting memcached
        2. Adding Users to proxy-server.conf
        3. Starting the Servers and Restarting the Proxy
        4. Account Authentication
      6. Verifying Account Access
      7. Creating a Container
      8. Uploading an Object
      9. Starting the Consistency Processes
        1. Configuring rsync
        2. Starting the Remaining Consistency Processes
      10. Conclusion
    2. 10. Installing SwiftStack
      1. SwiftStack Controller and Node Overview
        1. SwiftStack Controller
          1. Deployment automation
          2. Ring management
          3. Node and cluster monitoring
        2. SwiftStack Node
      2. Creating a Swift Cluster Using SwiftStack
        1. Creating a SwiftStack Controller User
        2. Installing the SwiftStack Node Software
        3. Claiming a New Node
        4. Creating a Cluster
        5. Ingesting a Node
        6. Enabling a SwiftStack Node
        7. Provisioning a SwiftStack Node
        8. Adding Swift Users
        9. SwiftStack Middleware
        10. Deploying to Cluster
        11. Creating a Container and Uploading an Object via Web Console
      3. Conclusion
  5. IV. Planning a Swift Deployment
    1. 11. Hardware for Swift
      1. Node Hardware Specifications
        1. CPU
          1. Calculating CPU cores
          2. CPU calculation formula
        2. RAM
        3. Drives
      2. Cluster Networking
        1. Network Cards
        2. Outward-Facing Network
        3. Cluster-Facing Network
        4. Replication Network
        5. Out-of-Band Management
        6. Other Networking Connections
      3. Conclusion
    2. 12. Planning a Swift Deployment
      1. Your Use Case
      2. System Design
        1. How Many Nodes?
          1. Total storage
          2. Total drives needed
          3. Total servers (nodes) needed
          4. What now?
          5. Design options
        2. Tiering Node Services
        3. Defining Your Cluster Space
          1. Regions
          2. Zones
          3. Storage policies
        4. Node Naming Conventions
        5. Authentication and Authorization
      3. Networking
        1. Outward-Facing Network
          1. Firewall
          2. Load balancing
        2. Cluster-Facing Network
          1. Hardware management network
          2. Other networking connections
      4. Sample Deployments
        1. Small Cluster: Several Nodes
        2. Medium-Size Cluster: Multi-Rack
        3. Large Cluster: Multi-Region
      5. Conclusion
    3. 13. Authentication and Authorization
      1. Authentication
        1. How Authentication Works
        2. Authentication Request
          1. Authentication URL
          2. Authentication credentials
        3. Authentication Handling
          1. Creating the authentication token
          2. Copying auth token and authorization information in Memcache
        4. Authentication Response
      2. Using the Auth Token in Storage Requests
      3. Authorization
        1. Authorization Examples
        2. How Authorization Works
        3. Storage Request Processing
        4. Token Verification and Authorization Information Lookup
        5. Authorization Callback and Response
      4. Authorization and Access Levels
      5. Account-Level Access Control
        1. Read-Only Access
        2. Read-Write Access
        3. Admin Access
        4. JSON for Account Access Control
      6. Container-Level Access Control
        1. Container ACL Examples
      7. Swift Authentication Systems
        1. Keystone
        2. TempAuth
        3. SWAuth
      8. SwiftStack Authentication Systems
        1. SwiftStack Auth
        2. SwiftStack LDAP
        3. SwiftStack Active Directory
      9. Conclusion
    4. 14. Cluster Tuning and Optimization
      1. Swift Settings
        1. Workers
          1. Proxy server workers
          2. Account and container workers
          3. Object workers and server threads per disk
        2. Chunk Size
        3. Settings for Background Daemons
          1. Auditors
          2. Replicators
          3. Reapers
          4. Updaters
          5. Expirers
      2. Externally Managed Settings
      3. Swift Middleware
        1. Middleware Pipeline
        2. Essential Middleware
          1. TempAuth
          2. KeystoneAuth
          3. Recon
          4. TempURL
          5. Container quotas and account quotas
          6. Dynamic and Static Large Objects
        3. Most Useful Middleware
          1. Rate limiting
          2. Cluster info
          3. Bulk Operations
        4. Other Middleware
          1. Swift3
          2. Cross-domain policies
          3. Name Check
          4. proxy-logging
          5. CatchErrors
          6. GateKeeper
          7. Container sync
      4. The SwiftStack Approach
      5. Conclusion
    5. 15. Operating a Swift Cluster
      1. Operational Considerations
        1. How Swift Distributes Data
        2. Keeping Track of the Rings and Builder Files
      2. Managing Capacity
        1. What to Avoid
        2. Adding Capacity
        3. Existing Cluster: Initial Ring on Node
          1. Adding disks immediately
          2. Adding disks gradually
        4. Adding Nodes
          1. Adding a node immediately
          2. Adding a node gradually
      3. Removing Capacity
        1. Removing Nodes
        2. Removing Disks
          1. Removing disks immediately
          2. Removing disks gradually
      4. Managing Capacity Additions with SwiftStack
        1. Adding Capacity
        2. Adding Drives
        3. Adding a Node
        4. Removing Capacity
        5. Removing a Node
        6. Removing a Disk
      5. Monitoring Your Cluster
        1. Swift-Specific Metrics: What to Look For
        2. Monitoring and Logging Tools
        3. SwiftStack Tools
          1. Cluster-level metrics
          2. Node-level metrics
      6. Operating with SwiftStack
      7. Conclusion
  6. V. Debugging and Troubleshooting
    1. 16. Hardware Failures and Recovery
      1. Handling a Failed Drive
      2. Handling a Full Drive
      3. Handling Sector or Partial Drive Failure (a.k.a. Bit Rot)
      4. Handling Unreachable Nodes
      5. Handling a Failed Node
      6. Node Failure Case Study
      7. Conclusion
    2. 17. Benchmarking
      1. Evaluating Performance
      2. Performance Metrics, Benchmarking, and Testing
        1. Preparing Your Cluster for Benchmarking
        2. Pitfalls and Mistakes to Avoid
        3. Benchmarking Goals and Tools
        4. Don’t Get Greedy
        5. Bottlenecks
      3. Benchmarking with ssbench
        1. Installing ssbench
        2. A Basic ssbench Run
        3. Defining Use Cases
        4. How ssbench Works
        5. Measuring Basic Performance
        6. Taking ssbench Further
        7. Defining the Scenario File
          1. Elements of the scenario file
          2. Running a benchmark test and viewing the output
          3. Useful ssbench options
        8. The ssbench-worker
        9. Ways to Start ssbench-worker
      4. Benchmarking with swift-bench
        1. Preparation
        2. How swift-bench Works
        3. Number of Containers
        4. Testing High Concurrency (-c, -b)
        5. Testing Latency
        6. Object Size (-s, -l)
        7. Number of Objects (-n)
        8. Number of GETs (-g)
        9. Don’t Delete Option (-x)
        10. Creating a Configuration File
        11. Sample swift-bench Run
        12. Running a Distributed swift-bench
        13. Sample swift-bench Configuration
        14. Statistics Tools
      5. Conclusion
    3. A Swift Afterword
      1. The Transition to Object Storage
      2. Why Being Open Matters
      3. The Object-Storage Standard
      4. Now It’s Your Turn
    4. Index
  7. Colophon
  8. Copyright