Getting Started with Amazon Redshift

Book Description

Start by learning the fundamentals, then progress to creating and managing your own Redshift cluster. This guide walks you step by step through the world of big data, cloud computing, and scalable data warehousing.

  • Step-by-step instructions to create and manage your Redshift cluster

  • Understand the technology behind the database engine as you learn about compression, block-level storage, and column stores

  • Learn the implementation and database design considerations you need to run your own Amazon Redshift cluster successfully

In Detail

    Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service. It lets you analyze all your data using your existing business intelligence tools.

    Getting Started with Amazon Redshift is an easy-to-read, descriptive guide that breaks down the complex topics of data warehousing and Amazon Redshift. Through practical, real-world examples, you will learn the fundamentals of Redshift technology and how to implement your own Redshift cluster. Redshift is a powerful addition to your data management toolkit, and this book will help you implement and manage your next enterprise data warehouse.

    Packed with detailed descriptions, diagrams, and explanations, Getting Started with Amazon Redshift will bring you, regardless of your current level of understanding, to the point where you feel comfortable running your own Redshift cluster. The author's own experiences will show you what to consider when working with your own data. You will also learn how compression has been implemented and what that means for a column store database structure. As you progress, you will gain an understanding of monitoring techniques, performance considerations, and what it takes to run your Amazon Redshift cluster successfully on a day-to-day basis. This book has something for everyone interested in learning about this technology.

    Table of Contents

    1. Getting Started with Amazon Redshift
      1. Table of Contents
      2. Getting Started with Amazon Redshift
      3. Credits
      4. About the Author
      5. About the Reviewers
        1. Support files, eBooks, discount offers and more
          1. Why Subscribe?
          2. Free Access for Packt account holders
          3. Instant Updates on New Packt Books
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Overview
        1. Pricing
        2. Configuration options
        3. Data storage
        4. Considerations for your environment
        5. Summary
      9. 2. Transition to Redshift
        1. Cluster configurations
        2. Cluster creation
        3. Cluster details
        4. SQL Workbench and other query tools
        5. Unsupported features
        6. Command line
        7. The PSQL command line
          1. Connection options
          2. Output format options
          3. General options
          4. API
        8. Summary
      10. 3. Loading Your Data to Redshift
        1. Datatypes
        2. Schemas
          1. Table creation
        3. Connecting to S3
        4. The copy command
        5. Load troubleshooting
        6. ETL products
        7. Performance monitoring
        8. Indexing strategies
        9. Sort keys
        10. Distribution keys
        11. Summary
      11. 4. Managing Your Data
        1. Backup and recovery
        2. Resize
        3. Table maintenance
        4. Workload Management (WLM)
        5. Compression
        6. Streaming data
        7. Query optimizer
        8. Summary
      12. 5. Querying Data
        1. SQL syntax considerations
        2. Query performance monitoring
        3. Explain plans
          1. Sequential scan
          2. Joins
          3. Sorts and aggregations
        4. Working with tables
          1. Insert/update
          2. Alter
        5. Summary
      13. 6. Best Practices
        1. Security
        2. Cluster configuration
        3. Database maintenance
        4. Cluster operation
        5. Database design
        6. Monitoring
        7. Data processing
        8. Summary
      14. A. Reference Materials
        1. Cluster terminology
        2. Compression
        3. Datatypes
        4. SQL commands
        5. System tables
        6. Third-party tools and software
      15. Index