O'Reilly logo
live online training icon Live Online training

Advanced SQL for Data Analysis (with Python, R, and Java)

Unleashing relational database analytics

Thomas Nield

SQL is often viewed as a means for accessing and writing data and performing rudimentary analysis, but many database platforms offer powerful analytics capabilities beyond aggregations and simple joins. Regular expressions, volatile tables, and cross joins make SQL a powerful platform for data analytics. In addition, subqueries and exotic join patterns can transform data in less obvious ways, and windowing functions can flexibly aggregate on contexts not possible with GROUP BY. Doing intensive analysis with SQL also allows you to put the onus of churning data on the database engine, which is optimized for that purpose. And when SQL lacks the functionality needed for a task, it can be invoked from programming languages like Python, R, and Java to close the gap.

Join expert Thomas Nield for a deep dive into advanced SQL techniques for data analysis. You’ll learn how SQL can answer most common analytics questions—and do it more efficiently.

What you'll learn-and how you can apply it

By the end of this live, online course, you’ll understand:

  • How SQL can be used for deeper and more complex analytical tasks, including windowing functions, regular expressions, and cross joins
  • When to use SQL versus another technology (like R or Python) for a given task

And you’ll be able to:

  • Flexibly derive sets of data and query off of queries
  • Leverage the power of the regular expression to qualify complex text patterns
  • Use SQL joins, including the cross join, in more abstract and powerful ways
  • Leverage windowing functions to get contextual aggregations
  • Use SQL with Python, R, or Java

This training course is for you because...

  • You’re an analyst or data science professional with fundamental SQL proficiency who wants to leverage SQL for more advanced analysis.
  • You’re a programmer or developer who needs to process large amounts of analytical data and wants to leverage more advanced SQL to put that work back on the database.
  • You’re a database administrator who wants to better understand advanced analysis features like windowing functions.

Prerequisites

  • A basic understanding of SQL

About your instructor

  • Thomas Nield (author of Getting Started with SQL) is a business consultant for Southwest Airlines in Schedule Initiatives. Early in his career, he became fascinated with technology and its role in business analytics. After becoming proficient in Java, Kotlin, Python, SQL, and reactive programming, he became an open-source contributor as well as an author/trainer for O’Reilly Media. He is passionate about sharing what he learns and enabling others with new skill sets. He enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing

Subqueries, unions, and advanced aggregation (30 minutes)

  • Scalar subqueries
  • Derived tables
  • Joining derived tables
  • Unions
  • Using group_concat()
  • Exercises

Regular expressions (20 minutes)

  • Introduction to regular expressions
  • Using a regular expression to qualify records
  • Exercises

Break (10 minutes)

Advanced joins (40 minutes) - Inner Join/Left Join review - Creating a volatile table - Doing a regex join - Self joins - Cross Joins - Comparative Joins - Exercises

Break (10 minutes)

Windowing functions (40 minutes)
- Partitioning a simple aggregation - Partitioning a rolling aggregation - Partitioning a rolling aggregation by quarter - Applying a ranking - Filtering on a partitioned field - Using the WINDOW keyword to reuse partitions - Exercises

Programming with SQL (20 minutes) - Using SQL with Python - Using SQL with R - Using SQL in Java/Scala/Kotlin