Live Online Training

Reactive Python for Data Science: Production-Ready, Scalable Code for Real-Time Data

Learn the basics of reactive programming for more resilient, event-driven code models

Thomas Nield

Reactive programming is an effective approach to composing data as queryable, live streams. With reactive programming, you can concisely wrangle and analyze not only static data but also real-time, infinite feeds (for example, live Twitter streams and stock quotes). Code readability, composability, and scalability become much easier to achieve. Analysis code can quickly be turned into production code and continuously adapted to reflect an evolving business environment.

Reactive Extensions (also called “ReactiveX” or “Rx”) have been ported to over a dozen major languages and platforms. In this course, you will learn how to leverage RxPy, a lightweight ReactiveX library for Python, to work with data effectively. You'll learn how to write more robust Python code, save time, and increase productivity. You'll also learn to leverage effective concurrency with minimal effort and develop code that’s production-ready, reusable, and changeable.

What you'll learn and how you can apply it

  • How to solve problems using push-based iteration, as opposed to traditional pull-based iteration
  • The various chain-like operators in Rx to compose business logic and concurrency through data streams
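
The difference between pull-based and push-based iteration can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `pull_squares` and `push_squares` are hypothetical and are not part of RxPy; they simply show who drives the iteration in each style.

```python
from typing import Callable, Iterable, List


def pull_squares(numbers: Iterable[int]) -> List[int]:
    """Pull-based: the consumer's loop asks the source for each item."""
    results = []
    for n in numbers:  # the for-loop pulls the next item when it is ready for it
        results.append(n * n)
    return results


def push_squares(
    numbers: Iterable[int],
    on_next: Callable[[int], None],
    on_completed: Callable[[], None],
) -> None:
    """Push-based: the source hands each item to callbacks supplied by the consumer."""
    for n in numbers:
        on_next(n * n)  # the source decides when the consumer receives a value
    on_completed()      # the source signals that no more values will arrive


# Pull: we call the function and collect what it returns.
pulled = pull_squares([1, 2, 3])

# Push: we register callbacks and let the source call us back.
pushed: List[int] = []
events: List[str] = []
push_squares([1, 2, 3], on_next=pushed.append, on_completed=lambda: events.append("done"))

print(pulled, pushed, events)
```

In the push style the consumer only describes what to do with each value, which is why the same code can in principle serve both a finite list and a live feed.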

And you'll be able to:

  • Create concise, readable, and maintainable Python code
  • Compose real-time events and data together into single streams
  • Take advantage of RxPy's numerous operators to express business logic
  • Easily leverage concurrency to scale and manage workloads
  • Handle errors effectively
  • Leverage RxPy to create more robust Python code in all data science tasks (reading, writing, wrangling, analysis, etc.)
  • Leverage effective concurrency with minimal effort
  • Save time and increase productivity using not only less code, but more reusable code
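
To give a rough feel for how chained operators and error handling fit together, here is a minimal sketch in plain Python. The `MiniObservable` class is a hypothetical stand-in written for this illustration; it is not RxPy's API, and the real library's operators and signatures may differ.

```python
from typing import Any, Callable, Iterable, List


class MiniObservable:
    """Hypothetical, simplified stream type (an assumption, not RxPy)."""

    def __init__(self, subscribe_fn: Callable[..., None]) -> None:
        self._subscribe_fn = subscribe_fn

    @staticmethod
    def from_iterable(items: Iterable[Any]) -> "MiniObservable":
        def subscribe_fn(on_next, on_error, on_completed):
            try:
                for item in items:
                    on_next(item)      # push each item downstream
            except Exception as exc:   # any failure downstream ends the stream
                on_error(exc)
                return
            on_completed()
        return MiniObservable(subscribe_fn)

    def map(self, fn: Callable[[Any], Any]) -> "MiniObservable":
        def subscribe_fn(on_next, on_error, on_completed):
            self._subscribe_fn(lambda x: on_next(fn(x)), on_error, on_completed)
        return MiniObservable(subscribe_fn)

    def filter(self, predicate: Callable[[Any], bool]) -> "MiniObservable":
        def subscribe_fn(on_next, on_error, on_completed):
            def forward(x):
                if predicate(x):       # only matching items continue downstream
                    on_next(x)
            self._subscribe_fn(forward, on_error, on_completed)
        return MiniObservable(subscribe_fn)

    def subscribe(self, on_next, on_error=lambda e: None, on_completed=lambda: None):
        self._subscribe_fn(on_next, on_error, on_completed)


# Chain operators: map each word to its length, keep lengths of at least 5.
lengths: List[int] = []
(MiniObservable.from_iterable(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"])
    .map(len)
    .filter(lambda n: n >= 5)
    .subscribe(lengths.append))

# Error handling: the bad value "x" is routed to on_error instead of crashing.
parsed: List[int] = []
errors: List[Exception] = []
(MiniObservable.from_iterable(["4", "x", "6"])
    .map(int)
    .subscribe(parsed.append, on_error=errors.append))

print(lengths, parsed, errors)
```

In this sketch an error stops the stream, so values after the failing one are not delivered; that choice is an assumption of the example rather than a statement about how RxPy behaves.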

This training course is for you because...

You are a data scientist or business analyst with a fundamental grasp of Python, and you need ways to express logic more easily and to scale your code into a production environment.

You are a programmer familiar with Scala, Java 8, C#, Swift, or Kotlin (all of which have ReactiveX ports), and you would like to apply modern higher-order functional chain patterns to your Python data analysis workflow.

Prerequisites

You should be comfortable with Python’s core language features, including variables, functions, collections (lists and dicts), iteration, and classes. Knowing lambdas, SQL, and inheritance is helpful but not critical for following along.

Before the course begins, you'll need to install Python 3.x, a Python development environment (preferably PyCharm), and the packages RxPy, SQLAlchemy, and Tweepy.

GitHub Repo with Course Materials

Recommended Preparation:

Introduction to Python

Intermediate Python Programming

About your instructor

  • Thomas Nield (author of Getting Started with SQL) has a business analyst background and works at Southwest Airlines in Revenue Management. Early in his career he became fascinated with technology and bought dozens of books to master programming in Java, C#, Kotlin, and database design. He is passionate about sharing what he learns and enabling others with new skillsets, even if they do not work in IT. He enjoys making technical content relatable and relevant to those unfamiliar or intimidated by it.

    Thomas has developed several database-driven applications for Southwest Airlines that generate revenue for the entire airline network. He believes technology should conform to the business, and emphasizes usefulness and real-world practicality while balancing the perspectives of IT and business professionals.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Why Reactive Programming? (5 min)

Thinking Reactively (6 min)

The Observable (28 min)

  • The Observable [6 min]
  • Operators and other Sources [7 min]
  • Creating Observables and Intervals [7 min]
  • Hot and Cold Observables [4 min]
  • Exercises [4 min]

Operators (26 min)

  • Filter and Take [5 min]
  • Distinct Operators [4 min]
  • Reduce and Scan [7 min]
  • Lists and Dicts [6 min]
  • Exercises [4 min]

Combining Observables (24 min)

  • Merging [8 min]
  • Concatenating and Zipping [6 min]
  • Group By [6 min]
  • Exercise [4 min]

Reading and Analyzing data (25 min)

  • Reading text files and URLs [6 min]
  • Querying SQLAlchemy [8 min]
  • PROJECT: Scheduling a reactive word counter [11 min]

Deferment and Hot Observables (32 min)

  • Using Observable.defer() [6 min]
  • Multicasting [6 min]
  • Subjects [4 min]
  • Exercises [4 min]
  • PROJECT: Creating an Observable off a Twitter Stream [12 min]

Concurrency (29 min)

  • Concurrency and Debugging [7 min]
  • Using subscribe_on() [6 min]
  • Using observe_on() [4 min]
  • Achieving parallelization [4 min]
  • Using do_action() to debug [4 min]
  • Exercise [4 min]

Going Forward (7 min)