This course shows you how to build data pipelines and automate workflows using Python 3. From simple task-based message queues to complex frameworks like Luigi and Airflow, the course delivers the essential knowledge you need to develop your own automation solutions. You'll learn architecture basics and get an introduction to a wide variety of the most popular frameworks and tools.
Designed for the working data professional who is new to the world of data pipelines and distributed solutions, the course requires intermediate-level Python experience and the ability to manage your own system setup.
- Acquire a practical understanding of how to approach data pipelining using Python toolsets
- Learn to determine when a Python framework is appropriate for a project
- Understand workflow concepts like directed acyclic graphs, producers, and consumers
- Learn to integrate data flows into pipelines, workflows, and task-based automation solutions
- Understand how to parallelize data analysis, both locally and in a distributed cluster
- Practice writing simple data tests using property-based testing
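To give a flavor of the property-based testing mentioned above, here is a minimal hand-rolled sketch: instead of asserting on a few fixed inputs, we generate many random inputs and check that a property holds for all of them. The `clean_whitespace` step and the idempotence property are illustrative assumptions, not material from the course; in practice a library such as Hypothesis would generate the test cases.

```python
import random
import string

def clean_whitespace(record: str) -> str:
    """Hypothetical pipeline step: collapse runs of whitespace to single spaces."""
    return " ".join(record.split())

def test_clean_whitespace_is_idempotent(trials: int = 1000) -> None:
    # Property: cleaning an already-cleaned record changes nothing.
    for _ in range(trials):
        raw = "".join(
            random.choice(string.ascii_letters + " \t\n")
            for _ in range(random.randint(0, 50))
        )
        once = clean_whitespace(raw)
        assert clean_whitespace(once) == once

test_clean_whitespace_is_idempotent()
```

A dedicated property-based testing library adds input shrinking and reproducible failure cases on top of this basic generate-and-assert loop.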
Katharine Jarmul (aka Kjam) is a Python developer, data consultant, and educator who has worked with Python since 2008. Kjam runs kjamistan UG, a Python consulting, training, and competitive analysis company based in Berlin, Germany. She is the author of several O'Reilly titles, including Data Wrangling with Python: Tips and Tools to Make Your Life Easier. She holds an M.A. from American University and an M.S. from Pace University.