Live online training

Managing Enterprise Data Strategies with Hadoop, Spark, and Kafka

Learn how to ensure the success of your data pipeline project and avoid common mistakes

Jesse Anderson

Big data projects can be a huge investment, and even small implementation failures can be time-consuming and expensive. This hands-on course pays for itself by letting you make mistakes during the planning phase instead of the expensive development phases.

You don’t need deep technical knowledge to manage successful data strategies, but you do need to understand both the pitfalls and the potential that data holds. In just one online session, Jesse Anderson will show you how to recognize the opportunities, avoid the problems, and get the most value from your data.

What you'll learn and how you can apply it

By the end of this hands-on, online course, you’ll be able to:

  • Understand how Hadoop, Spark, Kafka, and other data tools fit together in the big data ecosystem
  • Know the steps you need to take to solve problems specific to your team and use case(s)
  • Avoid common mistakes made in development
  • Do data science on the fly using real datasets
  • Determine the problems you can solve with data (and address them in your data solution from the beginning)
  • Ensure the success of your Hadoop roll-out or big data project

This training course is for you because...

  • You’re a CxO, VP, or technical manager with some familiarity with big data terminology and specific data problems to solve
  • You have business experience and are transitioning into a career in big data

About your instructor

  • Jesse Anderson is a creative engineer with many years of experience in creating products and helping companies improve their software engineering. He is CEO of Smoking Hand, a training company for big data technologies that helps companies carry out their big data transformations and deliver successful projects.

    Jesse previously created big data and data science curriculum for Cloudera, leading instruction for thousands of people entering this field, and has played an active part within the Apache Hadoop community, creating many popular open source examples for big data use cases.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

  • Thinking in big data: Understanding big data, Hadoop, and its ecosystem, part 1
  • Break Time (15 minutes)
  • Thinking in big data: Understanding big data, Hadoop, and its ecosystem, part 2
  • Lunch (1 hour)
  • Engineering big data solutions: The steps and mindsets to creating successful big data solutions
  • Break Time (15 minutes)
  • Doing data science on the NFL play-by-play dataset: Using data to create data products and find business value
  • Q&A