Troubleshooting Python Machine Learning

Video Description

Practical and unique solutions to the common Machine Learning problems you face. Avoid roadblocks while working with the Python data science ecosystem.

About This Video

  • Covers common ML problems documented online (leveraging sources such as Stack Overflow, Medium, and GitHub), all solved in this one course.
  • Each video is constructed in a problem-solution format, making it easy to understand the problem and grasp the solution.
  • Tried-and-tested solutions to common problems encountered while implementing Machine Learning with Python.

In Detail

You are a data scientist. Every day, you stare at reams of data, trying to apply the latest and greatest models to uncover new insights, but there seems to be an endless supply of obstacles. Your colleagues depend on you to monetize your firm's data, and the clock is ticking. What do you do?

Troubleshooting Python Machine Learning is the answer. We have systematically researched common ML problems documented online around data wrangling, debugging models such as Random Forests and SVMs, and visualizing tricky results. We leverage statistics from Stack Overflow, Medium, and GitHub to get a cross-section of what data scientists struggle with. We have collated the top issues for you, such as retrieving the most important regression features and explaining your results after clustering, along with their corresponding solutions. We present these case studies in a problem-solution format, making it easy for you to incorporate them into your own work.
Taking this course will help you debug your models and research pipelines precisely, so you can focus on pitching new ideas rather than fixing old bugs.

All the code and supporting files are available on GitHub at https://github.com/PacktPublishing/Troubleshooting-Python-Machine-Learning-

Table of Contents

  1. Chapter 1 : Eliminate Common Data Wrangling Problems in Pandas and scikit-learn
    1. The Course Overview 00:02:14
    2. Splitting Your Datasets for Train, Test, and Validate 00:11:41
    3. Persist Your Hard Earned Models by Saving Them to Disk 00:11:25
    4. Calculate Word Frequencies Efficiently in Good ol' Python 00:07:08
    5. Transform Your Variable Length Features into One-Hot Vectors 00:09:04
  2. Chapter 2 : Defeat Regression and Classification Difficulties in scikit-learn
    1. Finding the Most Important Features in Your Classifier 00:10:19
    2. Predicting Multiple Targets with the Same Dataset 00:09:05
    3. Retrieving the Best Estimators after Grid Search 00:10:50
    4. Regress on Your Pandas Data Frame with Simple Statsmodels OLS 00:11:09
  3. Chapter 3 : Troubleshooting Advanced Models like Random Forests and SVMs
    1. Extracting Decision Tree Rules from scikit-learn 00:11:25
    2. Finding Out Which Features Are Important in a Random Forest Model 00:08:59
    3. Classifying with SVMs When Your Data Has Unbalanced Classes 00:12:26
    4. Computing True/False Positives/Negatives in scikit-learn 00:10:14
  4. Chapter 4 : Wrangling with the Unsupervised Learning and Curse of Dimensionality
    1. Labelling Dimensions with Original Feature Names after PCA 00:09:17
    2. Clustering Text Documents with scikit-learn K-means 00:09:18
    3. Listing Word Frequency in a Corpus Using Only scikit-learn 00:06:38
    4. Polynomial Kernel Regression Using Pipelines 00:10:12
  5. Chapter 5 : Solving Prediction Visualization Issues with Matplotlib
    1. Visualize Outputs Over Two-Dimensions Using NumPy's Meshgrid 00:09:25
    2. Drawing Out a Decision Tree Trained in scikit-learn 00:08:08
    3. Clarify Your Histogram by Labelling Each Bin 00:09:29
    4. Centralizing Your Color Legend When You Have Multiple Subplots 00:08:57