scikit-learn – Test Predictions Using Various Models

Video Description

A one-stop solution to test model accuracy with cross-validation

About This Video

  • Optimize the ridge regression parameter
  • Analyze and plot an ROC curve without context
  • Use dummy estimators and persist models with joblib
  • Use k-means for outlier detection

In Detail

Scikit-learn has evolved as a robust library for machine learning applications in Python with support for a wide range of supervised and unsupervised learning algorithms.

This course begins with videos on linear models, in which you take a machine learning approach to linear regression with scikit-learn. As you progress, you will explore logistic regression. You will then build models with distance metrics, including clustering. You will also look at cross-validation and post-model workflows, where you will see how to select a model that predicts well. Finally, you will work with Support Vector Machines (SVMs) to get a rough idea of how they work, and learn about the radial basis function (RBF) kernel.
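As a rough illustration of the kind of workflow described above (not taken from the course itself), the following minimal sketch picks a ridge regularization strength by cross-validation and then estimates how well the resulting model predicts. The synthetic dataset, the grid of `alphas`, and the choice of 5 folds are all assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Candidate regularization strengths (hypothetical grid).
alphas = np.logspace(-3, 3, 13)

# RidgeCV selects the alpha with the best internal cross-validation score.
model = RidgeCV(alphas=alphas)
model.fit(X, y)
best_alpha = model.alpha_

# Estimate out-of-sample R^2 for the whole procedure with 5-fold cross-validation.
scores = cross_val_score(RidgeCV(alphas=alphas), X, y, cv=5, scoring="r2")
mean_r2 = scores.mean()

print("selected alpha:", best_alpha)
print("mean cross-validated R^2:", mean_r2)
```

Wrapping `RidgeCV` inside `cross_val_score` keeps the choice of alpha inside each training fold, so the reported score is not inflated by tuning on the held-out data.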

Table of Contents

  1. Chapter 1: Linear Models with scikit-learn
    1. The Course Overview 00:03:43
    2. Fitting a Line Through Data 00:05:23
    3. Evaluating and Overcoming Shortfalls of the Linear Regression Model 00:08:02
    4. Optimizing the Ridge Regression Parameter 00:04:02
    5. Using Sparsity to Regularize Models 00:03:24
    6. Fundamental Approach to Regularization with LARS 00:03:26
  2. Chapter 2: Linear Models – Logistic Regression
    1. Exploring Various Repositories and Datasets 00:05:50
    2. Logistic Regression and Confusion Matrix 00:04:25
    3. Varying the Classification Threshold in Logistic Regression 00:05:51
    4. Analysis and Plotting an ROC Curve Without Context 00:06:36
    5. UCI Breast Cancer Dataset 00:03:22
  3. Chapter 3: Building Models with Distance Metrics
    1. In a dataset, we observe sets of points gathered together. With k-means, we will categorize all the points into groups, or clusters. 00:08:56
    2. Handling Data and Quantizing an Image 00:07:09
    3. Finding the Closest Object in the Feature Space 00:03:34
    4. Probabilistic Clustering with Gaussian Mixture Models 00:04:07
    5. Using k-means for Outlier Detection 00:03:13
    6. Using KNN for Regression 00:04:16
  4. Chapter 4: Cross-Validation and Post-Model Workflow
    1. Cross-Validation 00:08:23
    2. Search with scikit-learn 00:04:03
    3. Metrics 00:07:39
    4. Dummy Estimators and Persisting Models with joblib 00:03:41
    5. Feature Selection 00:06:32
  5. Chapter 5: Support Vector Machines
    1. Classifying Data with a Linear SVM 00:05:02
    2. Optimizing an SVM 00:05:30
    3. Multiclass Classification with SVM 00:03:44
    4. Support Vector Regression 00:02:44