Hands-On Machine Learning with Scikit-Learn and TensorFlow

Book Description

A series of Deep Learning breakthroughs has boosted the whole field of machine learning over the last decade. Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how. Using concrete examples, minimal theory, and two production-ready Python frameworks (Scikit-Learn and TensorFlow), author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems.
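
As a small taste of what "learning from data" looks like in practice, here is a minimal Scikit-Learn sketch (an illustrative example written for this preview, not code taken from the book): it fits a linear model to a handful of toy points, then predicts a value it has never seen.

    # Illustrative sketch only; assumes NumPy and Scikit-Learn are installed.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Toy data: one feature per row in X, targets in y (roughly y = 2x).
    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = np.array([2.1, 4.2, 5.9, 8.1])

    model = LinearRegression()
    model.fit(X, y)                 # learn the pattern from the data
    print(model.predict([[5.0]]))   # predict for an unseen input (about 10)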

Table of Contents

  1. Preface
    1. The Machine Learning tsunami
    2. Machine Learning in your projects
    3. Objective and approach
    4. Prerequisites
    5. Roadmap
    6. Other resources
    7. Conventions Used in This Book
    8. Using Code Examples
    9. Safari® Books Online
    10. How to Contact Us
    11. Acknowledgments
  2. I. The Fundamentals of Machine Learning
  3. 1. The Machine Learning landscape
    1. What is Machine Learning?
    2. Why use Machine Learning?
    3. Types of Machine Learning systems
      1. Supervised/Unsupervised learning
      2. Batch and Online learning
      3. Instance-based vs. Model-based learning
    4. Main challenges of Machine Learning
      1. Insufficient quantity of training data
      2. Non-representative training data
      3. Poor-quality data
      4. Irrelevant features
      5. Overfitting the training data
      6. Underfitting the training data
      7. Stepping back
    5. Testing and validating
    6. Exercises
  4. 2. End-to-end Machine Learning project
    1. Working with real data
    2. Look at the big picture
      1. Frame the problem
      2. Select a performance measure
      3. Check the assumptions
    3. Get the data
      1. Create the workspace
      2. Download the data
      3. Take a quick look at the data structure
      4. Create a test set
    4. Discover and visualize the data to gain insights
      1. Visualizing geographical data
      2. Looking for correlations
      3. Experimenting with attribute combinations
    5. Prepare the data for Machine Learning algorithms
      1. Data cleaning
      2. Handling text and categorical attributes
      3. Custom transformers
      4. Feature scaling
      5. Transformation pipelines
    6. Select and train a model
      1. Training and evaluating on the training set
      2. Better evaluation using cross-validation
    7. Fine-tune your model
      1. Grid search
      2. Randomized search
      3. Ensemble methods
      4. Analyze the best models and their errors
      5. Evaluate your system on the test set
    8. Launch, monitor, and maintain your system
    9. Try it out!
    10. Exercises
  5. 3. Classification
    1. MNIST
    2. Training a binary classifier
    3. Performance measures
      1. Measuring accuracy using cross-validation
      2. Confusion matrix
      3. Precision and recall
      4. Precision/Recall tradeoff
      5. The ROC curve
    4. Multiclass classification
    5. Error analysis
    6. Multilabel classification
    7. Multioutput classification
    8. Exercises
  6. 4. Training linear models
    1. Linear Regression
      1. The Normal Equation
      2. Computational complexity
    2. Gradient Descent
      1. Batch Gradient Descent
      2. Stochastic Gradient Descent
      3. Mini-batch Gradient Descent
    3. Polynomial regression
    4. Learning curves
    5. Regularized linear models
      1. Ridge regression
      2. Lasso regression
      3. Elastic Net
      4. Early stopping
    6. Logistic Regression
      1. Estimating probabilities
      2. Training and cost function
      3. Decision boundaries
      4. Softmax regression
    7. Exercises
  7. 5. Support Vector Machines
    1. Linear SVM classification
      1. Soft margin classification
    2. Non-linear SVM classification
      1. Polynomial kernel
      2. Adding similarity features
      3. Gaussian RBF kernel
      4. Computational complexity
    3. SVM regression
    4. Under the hood
      1. Decision function and predictions
      2. Training objective
      3. Quadratic Programming
      4. The dual problem
      5. Kernelized SVM
      6. Online SVMs
    5. Exercises
  8. 6. Decision Trees
    1. Training and visualizing a Decision Tree
    2. Making predictions
    3. Estimating class probabilities
    4. The CART training algorithm
    5. Computational complexity
    6. Gini impurity or entropy?
    7. Regularization hyperparameters
    8. Regression
    9. Instability
    10. Exercises
  9. 7. Ensemble Learning and Random Forests
    1. Voting Classifiers
    2. Bagging and Pasting
      1. Bagging and Pasting in Scikit-Learn
      2. Out-of-bag evaluation
    3. Random Patches and Random Subspaces
    4. Random Forests
      1. Extra-Trees
      2. Feature importance
    5. Boosting
      1. AdaBoost
      2. Gradient Boosting
    6. Stacking
    7. Exercises
  10. 8. Dimensionality Reduction
    1. The curse of dimensionality
    2. Main approaches for dimensionality reduction
      1. Projection
      2. Manifold learning
    3. PCA
      1. Preserving the variance
      2. Principal Components
      3. Projecting down to d dimensions
      4. Using Scikit-Learn
      5. Explained variance ratio
      6. Choosing the right number of dimensions
      7. PCA for compression
      8. Incremental PCA
      9. RandomizedPCA
    4. Kernel PCA
      1. Selecting a kernel and tuning hyperparameters
    5. LLE
    6. Other dimensionality reduction techniques
    7. Exercises
  11. II. Neural Networks and Deep Learning
  12. 9. Up and running with TensorFlow
    1. Installation
    2. Creating your first graph and running it in a session
    3. Managing graphs
    4. Lifecycle of a node value
    5. Linear Regression with TensorFlow
    6. Implementing Gradient Descent
      1. Manually computing the gradients
      2. Using autodiff
      3. Using an optimizer
    7. Feeding data to the training algorithm
    8. Saving and restoring models
    9. Visualizing the graph and training curves using TensorBoard
    10. Name scopes
    11. Modularity
    12. Sharing variables
    13. Exercises
  13. 10. Introduction to Artificial Neural Networks
    1. From biological to artificial neurons
      1. Biological neurons
      2. Logical computations with neurons
      3. The Perceptron
      4. Multi-Layer Perceptron and Backpropagation
    2. Training an MLP with TensorFlow’s high-level API
    3. Training a DNN using plain TensorFlow
      1. Construction phase
      2. Execution phase
      3. Using the neural network
    4. Fine-tuning neural network hyperparameters
      1. Number of hidden layers
      2. Number of neurons per hidden layer
      3. Activation functions
    5. Exercises
  14. 11. Deep Learning
    1. Vanishing/exploding gradients problems
      1. Xavier and He initialization
      2. Non-saturating activation functions
      3. Batch Normalization
      4. Gradient clipping
    2. Reusing pretrained layers
      1. Reusing a TensorFlow model
      2. Reusing models from other frameworks
      3. Freezing the lower layers
      4. Caching the frozen layers
      5. Tweaking, dropping, or replacing the upper layers
      6. Model zoos
      7. Unsupervised pretraining
      8. Pretraining on an auxiliary task
    3. Faster optimizers
      1. Momentum optimization
      2. Nesterov Momentum optimization
      3. AdaGrad
      4. RMSProp
      5. Adam optimization
      6. Learning rate scheduling
    4. Avoiding overfitting through regularization
      1. Early Stopping
      2. ℓ1 and ℓ2 regularization
      3. Dropout
      4. Max-norm regularization
      5. Data augmentation
    5. Practical guidelines
    6. Exercises
  15. 12. Distributing TensorFlow across devices and servers
    1. Multiple devices on a single machine
      1. Installation
      2. Managing the GPU RAM
      3. Placing operations on devices
      4. Parallel execution
      5. Control dependencies
    2. Multiple devices across multiple servers
      1. Opening a session
      2. The master and worker services
      3. Pinning operations across tasks
      4. Sharding variables across multiple parameter servers
      5. Sharing state across sessions using resource containers
      6. Asynchronous communication using TensorFlow queues
      7. Loading data directly from the graph
    3. Parallelizing neural networks on a TensorFlow cluster
      1. One neural network per device
      2. In-graph vs. between-graph replication
      3. Model parallelism
      4. Data parallelism
    4. Exercises
  16. 13. Convolutional Neural Networks
    1. The architecture of the visual cortex
    2. Convolutional layer
      1. Filters
      2. Stacking multiple feature maps
      3. TensorFlow implementation
      4. Memory requirement
    3. Pooling layer
    4. CNN architectures
      1. LeNet-5
      2. AlexNet
      3. GoogLeNet
      4. ResNet
    5. Exercises
  17. A. Exercise solutions
    1. Chapter 1
    2. Chapter 2 and Chapter 3
    3. Chapter 4
    4. Chapter 5
    5. Chapter 6
    6. Chapter 7
  18. B. Machine Learning project checklist
    1. Frame the problem and look at the big picture
    2. Get the data
    3. Explore the data
    4. Prepare the data
    5. Short-list promising models
    6. Fine-tune the system
    7. Present your solution
    8. Launch!
  19. C. Other popular ANN architectures
    1. Hopfield Nets
    2. Boltzmann Machines
    3. Restricted Boltzmann Machines
    4. Deep Belief Nets
    5. Self-Organizing Maps