Hands-On Machine Learning with Scikit-Learn and TensorFlow

Table of contents

  1. Preface
    1. The Machine Learning Tsunami
    2. Machine Learning in Your Projects
    3. Objective and Approach
    4. Prerequisites
    5. Roadmap
    6. Other Resources
    7. Conventions Used in This Book
    8. Using Code Examples
    9. O’Reilly Safari
    10. How to Contact Us
    11. Acknowledgments
  2. I. The Fundamentals of Machine Learning
  3. 1. The Machine Learning Landscape
    1. What Is Machine Learning?
    2. Why Use Machine Learning?
    3. Types of Machine Learning Systems
      1. Supervised/Unsupervised Learning
      2. Batch and Online Learning
      3. Instance-Based Versus Model-Based Learning
    4. Main Challenges of Machine Learning
      1. Insufficient Quantity of Training Data
      2. Nonrepresentative Training Data
      3. Poor-Quality Data
      4. Irrelevant Features
      5. Overfitting the Training Data
      6. Underfitting the Training Data
      7. Stepping Back
    5. Testing and Validating
    6. Exercises
  4. 2. End-to-End Machine Learning Project
    1. Working with Real Data
    2. Look at the Big Picture
      1. Frame the Problem
      2. Select a Performance Measure
      3. Check the Assumptions
    3. Get the Data
      1. Create the Workspace
      2. Download the Data
      3. Take a Quick Look at the Data Structure
      4. Create a Test Set
    4. Discover and Visualize the Data to Gain Insights
      1. Visualizing Geographical Data
      2. Looking for Correlations
      3. Experimenting with Attribute Combinations
    5. Prepare the Data for Machine Learning Algorithms
      1. Data Cleaning
      2. Handling Text and Categorical Attributes
      3. Custom Transformers
      4. Feature Scaling
      5. Transformation Pipelines
    6. Select and Train a Model
      1. Training and Evaluating on the Training Set
      2. Better Evaluation Using Cross-Validation
    7. Fine-Tune Your Model
      1. Grid Search
      2. Randomized Search
      3. Ensemble Methods
      4. Analyze the Best Models and Their Errors
      5. Evaluate Your System on the Test Set
    8. Launch, Monitor, and Maintain Your System
    9. Try It Out!
    10. Exercises
  5. 3. Classification
    1. MNIST
    2. Training a Binary Classifier
    3. Performance Measures
      1. Measuring Accuracy Using Cross-Validation
      2. Confusion Matrix
      3. Precision and Recall
      4. Precision/Recall Tradeoff
      5. The ROC Curve
    4. Multiclass Classification
    5. Error Analysis
    6. Multilabel Classification
    7. Multioutput Classification
    8. Exercises
  6. 4. Training Models
    1. Linear Regression
      1. The Normal Equation
      2. Computational Complexity
    2. Gradient Descent
      1. Batch Gradient Descent
      2. Stochastic Gradient Descent
      3. Mini-batch Gradient Descent
    3. Polynomial Regression
    4. Learning Curves
    5. Regularized Linear Models
      1. Ridge Regression
      2. Lasso Regression
      3. Elastic Net
      4. Early Stopping
    6. Logistic Regression
      1. Estimating Probabilities
      2. Training and Cost Function
      3. Decision Boundaries
      4. Softmax Regression
    7. Exercises
  7. 5. Support Vector Machines
    1. Linear SVM Classification
      1. Soft Margin Classification
    2. Nonlinear SVM Classification
      1. Polynomial Kernel
      2. Adding Similarity Features
      3. Gaussian RBF Kernel
      4. Computational Complexity
    3. SVM Regression
    4. Under the Hood
      1. Decision Function and Predictions
      2. Training Objective
      3. Quadratic Programming
      4. The Dual Problem
      5. Kernelized SVM
      6. Online SVMs
    5. Exercises
  8. 6. Decision Trees
    1. Training and Visualizing a Decision Tree
    2. Making Predictions
    3. Estimating Class Probabilities
    4. The CART Training Algorithm
    5. Computational Complexity
    6. Gini Impurity or Entropy?
    7. Regularization Hyperparameters
    8. Regression
    9. Instability
    10. Exercises
  9. 7. Ensemble Learning and Random Forests
    1. Voting Classifiers
    2. Bagging and Pasting
      1. Bagging and Pasting in Scikit-Learn
      2. Out-of-Bag Evaluation
    3. Random Patches and Random Subspaces
    4. Random Forests
      1. Extra-Trees
      2. Feature Importance
    5. Boosting
      1. AdaBoost
      2. Gradient Boosting
    6. Stacking
    7. Exercises
  10. 8. Dimensionality Reduction
    1. The Curse of Dimensionality
    2. Main Approaches for Dimensionality Reduction
      1. Projection
      2. Manifold Learning
    3. PCA
      1. Preserving the Variance
      2. Principal Components
      3. Projecting Down to d Dimensions
      4. Using Scikit-Learn
      5. Explained Variance Ratio
      6. Choosing the Right Number of Dimensions
      7. PCA for Compression
      8. Randomized PCA
      9. Incremental PCA
    4. Kernel PCA
      1. Selecting a Kernel and Tuning Hyperparameters
    5. LLE
    6. Other Dimensionality Reduction Techniques
    7. Exercises
  11. II. Neural Networks and Deep Learning
  12. 9. Up and Running with TensorFlow
    1. Installation
    2. Creating Your First Graph and Running It in a Session
    3. Managing Graphs
    4. Lifecycle of a Node Value
    5. Linear Regression with TensorFlow
    6. Implementing Gradient Descent
      1. Manually Computing the Gradients
      2. Using autodiff
      3. Using an Optimizer
    7. Feeding Data to the Training Algorithm
    8. Saving and Restoring Models
    9. Visualizing the Graph and Training Curves Using TensorBoard
    10. Name Scopes
    11. Modularity
    12. Sharing Variables
    13. Exercises
  13. 10. Introduction to Artificial Neural Networks
    1. From Biological to Artificial Neurons
      1. Biological Neurons
      2. Logical Computations with Neurons
      3. The Perceptron
      4. Multi-Layer Perceptron and Backpropagation
    2. Training an MLP with TensorFlow’s High-Level API
    3. Training a DNN Using Plain TensorFlow
      1. Construction Phase
      2. Execution Phase
      3. Using the Neural Network
    4. Fine-Tuning Neural Network Hyperparameters
      1. Number of Hidden Layers
      2. Number of Neurons per Hidden Layer
      3. Activation Functions
    5. Exercises
  14. 11. Training Deep Neural Nets
    1. Vanishing/Exploding Gradients Problems
      1. Xavier and He Initialization
      2. Nonsaturating Activation Functions
      3. Batch Normalization
      4. Gradient Clipping
    2. Reusing Pretrained Layers
      1. Reusing a TensorFlow Model
      2. Reusing Models from Other Frameworks
      3. Freezing the Lower Layers
      4. Caching the Frozen Layers
      5. Tweaking, Dropping, or Replacing the Upper Layers
      6. Model Zoos
      7. Unsupervised Pretraining
      8. Pretraining on an Auxiliary Task
    3. Faster Optimizers
      1. Momentum Optimization
      2. Nesterov Accelerated Gradient
      3. AdaGrad
      4. RMSProp
      5. Adam Optimization
      6. Learning Rate Scheduling
    4. Avoiding Overfitting Through Regularization
      1. Early Stopping
      2. ℓ1 and ℓ2 Regularization
      3. Dropout
      4. Max-Norm Regularization
      5. Data Augmentation
    5. Practical Guidelines
    6. Exercises
  15. 12. Distributing TensorFlow Across Devices and Servers
    1. Multiple Devices on a Single Machine
      1. Installation
      2. Managing the GPU RAM
      3. Placing Operations on Devices
      4. Parallel Execution
      5. Control Dependencies
    2. Multiple Devices Across Multiple Servers
      1. Opening a Session
      2. The Master and Worker Services
      3. Pinning Operations Across Tasks
      4. Sharding Variables Across Multiple Parameter Servers
      5. Sharing State Across Sessions Using Resource Containers
      6. Asynchronous Communication Using TensorFlow Queues
      7. Loading Data Directly from the Graph
    3. Parallelizing Neural Networks on a TensorFlow Cluster
      1. One Neural Network per Device
      2. In-Graph Versus Between-Graph Replication
      3. Model Parallelism
      4. Data Parallelism
    4. Exercises
  16. 13. Convolutional Neural Networks
    1. The Architecture of the Visual Cortex
    2. Convolutional Layer
      1. Filters
      2. Stacking Multiple Feature Maps
      3. TensorFlow Implementation
      4. Memory Requirements
    3. Pooling Layer
    4. CNN Architectures
      1. LeNet-5
      2. AlexNet
      3. GoogLeNet
      4. ResNet
    5. Exercises
  17. 14. Recurrent Neural Networks
    1. Recurrent Neurons
      1. Memory Cells
      2. Input and Output Sequences
    2. Basic RNNs in TensorFlow
      1. Static Unrolling Through Time
      2. Dynamic Unrolling Through Time
      3. Handling Variable-Length Input Sequences
      4. Handling Variable-Length Output Sequences
    3. Training RNNs
      1. Training a Sequence Classifier
      2. Training to Predict Time Series
      3. Creative RNN
    4. Deep RNNs
      1. Distributing a Deep RNN Across Multiple GPUs
      2. Applying Dropout
      3. The Difficulty of Training over Many Time Steps
    5. LSTM Cell
      1. Peephole Connections
    6. GRU Cell
    7. Natural Language Processing
      1. Word Embeddings
      2. An Encoder–Decoder Network for Machine Translation
    8. Exercises
  18. 15. Autoencoders
    1. Efficient Data Representations
    2. Performing PCA with an Undercomplete Linear Autoencoder
    3. Stacked Autoencoders
      1. TensorFlow Implementation
      2. Tying Weights
      3. Training One Autoencoder at a Time
      4. Visualizing the Reconstructions
      5. Visualizing Features
    4. Unsupervised Pretraining Using Stacked Autoencoders
    5. Denoising Autoencoders
      1. TensorFlow Implementation
    6. Sparse Autoencoders
      1. TensorFlow Implementation
    7. Variational Autoencoders
      1. Generating Digits
    8. Other Autoencoders
    9. Exercises
  19. 16. Reinforcement Learning
    1. Learning to Optimize Rewards
    2. Policy Search
    3. Introduction to OpenAI Gym
    4. Neural Network Policies
    5. Evaluating Actions: The Credit Assignment Problem
    6. Policy Gradients
    7. Markov Decision Processes
    8. Temporal Difference Learning and Q-Learning
      1. Exploration Policies
      2. Approximate Q-Learning and Deep Q-Learning
    9. Learning to Play Ms. Pac-Man Using the DQN Algorithm
    10. Exercises
    11. Thank You!
  20. A. Exercise Solutions
    1. Chapter 1: The Machine Learning Landscape
    2. Chapter 2: End-to-End Machine Learning Project
    3. Chapter 3: Classification
    4. Chapter 4: Training Models
    5. Chapter 5: Support Vector Machines
    6. Chapter 6: Decision Trees
    7. Chapter 7: Ensemble Learning and Random Forests
    8. Chapter 8: Dimensionality Reduction
    9. Chapter 9: Up and Running with TensorFlow
    10. Chapter 10: Introduction to Artificial Neural Networks
    11. Chapter 11: Training Deep Neural Nets
    12. Chapter 12: Distributing TensorFlow Across Devices and Servers
    13. Chapter 13: Convolutional Neural Networks
    14. Chapter 14: Recurrent Neural Networks
    15. Chapter 15: Autoencoders
    16. Chapter 16: Reinforcement Learning
  21. B. Machine Learning Project Checklist
    1. Frame the Problem and Look at the Big Picture
    2. Get the Data
    3. Explore the Data
    4. Prepare the Data
    5. Short-List Promising Models
    6. Fine-Tune the System
    7. Present Your Solution
    8. Launch!
  22. C. SVM Dual Problem
  23. D. Autodiff
    1. Manual Differentiation
    2. Symbolic Differentiation
    3. Numerical Differentiation
    4. Forward-Mode Autodiff
    5. Reverse-Mode Autodiff
  24. E. Other Popular ANN Architectures
    1. Hopfield Networks
    2. Boltzmann Machines
    3. Restricted Boltzmann Machines
    4. Deep Belief Nets
    5. Self-Organizing Maps
  25. Index

Product information

  • Title: Hands-On Machine Learning with Scikit-Learn and TensorFlow
  • Publisher(s): O'Reilly Media, Inc.