Deep Learning

Book Description

Looking for one central source where you can learn key findings on machine learning? Deep Learning: A Practitioner's Approach provides developers and data scientists with the most practical information available on the subject, including deep learning theory, best practices, and use cases.

Table of Contents

  1. Preface
    1. What’s in This Book?
    2. Who is “The Practitioner”?
    3. Who Should Read This Book?
      1. The Enterprise Machine Learning Practitioner
      2. The Enterprise Executive
      3. The Academic
    4. Conventions Used in This Book
    5. Using Code Examples
    6. Administrative Notes
    7. O’Reilly Safari
    8. How to Contact Us
    9. Acknowledgements
      1. Josh’s Acknowledgements
      2. Adam’s Acknowledgements
  2. 1. A Review of Machine Learning
    1. The Learning Machines
      1. How Can Machines Learn?
      2. Biological Inspiration
      3. What is Deep Learning?
      4. Going Down the Rabbit Hole
    2. Framing the Questions
    3. The Math behind Machine Learning: Linear Algebra
      1. Scalars
      2. Vectors
      3. Matrices
      4. Tensors
      5. Hyperplanes
      6. Relevant Mathematical Operations
      7. Converting Data Into Vectors
      8. Solving Systems of Equations
    4. The Math Behind Machine Learning: Statistics
      1. Probability
      2. Conditional Probabilities
      3. Posterior Probability
      4. Distributions
      5. Samples vs Population
      6. Resampling Methods
      7. Selection Bias
      8. Likelihood
    5. How Does Machine Learning Work?
      1. Regression
      2. Classification
      3. Clustering
      4. Underfitting and Overfitting
      5. Optimization
      6. Convex Optimization
      7. Gradient Descent
      8. Stochastic Gradient Descent
      9. Quasi-Newton Optimization Methods
      10. Generative vs Discriminative Models
    6. Logistic Regression
      1. The Logistic Function
      2. Understanding Logistic Regression Output
    7. Evaluating Models
      1. The Confusion Matrix
    8. Building an Understanding of Machine Learning
  3. 2. Foundations of Neural Networks and Deep Learning
    1. Neural Networks
      1. The Biological Neuron
      2. The Perceptron
      3. Multi-Layer Feed-Forward Networks
    2. Training Neural Networks
      1. Backpropagation Learning
    3. Activation Functions
      1. Linear
      2. Sigmoid
      3. Tanh
      4. Hard Tanh
      5. Softmax
      6. Rectified Linear
    4. Loss Functions
      1. Loss Function Notation
      2. Loss Functions for Regression
      3. Loss Functions for Classification
      4. Loss Functions for Reconstruction
    5. Hyperparameters
      1. Learning Rate
      2. Regularization
      3. Momentum
      4. Sparsity
  4. 3. Fundamentals of Deep Networks
    1. Defining Deep Learning
      1. What is Deep Learning?
      2. Organization of This Chapter
    2. Common Architectural Principles of Deep Networks
      1. Parameters
      2. Layers
      3. Activation Functions
      4. Loss Functions
      5. Optimization Algorithms
      6. Hyperparameters
      7. Summary
    3. Building Blocks of Deep Networks
      1. Restricted Boltzmann Machines
      2. Autoencoders
      3. Variational Autoencoders
  5. 4. Major Architectures of Deep Networks
    1. Unsupervised Pre-Trained Networks
      1. Deep Belief Networks
      2. Generative Adversarial Networks
    2. Convolutional Neural Networks
      1. Biological Inspiration
      2. Intuition
      3. Convolutional Network Architecture Overview
      4. Input Layers
      5. Convolutional Layers
      6. Pooling Layers
      7. Fully-Connected Layers
      8. Other Applications of Convolutional Networks
      9. Convolutional Network Architectures of Note
      10. Summary
    3. Recurrent Neural Networks
      1. Modeling the Time Dimension
      2. 3D Volumetric Input
      3. Why Not Markov Models?
      4. General Recurrent Network Architecture
      5. Long Short-Term Memory (LSTM) Networks
      6. Domain Specific Applications and Blended Networks
    4. Recursive Neural Networks
      1. Network Architecture
      2. Varieties of Recursive Neural Networks
      3. Applications of Recursive Neural Networks
    5. Summary and Discussion
      1. Will Deep Learning Make Other Algorithms Obsolete?
      2. Different Problems Have Different Best Methods
      3. When Do I Need Deep Learning?
  6. 5. Building Deep Networks
    1. Matching Deep Networks to the Right Problem
      1. Columnar Data and Multi-Layer Perceptrons
      2. Images and Convolutional Neural Networks
      3. Timeseries Sequences and Recurrent Neural Networks
      4. Using Hybrid Networks
    2. The DL4J Suite of Tools
      1. Vectorization and DataVec
      2. Runtimes and ND4J
    3. Basic Concepts of the DL4J API
      1. Loading and Saving Models
      2. Getting Input For the Model
      3. Setting Up Model Architecture
      4. Training and Evaluation
    4. Modeling CSV Data with Multi-Layer Perceptron Networks
      1. Setting Up Input Data
      2. Determining Network Architecture
      3. Training the Model
      4. Evaluating the Model
    5. Modeling Hand-Written Images with Convolutional Neural Networks
      1. Java Code Listing for LeNet Convolutional Network
      2. Loading and Vectorizing the Input Images
      3. Network Architecture for LeNet in DL4J
      4. Training the Convolutional Network
    6. Modeling Sequence Data with Recurrent Neural Networks
      1. Generating Shakespeare with LSTMs
      2. Classifying Sensor Timeseries Sequences with LSTMs
    7. Using Autoencoders for Anomaly Detection
      1. Java Code Listing for Autoencoder Example
      2. Setting Up Input Data
      3. Autoencoder Network Architecture and Training
      4. Evaluating the Model
    8. Using Variational Autoencoders to Reconstruct MNIST Digits
      1. Code Listing to Reconstruct MNIST Digits
      2. Examining the VAE Model
    9. Applications of Deep Learning in Natural Language Processing
      1. Learning Word Embeddings with Word2Vec
      2. Distributed Representations of Sentences with Paragraph Vectors
      3. Using Paragraph Vectors for Document Classification
  7. 6. Tuning Deep Networks
    1. Basic Concepts in Tuning Deep Networks
      1. An Intuition for Building Deep Networks
      2. Building the Intuition as a Step-by-Step Process
    2. Matching Input Data and Network Architectures
      1. Summary
    3. Relating Model Goal and Output Layers
      1. Regression Model Output Layer
      2. Classification Model Output Layer
    4. Working with Layer Count, Parameter Count, and Memory
      1. Feed-Forward Multi-Layer Networks
      2. Controlling Layer and Parameter Counts
      3. Estimating Network Memory Requirements
    5. Weight Initialization Strategies
    6. Using Activation Functions
      1. Summary Table for Activation Functions
    7. Applying Loss Functions
    8. Understanding Learning Rates
      1. Using the Ratio of Parameters to Updates
      2. Specific Recommendations for Learning Rates
    9. How Sparsity Affects Learning
    10. Applying Methods of Optimization
      1. Stochastic Gradient Descent Best Practices
    11. Leveraging Parallelization and GPUs for Faster Training
      1. Online Learning and Parallel Iterative Algorithms
      2. Parallelizing Stochastic Gradient Descent in DL4J
      3. GPUs
    12. Controlling Epochs and Mini-Batch Size
      1. Understanding Mini-Batch Size Tradeoffs
    13. How to Use Regularization
      1. Priors as Regularizers
      2. Max-Norm Regularization
      3. Dropout
      4. Other Topics in Regularization
    14. Working with Class Imbalance
      1. Methods for Sampling Classes
      2. Weighted Loss Functions
    15. Dealing with Overfitting
    16. Using Network Statistics from the Tuning UI
      1. Detecting Poor Weight Initialization
      2. Detecting Non-Shuffled Data
      3. Detecting Issues with Regularization
  8. 7. Tuning Specific Deep Network Architectures
    1. Convolutional Neural Networks
      1. Common Convolutional Architectural Patterns
      2. Configuring Convolutional Layers
      3. Configuring Pooling Layers
      4. Transfer Learning
    2. Recurrent Neural Networks
      1. Network Input Data and Input Layers
      2. Output Layers and RnnOutputLayer
      3. Training the Network
      4. Debugging Common Issues with LSTMs
      5. Padding and Masking
      6. Evaluation and Scoring With Masking
      7. Variants of Recurrent Network Architectures
    3. Restricted Boltzmann Machines
      1. Hidden Units and Modeling Available Information
      2. Leveraging Different Units
      3. Using Regularization with RBMs
    4. Deep Belief Networks
      1. Using Momentum
      2. Using Regularization
      3. Determining Hidden Unit Count
  9. 8. Vectorization
    1. Introduction to Vectorization in Machine Learning
      1. Why Do We Need to Vectorize Data?
      2. Strategies For Dealing with Columnar Raw Data Attributes
      3. Feature Engineering and Normalization Techniques
    2. Leveraging DataVec for ETL and Vectorization
    3. Vectorizing Image Data
      1. Image Data Representation in DL4J
      2. Image Data and Vector Normalization with DataVec
    4. Working with Sequential Data in Vectorization
      1. Major Variations of Sequential Data Sources
      2. Vectorizing Sequential Data with DataVec
    5. Working with Text in Vectorization
      1. Bag of Words
      2. Term Frequency Inverse Document Frequency (TF-IDF)
      3. Word2vec and Vector Space Model Comparison
    6. Working with Graphs
  10. 9. Using Deep Learning and DL4J on Spark
    1. Introduction to Using DL4J with Spark and Hadoop
      1. Operating Spark from the Command Line
    2. Configuring and Tuning Spark Execution
      1. Running Spark on Mesos
      2. Running Spark on YARN
      3. General Spark Tuning Guide
      4. Tuning DL4J Jobs on Spark
    3. Setting Up a Maven POM for Spark and DL4J
      1. A Pom.xml File Dependency Template
      2. Setting Up a POM File for CDH 5.X
      3. Setting Up a POM File for HDP 2.4
    4. Troubleshooting Spark and Hadoop
      1. Common Issues with ND4J
    5. DL4J Parallel Execution on Spark
      1. A Minimal Spark Training Example
    6. DL4J API Best Practices for Spark
    7. Multi-Layer Perceptron Spark Example
      1. Setting Up MLP Network Architecture for Spark
      2. Distributed Training and Model Evaluation
      3. Building and Executing a DL4J Spark Job
    8. Generating Shakespeare Text with Spark and LSTMs
      1. Setting Up the LSTM Network Architecture
      2. Training, Tracking Progress, and Understanding Results
    9. Modeling MNIST with a Convolutional Neural Network on Spark
      1. Configuring the Spark Job and Loading MNIST Data
      2. Setting Up the LeNet CNN Architecture and Training
  11. A. What is Artificial Intelligence?
    1. The Story So Far
      1. Defining Deep Learning
      2. Defining Artificial Intelligence
    2. What is Driving Interest Today in Artificial Intelligence?
    3. Winter Is Coming
  12. B. RL4J and Reinforcement Learning
    1. Preliminaries
      1. Markov Decision Process
      2. Terminology
    2. Different Settings
      1. Model-Free
      2. Observation Setting
      3. Single Player and Adversarial Games
    3. Q-Learning
      1. From Policy to Neural Network
      2. Policy Iteration
      3. Exploration vs Exploitation
      4. Bellman Equation
      5. Initial State Sampling
      6. Q-Learning Implementation
      7. Modeling Q(s, a)
      8. Experience Replay
      9. Convolutional Layers and Image Preprocessing
      10. History Processing
      11. Double Q-Learning
      12. Clipping
      13. Scaling Rewards
      14. Prioritized Replay
    4. Graph, Visualization, and Mean-Q
    5. RL4J
    6. Conclusion
  13. C. Numbers Everyone Should Know
  14. D. Neural Networks and Backpropagation: A Mathematical Approach
    1. Introduction
    2. Backpropagation in a Multi-Layer Perceptron
  15. E. Using the ND4J API
    1. Design and Basic Usage
      1. Understanding NDArrays
      2. ND4J General Syntax
      3. The Basics of Working with NDArrays
      4. DataSet
    2. Creating Input Vectors
      1. Basics of Vector Creation
    3. Using MLLibUtil
      1. Converting From INDArray to MLLib Vector
      2. Converting from MLLib Vector to INDArray
    4. Making Model Predictions with DL4J
      1. Using DL4J and ND4J Together
  16. F. Using DataVec
    1. Loading Data for Machine Learning
    2. Loading CSV Data for Multi-Layer Perceptrons
    3. Loading Image Data for Convolutional Neural Networks
    4. Loading Sequence Data for Recurrent Neural Networks
    5. Transforming Data: Data Wrangling with DataVec
      1. DataVec Transforms: Key Concepts
      2. DataVec Transform Functionality: An Example
  17. G. Working with DL4J From Source
    1. Verifying Git is Installed
    2. Cloning Key DL4J GitHub Projects
    3. Downloading Source Via Zip File
    4. Using Maven to Build Source Code
  18. H. Setting Up DL4J Projects
    1. Creating a New DL4J Project
      1. Java
      2. Working with Maven
      3. Integrated Development Environments
    2. Setting Up Other Maven POMs
      1. ND4J and Maven
  19. I. Setting Up GPUs for DL4J Projects
    1. Switching Backends to GPU
      1. Picking a GPU
      2. Training on a Multiple GPU System
    2. CUDA on Different Platforms
      1. CUDA on Linux
      2. CUDA on Windows
      3. CUDA on OSX
    3. Monitoring GPU Performance
      1. Nvidia System Management Interface (SMI)
  20. J. Troubleshooting DL4J Installations
    1. Previous Installation
    2. Memory Errors When Installing From Source
    3. Older Versions of Maven
    4. Maven and Path Variables
    5. Bad JDK Versions
    6. C++ and Other Development Tools
    7. Windows and Include Paths
    8. Monitoring GPUs
    9. Using the JVisualVM
    10. Working with Clojure
    11. OSX and Float Support
    12. Fork-Join Bug in Java 7
    13. Precautions
      1. Other Local Repositories
      2. Check Maven Dependencies
      3. Reinstall Dependencies
      4. If All Else Fails
    14. Different Platforms
      1. OSX
      2. Windows
      3. Linux
  21. Index