Scala for Machine Learning

Book Description

Leverage Scala and Machine Learning to construct and study systems that can learn from data

In Detail

The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, engineering design, biometrics, and trading strategies to the detection of genetic anomalies.

The book begins with an introduction to the capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to the creation of machine learning algorithms.

Next, you'll learn about data preprocessing and filtering techniques. Following this, you'll move on to clustering and dimension reduction, Naïve Bayes, regression models, sequential data, regularization and kernelization, support vector machines, neural networks, genetic algorithms, and reinforcement learning. A review of the Akka framework and Apache Spark clusters concludes the tutorial.

What You Will Learn

  • Build dynamic workflows for scientific computing
  • Leverage open source libraries to extract patterns from time series
  • Write your own classification, clustering, or evolutionary algorithm
  • Perform relative performance tuning and evaluation of Spark
  • Master probabilistic models for sequential data
  • Experiment with advanced techniques such as regularization and kernelization
  • Solve big data problems with Scala parallel collections, Akka actors, and Apache Spark clusters
  • Apply key learning strategies to a technical analysis of financial markets

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

    Table of Contents

    1. Scala for Machine Learning
      1. Table of Contents
      2. Scala for Machine Learning
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Getting Started
        1. Mathematical notation for the curious
        2. Why machine learning?
          1. Classification
          2. Prediction
          3. Optimization
          4. Regression
        3. Why Scala?
          1. Abstraction
          2. Scalability
          3. Configurability
          4. Maintainability
          5. Computation on demand
        4. Model categorization
        5. Taxonomy of machine learning algorithms
          1. Unsupervised learning
            1. Clustering
            2. Dimension reduction
          2. Supervised learning
            1. Generative models
            2. Discriminative models
          3. Reinforcement learning
        6. Tools and frameworks
          1. Java
          2. Scala
          3. Apache Commons Math
            1. Description
            2. Licensing
            3. Installation
          4. JFreeChart
            1. Description
            2. Licensing
            3. Installation
          5. Other libraries and frameworks
        7. Source code
          1. Context versus view bounds
          2. Presentation
          3. Primitives and implicits
            1. Primitive types
            2. Type conversions
            3. Operators
          4. Immutability
          5. Performance of Scala iterators
        8. Let's kick the tires
          1. Overview of computational workflows
          2. Writing a simple workflow
            1. Selecting a dataset
            2. Loading the dataset
            3. Preprocessing the dataset
              1. Basic statistics
              2. Normalization and Gauss distribution
              3. Plotting data
            4. Creating a model (learning)
            5. Classify the data
        9. Summary
      9. 2. Hello World!
        1. Modeling
          1. A model by any other name
          2. Model versus design
          3. Selecting a model's features
          4. Extracting features
        2. Designing a workflow
          1. The computational framework
          2. The pipe operator
          3. Monadic data transformation
          4. Dependency injection
          5. Workflow modules
          6. The workflow factory
          7. Examples of workflow components
            1. The preprocessing module
            2. The clustering module
        3. Assessing a model
          1. Validation
            1. Key metrics
            2. Implementation
          2. K-fold cross-validation
          3. Bias-variance decomposition
          4. Overfitting
        4. Summary
      10. 3. Data Preprocessing
        1. Time series
        2. Moving averages
          1. The simple moving average
          2. The weighted moving average
          3. The exponential moving average
        3. Fourier analysis
          1. Discrete Fourier transform (DFT)
          2. DFT-based filtering
          3. Detection of market cycles
        4. The Kalman filter
          1. The state space estimation
            1. The transition equation
            2. The measurement equation
          2. The recursive algorithm
            1. Prediction
            2. Correction
            3. Kalman smoothing
            4. Experimentation
        5. Alternative preprocessing techniques
        6. Summary
      11. 4. Unsupervised Learning
        1. Clustering
          1. K-means clustering
            1. Measuring similarity
            2. Overview of the K-means algorithm
            3. Step 1 – cluster configuration
              1. Defining clusters
              2. Defining K-means
              3. Initializing clusters
            4. Step 2 – cluster assignment
            5. Step 3 – iterative reconstruction
            6. Curse of dimensionality
            7. Experiment
            8. Tuning the number of clusters
            9. Validation
          2. Expectation-maximization (EM) algorithm
            1. Gaussian mixture model
            2. EM overview
            3. Implementation
            4. Testing
            5. Online EM
        2. Dimension reduction
          1. Principal components analysis (PCA)
            1. Algorithm
            2. Implementation
            3. Test case
            4. Evaluation
          2. Other dimension reduction techniques
        3. Performance considerations
          1. K-means
          2. EM
          3. PCA
        4. Summary
      12. 5. Naïve Bayes Classifiers
        1. Probabilistic graphical models
        2. Naïve Bayes classifiers
          1. Introducing the multinomial Naïve Bayes
            1. Formalism
            2. The frequentist perspective
            3. The predictive model
            4. The zero-frequency problem
          2. Implementation
            1. Software design
            2. Training
            3. Classification
            4. Labeling
            5. Results
        3. Multivariate Bernoulli classification
          1. Model
          2. Implementation
        4. Naïve Bayes and text mining
          1. Basics of information retrieval
          2. Implementation
            1. Extraction of terms
            2. Scoring of terms
          3. Testing
            1. Retrieving textual information
            2. Evaluation
        5. Pros and cons
        6. Summary
      13. 6. Regression and Regularization
        1. Linear regression
          1. One-variate linear regression
            1. Implementation
            2. Test case
          2. Ordinary least squares (OLS) regression
            1. Design
            2. Implementation
            3. Test case 1 – trending
            4. Test case 2 – features selection
        2. Regularization
          1. Ln roughness penalty
          2. The ridge regression
            1. Implementation
            2. The test case
        3. Numerical optimization
        4. The logistic regression
          1. The logit function
          2. Binomial classification
          3. Software design
          4. The training workflow
            1. Configuring the least squares optimizer
            2. Computing the Jacobian matrix
            3. Defining the exit conditions
            4. Defining the least squares problem
            5. Minimizing the loss function
            6. Test
          5. Classification
        5. Summary
      14. 7. Sequential Data Models
        1. Markov decision processes
          1. The Markov property
          2. The first-order discrete Markov chain
        2. The hidden Markov model (HMM)
          1. Notation
          2. The lambda model
          3. HMM execution state
          4. Evaluation (CF-1)
            1. Alpha class (the forward variable)
            2. Beta class (the backward variable)
          5. Training (CF-2)
            1. Baum-Welch estimator (EM)
          6. Decoding (CF-3)
            1. The Viterbi algorithm
          7. Putting it all together
          8. Test case
          9. The hidden Markov model for time series analysis
        3. Conditional random fields
          1. Introduction to CRF
          2. Linear chain CRF
        4. CRF and text analytics
          1. The feature functions model
          2. Software design
          3. Implementation
            1. Building the training set
            2. Generating tags
            3. Extracting data sequences
            4. CRF control parameters
            5. Putting it all together
          4. Tests
            1. The training convergence profile
            2. Impact of the size of the training set
            3. Impact of the L2 regularization factor
        5. Comparing CRF and HMM
        6. Performance consideration
        7. Summary
      15. 8. Kernel Models and Support Vector Machines
        1. Kernel functions
          1. Overview
          2. Common discriminative kernels
        2. The support vector machine (SVM)
          1. The linear SVM
            1. The separable case (hard margin)
            2. The nonseparable case (soft margin)
          2. The nonlinear SVM
            1. Max-margin classification
            2. The kernel trick
        3. Support vector classifier (SVC)
          1. The binary SVC
            1. LIBSVM
            2. Software design
            3. Configuration parameters
              1. SVM Formulation
              2. The SVM kernel function
              3. SVM execution
            4. SVM implementation
            5. C-penalty and margin
            6. Kernel evaluation
            7. Application to risk analysis
              1. Features and labels
        4. Anomaly detection with one-class SVC
        5. Support vector regression (SVR)
          1. Overview
          2. SVR versus linear regression
        6. Performance considerations
        7. Summary
      16. 9. Artificial Neural Networks
        1. Feed-forward neural networks (FFNN)
          1. The Biological background
          2. The mathematical background
        2. The multilayer perceptron (MLP)
          1. The activation function
          2. The network architecture
          3. Software design
          4. Model definition
            1. Layers
            2. Synapses
            3. Connections
          5. Training cycle/epoch
            1. Step 1 – input forward propagation
              1. The computational model
              2. Objective
              3. Softmax
            2. Step 2 – sum of squared errors
            3. Step 3 – error backpropagation
              1. Error propagation
              2. The computational model
            4. Step 4 – synapse/weights adjustment
              1. Momentum factor for gradient descent
              2. Implementation
            5. Step 5 – convergence criteria
            6. Configuration
            7. Putting all together
          6. Training strategies and classification
            1. Online versus batch training
            2. Regularization
            3. Model instantiation
            4. Prediction
        3. Evaluation
          1. Impact of learning rate
          2. Impact of the momentum factor
          3. Test case
            1. Implementation
            2. Models evaluation
            3. Impact of hidden layers architecture
        4. Benefits and limitations
        5. Summary
      17. 10. Genetic Algorithms
        1. Evolution
          1. The origin
          2. NP problems
          3. Evolutionary computing
        2. Genetic algorithms and machine learning
        3. Genetic algorithm components
          1. Encodings
            1. Value encoding
            2. Predicate encoding
            3. Solution encoding
            4. The encoding scheme
              1. Flat encoding
              2. Hierarchical encoding
          2. Genetic operators
            1. Selection
            2. Crossover
            3. Mutation
          3. Fitness score
        4. Implementation
          1. Software design
          2. Key components
          3. Selection
          4. Controlling population growth
          5. GA configuration
          6. Crossover
            1. Population
            2. Chromosomes
            3. Genes
          7. Mutation
            1. Population
            2. Chromosomes
            3. Genes
          8. The reproduction cycle
        5. GA for trading strategies
          1. Definition of trading strategies
            1. Trading operators
            2. The cost/unfitness function
            3. Trading signals
            4. Trading strategies
            5. Signal encoding
          2. Test case
            1. Data extraction
            2. Initial population
            3. Configuration
            4. GA instantiation
            5. GA execution
            6. Tests
              1. The unweighted score
              2. The weighted score
        6. Advantages and risks of genetic algorithms
        7. Summary
      18. 11. Reinforcement Learning
        1. Introduction
          1. The problem
          2. A solution – Q-learning
            1. Terminology
            2. Concept
            3. Value of policy
            4. Bellman optimality equations
            5. Temporal difference for model-free learning
            6. Action-value iterative update
          3. Implementation
            1. Software design
            2. States and actions
            3. Search space
            4. Policy and action-value
            5. The Q-learning training
            6. Tail recursion to the rescue
            7. Prediction
          4. Option trading using Q-learning
            1. Option property
            2. Option model
            3. Function approximation
            4. Constrained state-transition
            5. Putting it all together
          5. Evaluation
          6. Pros and cons of reinforcement learning
        2. Learning classifier systems
          1. Introduction to LCS
          2. Why LCS
          3. Terminology
          4. Extended learning classifier systems (XCS)
          5. XCS components
            1. Application to portfolio management
            2. XCS core data
            3. XCS rules
            4. Covering
            5. Example of implementation
          6. Benefits and limitation of learning classifier systems
        3. Summary
      19. 12. Scalable Frameworks
        1. Overview
        2. Scala
          1. Controlling object creation
          2. Parallel collections
            1. Processing a parallel collection
            2. Benchmark framework
            3. Performance evaluation
        3. Scalability with Actors
          1. The Actor model
          2. Partitioning
          3. Beyond actors – reactive programming
        4. Akka
          1. Master-workers
            1. Messages exchange
            2. Worker actors
            3. The workflow controller
            4. The master Actor
            5. Master with routing
            6. Distributed discrete Fourier transform
            7. Limitations
          2. Futures
            1. The Actor life cycle
            2. Blocking on futures
            3. Handling future callbacks
            4. Putting all together
        5. Apache Spark
          1. Why Spark
          2. Design principles
            1. In-memory persistency
            2. Laziness
            3. Transforms and Actions
            4. Shared variables
          3. Experimenting with Spark
            1. Deploying Spark
            2. Using Spark shell
            3. MLlib
            4. RDD generation
            5. K-means using Spark
          4. Performance evaluation
            1. Tuning parameters
            2. Tests
            3. Performance considerations
          5. Pros and cons
          6. 0xdata Sparkling Water
        6. Summary
      20. A. Basic Concepts
        1. Scala programming
          1. List of libraries
          2. Format of code snippets
          3. Encapsulation
          4. Class constructor template
          5. Companion objects versus case classes
          6. Enumerations versus case classes
          7. Overloading
          8. Design template for classifiers
          9. Data extraction
          10. Data sources
          11. Extraction of documents
          12. Matrix class
        2. Mathematics
          1. Linear algebra
            1. QR Decomposition
            2. LU factorization
            3. LDL decomposition
            4. Cholesky factorization
            5. Singular value decomposition
            6. Eigenvalue decomposition
            7. Algebraic and numerical libraries
          2. First order predicate logic
          3. Jacobian and Hessian matrices
          4. Summary of optimization techniques
            1. Gradient descent methods
              1. Steepest descent
              2. Conjugate gradient
              3. Stochastic gradient descent
            2. Quasi-Newton algorithms
              1. BFGS
              2. L-BFGS
            3. Nonlinear least squares minimization
              1. Gauss-Newton
              2. Levenberg-Marquardt
            4. Lagrange multipliers
          5. Overview of dynamic programming
        3. Finances 101
          1. Fundamental analysis
          2. Technical analysis
            1. Terminology
            2. Trading signals and strategy
            3. Price patterns
          3. Options trading
          4. Financial data sources
        4. Suggested online courses
        5. References
      21. Index