Practical Machine Learning Cookbook

Book Description

Building Machine Learning applications with R

About This Book

  • Implement a wide range of algorithms and techniques for tackling complex data

  • Improve the accuracy of your predictions and recommendations

  • Optimize performance of your machine-learning systems

Who This Book Is For

This book is for analysts, statisticians, and data scientists who know the fundamentals of machine learning and statistics and need help with the challenging scenarios they face every day in the field, including improving system performance and accuracy. Readers are assumed to have a good understanding of mathematics. A working knowledge of R is expected.

What You Will Learn

  • Get equipped with a deeper understanding of how to apply machine-learning techniques

  • Implement each of the advanced machine-learning techniques

  • Solve the real-life problems you encounter so that your applications produce better results

  • Gain hands-on experience in problem solving for your machine-learning systems

  • Understand the methods of collecting data, preparing data for usage, training the model, evaluating the model’s performance, and improving the model’s performance

In Detail

Machine learning has become the new black. The challenge in today’s world is the explosion of data, both from existing legacy sources and from incoming new structured and unstructured data. Discovering, understanding, analyzing, and predicting outcomes from this data with machine learning algorithms is complex. This cookbook will help you solve the everyday challenges you face as a data scientist. Applying various data science techniques to multiple data sets drawn from real-world challenges will help you appreciate the variety of techniques used in different situations.

The first half of the book provides recipes on fairly complex machine-learning systems, where you’ll learn to explore new application areas for machine learning and improve its efficiency. These include recipes on classification, neural networks, unsupervised and supervised learning, deep learning, reinforcement learning, and more.

The second half of the book focuses on three machine learning case studies, all based on real-world data, and offers solutions to specific machine-learning issues in each one.

Style and approach

Following a cookbook approach, we’ll teach you how to solve the everyday difficulties and struggles you encounter.

Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

    1. Practical Machine Learning Cookbook
      1. Practical Machine Learning Cookbook
      2. Credits
      3. About the Author
      4. About the Reviewer
      5. www.PacktPub.com
        1. Why subscribe?
      6. Customer Feedback
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Sections
          1. Getting ready
          2. How to do it…
        5. Conventions
        6. Reader feedback
        7. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book 
          3. Errata
          4. Piracy
          5. Questions
      8. 1. Introduction to Machine Learning
        1. What is machine learning?
        2. An overview of classification
        3. An overview of clustering
        4. An overview of supervised learning
        5. An overview of unsupervised learning
        6. An overview of reinforcement learning
        7. An overview of structured prediction
        8. An overview of neural networks
        9. An overview of deep learning
      9. 2. Classification
        1. Introduction
        2. Discriminant function analysis - geological measurements on brines from wells
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - transforming data
            3. Step 4 - training the model
            4. Step 5 - classifying the data
            5. Step 6 - evaluating the model
        3. Multinomial logistic regression - understanding program choices made by students
          1. Getting ready
            1. Step 1 - collecting data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - training the model
            3. Step 4 - testing the results of the model
            4. Step 5 - model improvement performance
        4. Tobit regression - measuring the students' academic aptitude
          1. Getting ready
            1. Step 1 - collecting data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - plotting data
            3. Step 4 - exploring relationships
            4. Step 5 - training the model
            5. Step 6 - testing the model
        5. Poisson regression - understanding species present in Galapagos Islands
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - plotting data and testing empirical data
            3. Step 4 - rectifying discretization of the Poisson model
            4. Step 5 - training and evaluating the model using the link function
            5. Step 6 - revaluating using the Poisson model
            6. Step 7 - revaluating using the linear model
      10. 3. Clustering
        1. Introduction
        2. Hierarchical clustering - World Bank sample dataset
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - transforming data
            3. Step 4 - training and evaluating the model performance
            4. Step 5 - plotting the model
        3. Hierarchical clustering - Amazon rainforest burned between 1999-2010
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - transforming data
            3. Step 4 - training and evaluating model performance
            4. Step 5 - plotting the model
            5. Step 6 - improving model performance
        4. Hierarchical clustering - gene clustering
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - transforming data
            3. Step 4 - training the model
            4. Step 5 - plotting the model
        5. Binary clustering - math test
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - training and evaluating model performance
            3. Step 4 - plotting the model
            4. Step 5 - K-medoids clustering
        6. K-means clustering - European countries protein consumption
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - clustering
            3. Step 4 - improving the model
        7. K-means clustering - foodstuff
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - transforming data
            3. Step 4 - clustering
            4. Step 5 - visualizing the clusters
      11. 4. Model Selection and Regularization
        1. Introduction
        2. Shrinkage methods - calories burned per day
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - building the model
            3. Step 4 - improving the model
            4. Step 5 - comparing the model
        3. Dimension reduction methods - Delta's Aircraft Fleet
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - applying principal components analysis
            3. Step 4 - scaling the data
            4. Step 5 - visualizing in 3D plot
        4. Principal component analysis - understanding world cuisine
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - preparing data
            3. Step 4 - applying principal components analysis
      12. 5. Nonlinearity
        1. Generalized additive models - measuring the household income of New Zealand
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - setting up the data for the model
            3. Step 4 - building the model
        2. Smoothing splines - understanding cars and speed
          1. How to do it...
            1. Step 1 - exploring the data
            2. Step 2 - creating the model
            3. Step 3 - fitting the smooth curve model
            4. Step 4 - plotting the results
        3. Local regression - understanding drought warnings and impact
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - collecting and exploring data
            2. Step 3 - calculating the moving average
            3. Step 4 - calculating percentiles
            4. Step 5 - plotting results
      13. 6. Supervised Learning
        1. Introduction
        2. Decision tree learning - Advance Health Directive for patients with chest pain
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - preparing the data
            3. Step 4 - training the model
            4. Step 5 - improving the model
        3. Decision tree learning - income-based distribution of real estate values
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - training the model
            3. Step 4 - comparing the predictions
            4. Step 5 - improving the model
        4. Decision tree learning - predicting the direction of stock movement
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - calculating the indicators
            3. Step 4 - preparing variables to build datasets
            4. Step 5 - building the model
            5. Step 6 - improving the model
        5. Naive Bayes - predicting the direction of stock movement
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - preparing variables to build datasets
            3. Step 4 - building the model
            4. Step 5 - creating data for a new, improved model
            5. Step 6 - improving the model
        6. Random forest - currency trading strategy
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - preparing variables to build datasets
            3. Step 4 - building the model
        7. Support vector machine - currency trading strategy
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - calculating the indicators
            3. Step 4 - preparing variables to build datasets
            4. Step 5 - building the model
        8. Stochastic gradient descent - adult income
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - preparing the data
            3. Step 4 - building the model
            4. Step 5 - plotting the model
      14. 7. Unsupervised Learning
        1. Introduction
        2. Self-organizing map - visualizing of heatmaps
          1. How to do it...
            1. Step 1 - exploring data
            2. Step 2 - training the model
            3. Step 3 - plotting the model
        3. Vector quantization - image clustering
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - data cleaning
            3. Step 4 - visualizing cleaned data
            4. Step 5 - building the model and visualizing it
      15. 8. Reinforcement Learning
        1. Introduction
        2. Markov chains - the stocks regime switching model
          1. Getting ready
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - preparing the regression model
            3. Step 4 - preparing the Markov-switching model
            4. Step 5 - plotting the regime probabilities
            5. Step 6 - testing the Markov switching model
        3. Markov chains - the multi-channel attribution model
          1. Getting ready
          2. How to do it...
            1. Step 1 - preparing the dataset
            2. Step 2 - preparing the model
            3. Step 3 - plotting the Markov graph
            4. Step 4 - simulating the dataset of customer journeys
            5. Step 5 - preparing a transition matrix heat map for real data
        4. Markov chains - the car rental agency service
          1. How to do it...
            1. Step 1 - preparing the dataset
            2. Step 2 - preparing the model
            3. Step 3 - improving the model
        5. Continuous Markov chains - vehicle service at a gas station
          1. Getting ready
          2. How to do it...
            1. Step 1 - preparing the dataset
            2. Step 2 - computing the theoretical resolution
            3. Step 3 - verifying the convergence of a theoretical solution
            4. Step 4 - plotting the results
        6. Monte Carlo simulations - calibrated Hull and White short-rates
          1. Getting ready
            1. Step 1 - installing the packages and libraries
          2. How to do it...
            1. Step 2 - initializing the data and variables
            2. Step 3 - pricing the Bermudan swaptions
            3. Step 4 - constructing the spot term structure of interest rates
            4. Step 5 - simulating Hull-White short-rates
      16. 9. Structured Prediction
        1. Introduction
        2. Hidden Markov models - EUR and USD
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - turning data into a time series
            3. Step 4 - building the model
            4. Step 5 - displaying the results
        3. Hidden Markov models - regime detection
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - preparing the model
      17. 10. Neural Networks
        1. Introduction
        2. Modelling SP 500
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - calculating the indicators
            3. Step 4 - preparing data for model building
            4. Step 5 - building the model
        3. Measuring the unemployment rate
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - preparing and verifying the models
            3. Step 4 - forecasting and testing the accuracy of the models built
      18. 11. Deep Learning
        1. Introduction
        2. Recurrent neural networks - predicting periodic signals
          1. Getting ready...
          2. How to do it...
      19. 12. Case Study - Exploring World Bank Data
        1. Introduction
        2. Exploring World Bank data
          1. Getting ready...
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - downloading the data
            2. Step 3 - exploring data
            3. Step 4 - building the models
            4. Step 5 - plotting the models
      20. 13. Case Study - Pricing Reinsurance Contracts
        1. Introduction
        2. Pricing reinsurance contracts
          1. Getting ready...
            1. Step 1 - collecting and describing the data
          2. How to do it...
            1. Step 2 - exploring the data
            2. Step 3 - calculating the individual loss claims
            3. Step 4 - calculating the number of hurricanes
            4. Step 5 - building predictive models
            5. Step 6 - calculating the pure premium of the reinsurance contract
      21. 14. Case Study - Forecast of Electricity Consumption
        1. Introduction
          1. Getting ready
            1. Step 1 - collecting and describing data
          2. How to do it...
            1. Step 2 - exploring data
            2. Step 3 - time series - regression analysis
            3. Step 4 - time series - improving regression analysis
            4. Step 5 - building a forecasting model
            5. Step 6 - plotting the forecast for a year