Mastering Machine Learning with R

Book Description

Master machine learning techniques with R to deliver insights for complex projects

About This Book

  • Get to grips with the application of Machine Learning methods using an extensive set of R packages

  • Understand the benefits and potential pitfalls of using machine learning methods

  • Implement the numerous powerful features offered by R with this comprehensive guide to building an independent R-based ML system

Who This Book Is For

If you want to learn how to use R's machine learning capabilities to solve complex business problems, then this book is for you. Some experience with R and a working knowledge of basic statistics or machine learning will prove helpful.

What You Will Learn

  • Gain deep insight into how machine learning tools are applied in industry

  • Manipulate data in R efficiently to prepare it for analysis

  • Learn to recognize techniques for effective data visualization

  • Understand why and how to create test and training data sets for analysis

  • Familiarize yourself with fundamental learning methods such as linear and logistic regression

  • Comprehend advanced learning methods such as support vector machines

  • Realize why and how to apply unsupervised learning methods

In Detail

Machine learning is a field of artificial intelligence concerned with building systems that learn from data. Given the growing prominence of R, a cross-platform, zero-cost statistical programming environment, there has never been a better time to start applying machine learning to your data.

The book starts with an introduction to the Cross-Industry Standard Process for Data Mining (CRISP-DM). It then takes you through multivariate regression in detail. Moving on, you will address classification and regression trees and learn a couple of unsupervised techniques. Finally, the book walks you through text analysis and time series.

The book delivers practical, real-world solutions to a variety of problems and tasks, such as complex recommendation systems. By the end of this book, you will have gained expertise in machine learning with R and will be able to build complex ML projects using R and its packages.

Style and approach

This book explains complicated concepts with easy-to-follow theory and real-world, practical applications. It demonstrates the power of R and machine learning extensively while highlighting their constraints.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to obtain the code files.

Table of Contents

    1. Mastering Machine Learning with R
      1. Table of Contents
      2. Mastering Machine Learning with R
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      7. Preface
        1. Machine learning defined
        2. Machine learning caveats
        3. Failure to engineer features
        4. Overfitting and underfitting
        5. Causality
        6. What this book covers
        7. What you need for this book
        8. Who this book is for
        9. Conventions
        10. Reader feedback
        11. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book
          3. Errata
          4. Piracy
          5. eBooks, discount offers, and more
          6. Questions
      8. 1. A Process for Success
        1. The process
        2. Business understanding
          1. Identify the business objective
          2. Assess the situation
          3. Determine the analytical goals
          4. Produce a project plan
        3. Data understanding
        4. Data preparation
        5. Modeling
        6. Evaluation
        7. Deployment
        8. Algorithm flowchart
        9. Summary
      9. 2. Linear Regression – The Blocking and Tackling of Machine Learning
        1. Univariate linear regression
          1. Business understanding
        2. Multivariate linear regression
          1. Business understanding
          2. Data understanding and preparation
          3. Modeling and evaluation
        3. Other linear model considerations
          1. Qualitative feature
          2. Interaction term
        4. Summary
      10. 3. Logistic Regression and Discriminant Analysis
        1. Classification methods and linear regression
        2. Logistic regression
          1. Business understanding
          2. Data understanding and preparation
          3. Modeling and evaluation
            1. The logistic regression model
            2. Logistic regression with cross-validation
          4. Discriminant analysis overview
          5. Discriminant analysis application
        3. Model selection
        4. Summary
      11. 4. Advanced Feature Selection in Linear Models
        1. Regularization in a nutshell
          1. Ridge regression
          2. LASSO
          3. Elastic net
        2. Business case
          1. Business understanding
          2. Data understanding and preparation
        3. Modeling and evaluation
          1. Best subsets
          2. Ridge regression
          3. LASSO
          4. Elastic net
          5. Cross-validation with glmnet
        4. Model selection
        5. Summary
      12. 5. More Classification Techniques – K-Nearest Neighbors and Support Vector Machines
        1. K-Nearest Neighbors
        2. Support Vector Machines
        3. Business case
          1. Business understanding
          2. Data understanding and preparation
          3. Modeling and evaluation
            1. KNN modeling
            2. SVM modeling
          4. Model selection
        4. Feature selection for SVMs
        5. Summary
      13. 6. Classification and Regression Trees
        1. Introduction
        2. An overview of the techniques
          1. Regression trees
          2. Classification trees
          3. Random forest
          4. Gradient boosting
        3. Business case
          1. Modeling and evaluation
            1. Regression tree
            2. Classification tree
            3. Random forest regression
            4. Random forest classification
            5. Gradient boosting regression
            6. Gradient boosting classification
          2. Model selection
        4. Summary
      14. 7. Neural Networks
        1. Neural network
        2. Deep learning, a not-so-deep overview
        3. Business understanding
        4. Data understanding and preparation
        5. Modeling and evaluation
        6. An example of deep learning
          1. H2O background
          2. Data preparation and uploading it to H2O
          3. Create train and test datasets
          4. Modeling
        7. Summary
      15. 8. Cluster Analysis
        1. Hierarchical clustering
          1. Distance calculations
        2. K-means clustering
        3. Gower and partitioning around medoids
          1. Gower
          2. PAM
          3. Business understanding
        4. Data understanding and preparation
        5. Modeling and evaluation
          1. Hierarchical clustering
          2. K-means clustering
          3. Clustering with mixed data
        6. Summary
      16. 9. Principal Components Analysis
        1. An overview of the principal components
          1. Rotation
          2. Business understanding
          3. Data understanding and preparation
        2. Modeling and evaluation
          1. Component extraction
          2. Orthogonal rotation and interpretation
          3. Creating factor scores from the components
          4. Regression analysis
        3. Summary
      17. 10. Market Basket Analysis and Recommendation Engines
        1. An overview of a market basket analysis
        2. Business understanding
        3. Data understanding and preparation
        4. Modeling and evaluation
        5. An overview of a recommendation engine
          1. User-based collaborative filtering
          2. Item-based collaborative filtering
          3. Singular value decomposition and principal components analysis
        6. Business understanding and recommendations
        7. Data understanding, preparation, and recommendations
        8. Modeling, evaluation, and recommendations
        9. Summary
      18. 11. Time Series and Causality
        1. Univariate time series analysis
          1. Bivariate regression
          2. Granger causality
          3. Business understanding
          4. Data understanding and preparation
        2. Modeling and evaluation
          1. Univariate time series forecasting
          2. Time series regression
          3. Examining the causality
        3. Summary
      19. 12. Text Mining
        1. Text mining framework and methods
        2. Topic models
          1. Other quantitative analyses
          2. Business understanding
          3. Data understanding and preparation
        3. Modeling and evaluation
          1. Word frequency and topic models
          2. Additional quantitative analysis
        4. Summary
      20. A. R Fundamentals
        1. Introduction
        2. Getting R up and running
        3. Using R
        4. Data frames and matrices
        5. Summary stats
        6. Installing and loading the R packages
        7. Summary
      21. Index