R: Recipes for Analysis, Visualization and Machine Learning

Book Description

Get savvy with the R language and carry out projects aimed at analysis, visualization, and machine learning

About This Book

  • Proficiently analyze data and apply machine learning techniques

  • Generate visualizations and develop interactive visualizations and applications to understand the various data exploration functions in R

  • Construct a predictive model by using a variety of machine learning packages

Who This Book Is For

This Learning Path is ideal for those who have been exposed to R, but have not used it extensively yet. It covers the basics of using R and is written for new and intermediate R users interested in learning. This Learning Path also provides in-depth insights into professional techniques for analysis, visualization, and machine learning with R – it will help you increase your R expertise, regardless of your level of experience.

What You Will Learn

  • Get data into your R environment and prepare it for analysis

  • Perform exploratory data analyses and generate meaningful visualizations of the data

  • Generate various plots in R using the basic R plotting techniques

  • Create presentations and learn the basics of creating apps in R for your audience

  • Create and inspect a transaction dataset and perform association analysis with the Apriori algorithm

  • Visualize associations in various graph formats and find frequent itemsets using the ECLAT algorithm

  • Build, tune, and evaluate predictive models with different machine learning packages

  • Incorporate R and Hadoop to solve machine learning problems on big data

In Detail

The R language is a powerful, open-source, functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics. This Learning Path is chock-full of recipes. Literally! It aims to excite you with awesome projects focused on analysis, visualization, and machine learning. We’ll start off with data analysis, which shows you ways to use R to generate professional analysis reports. We’ll then move on to visualizing our data, which gives you all the guidance needed to get comfortable with data visualization in R. Finally, we’ll move into the world of machine learning, which introduces you to data classification, regression, clustering, association rule mining, and dimension reduction.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products:

  • R Data Analysis Cookbook by Viswa Viswanathan and Shanthi Viswanathan

  • R Data Visualization Cookbook by Atmajitsinh Gohil

  • Machine Learning with R Cookbook by Yu-Wei, Chiu (David Chiu)

Style and approach

This course creates a smooth learning path that will teach you how to analyze data and create stunning visualizations. The step-by-step instructions provided for each recipe in this comprehensive Learning Path will show you how to create machine learning projects with R.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

    1. R: Recipes for Analysis, Visualization and Machine Learning
      1. Table of Contents
      2. R: Recipes for Analysis, Visualization and Machine Learning
      3. R: Recipes for Analysis, Visualization and Machine Learning
      4. Credits
      5. Preface
        1. What this learning path covers
        2. What you need for this learning path
        3. Who this learning path is for
        4. Reader feedback
        5. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      6. 1. Module 1
        1. 1. A Simple Guide to R
          1. Installing packages and getting help in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          2. Data types in R
            1. How to do it…
          3. Special values in R
            1. How to do it…
            2. How it works…
          4. Matrices in R
            1. How to do it…
            2. How it works…
          5. Editing a matrix in R
            1. How to do it…
          6. Data frames in R
            1. How to do it…
          7. Editing a data frame in R
            1. How to do it...
          8. Importing data in R
            1. How to do it...
            2. How it works…
          9. Exporting data in R
            1. How to do it…
            2. How it works…
          10. Writing a function in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          11. Writing if else statements in R
            1. How to do it…
            2. How it works…
          12. Basic loops in R
            1. How to do it…
            2. How it works…
          13. Nested loops in R
            1. How to do it…
          14. The apply, lapply, sapply, and tapply functions
            1. How to do it…
            2. How it works…
          15. Using par to beautify a plot in R
            1. How to do it…
            2. How it works…
          16. Saving plots
            1. How to do it…
            2. How it works…
        2. 2. Practical Machine Learning with R
          1. Introduction
          2. Downloading and installing R
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Downloading and installing RStudio
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Installing and loading packages
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Reading and writing data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Using R to manipulate data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          7. Applying basic statistics
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          8. Visualizing data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Getting a dataset for machine learning
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        3. 3. Acquire and Prepare the Ingredients – Your Data
          1. Introduction
          2. Reading data from CSV files
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Handling different column delimiters
              2. Handling column headers/variable names
              3. Handling missing values
              4. Reading strings as characters and not as factors
              5. Reading data directly from a website
          3. Reading XML data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Extracting HTML table data from a web page
              2. Extracting a single HTML table from a web page
          4. Reading JSON data
            1. Getting ready
            2. How to do it...
            3. How it works...
          5. Reading data from fixed-width formatted files
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Files with headers
              2. Excluding columns from data
          6. Reading data from R files and R libraries
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. To save all objects in a session
              2. To selectively save objects in a session
              3. Attaching/detaching R data files to an environment
              4. Listing all datasets in loaded packages
          7. Removing cases with missing values
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Eliminating cases with NA for selected variables
              2. Finding cases that have no missing values
              3. Converting specific values to NA
              4. Excluding NA values from computations
          8. Replacing missing values with the mean
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Imputing random values sampled from nonmissing values
          9. Removing duplicate cases
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Identifying duplicates (without deleting them)
          10. Rescaling a variable to [0,1]
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Rescaling many variables at once
            5. See also…
          11. Normalizing or standardizing data in a data frame
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Standardizing several variables simultaneously
            5. See also…
          12. Binning numerical data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Creating a specified number of intervals automatically
          13. Creating dummies for categorical variables
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Choosing which variables to create dummies for
        4. 4. What's in There? – Exploratory Data Analysis
          1. Introduction
          2. Creating standard data summaries
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using the str() function for an overview of a data frame
              2. Computing the summary for a single variable
              3. Finding the mean and standard deviation
          3. Extracting a subset of a dataset
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Excluding columns
              2. Selecting based on multiple values
              3. Selecting using logical vector
          4. Splitting a dataset
            1. Getting ready
            2. How to do it...
            3. How it works...
          5. Creating random data partitions
            1. Getting ready
            2. How to do it…
              1. Case 1 – numerical target variable and two partitions
              2. Case 2 – numerical target variable and three partitions
              3. Case 3 – categorical target variable and two partitions
              4. Case 4 – categorical target variable and three partitions
            3. How it works...
            4. There's more...
              1. Using a convenience function for partitioning
              2. Sampling from a set of values
          6. Generating standard plots such as histograms, boxplots, and scatterplots
            1. Getting ready
            2. How to do it...
              1. Histograms
              2. Boxplots
              3. Scatterplots
              4. Scatterplot matrices
            3. How it works...
              1. Histograms
              2. Boxplots
            4. There's more...
              1. Overlay a density plot on a histogram
              2. Overlay a regression line on a scatterplot
              3. Color specific points on a scatterplot
          7. Generating multiple plots on a grid
            1. Getting ready
            2. How to do it...
            3. How it works...
              1. Graphics parameters
            4. See also…
          8. Selecting a graphics device
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also…
          9. Creating plots with the lattice package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Adding flair to your graphs
            5. See also…
          10. Creating plots with the ggplot2 package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Graph using qplot
              2. Condition plots on continuous numeric variables
            5. See also…
          11. Creating charts that facilitate comparisons
            1. Getting ready
            2. How to do it...
              1. Using base plotting system
              2. Using ggplot2
            3. How it works...
            4. There's more...
              1. Creating boxplots with ggplot2
            5. See also…
          12. Creating charts that help visualize a possible causality
            1. Getting ready
            2. How to do it...
            3. See also…
          13. Creating multivariate plots
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also…
        5. 5. Where Does It Belong? – Classification
          1. Introduction
          2. Generating error/classification-confusion matrices
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Visualizing the error/classification confusion matrix
              2. Comparing the model's performance for different classes
          3. Generating ROC charts
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more…
              1. Using arbitrary class labels
          4. Building, plotting, and evaluating – classification trees
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Computing raw probabilities
              2. Create the ROC Chart
            5. See also
          5. Using random forest models for classification
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Computing raw probabilities
              2. Generating the ROC chart
              3. Specifying cutoffs for classification
            5. See also...
          6. Classifying using Support Vector Machine
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Controlling scaling of variables
              2. Determining the type of SVM model
              3. Assigning weights to the classes
            5. See also...
          7. Classifying using the Naïve Bayes approach
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          8. Classifying using the KNN approach
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Automating the process of running KNN for many k values
              2. Using KNN to compute raw probabilities instead of classifications
          9. Using neural networks for classification
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Exercising greater control over nnet
              2. Generating raw probabilities
          10. Classifying using linear discriminant function analysis
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using the formula interface for lda
            5. See also...
          11. Classifying using logistic regression
            1. Getting ready
            2. How to do it...
            3. How it works...
          12. Using AdaBoost to combine classification tree models
            1. Getting ready
            2. How to do it...
            3. How it works...
        6. 6. Give Me a Number – Regression
          1. Introduction
          2. Computing the root mean squared error
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using a convenience function to compute the RMS error
          3. Building KNN models for regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Running KNN with cross-validation in place of validation partition
              2. Using a convenience function to run KNN
              3. Using a convenience function to run KNN for multiple k values
            5. See also...
          4. Performing linear regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Forcing lm to use a specific factor level as the reference
              2. Using other options in the formula expression for linear models
            5. See also...
          5. Performing variable selection in linear regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          6. Building regression trees
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more…
              1. Generating regression trees for data with categorical predictors
            5. See also...
          7. Building random forest models for regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Controlling forest generation
            5. See also...
          8. Using neural networks for regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          9. Performing k-fold cross-validation
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          10. Performing leave-one-out cross-validation to limit overfitting
            1. How to do it...
            2. How it works...
            3. See also...
        7. 7. Can You Simplify That? – Data Reduction Techniques
          1. Introduction
          2. Performing cluster analysis using K-means clustering
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Use a convenience function to choose a value for K
            5. See also...
          3. Performing cluster analysis using hierarchical clustering
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          4. Reducing dimensionality with principal component analysis
            1. Getting ready
            2. How to do it...
            3. How it works...
        8. 8. Lessons from History – Time Series Analysis
          1. Introduction
          2. Creating and examining date objects
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          3. Operating on date objects
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          4. Performing preliminary analyses on time series data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          5. Using time series objects
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          6. Decomposing time series
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          7. Filtering time series data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          8. Smoothing and forecasting using the Holt-Winters method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          9. Building an automated ARIMA model
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
        9. 9. It's All About Your Connections – Social Network Analysis
          1. Introduction
          2. Downloading social network data using public APIs
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          3. Creating adjacency matrices and edge lists
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also...
          4. Plotting social network data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Specifying plotting preferences
              2. Plotting directed graphs
              3. Creating a graph object with weights
              4. Extracting the network as an adjacency matrix from the graph object
              5. Extracting an adjacency matrix with weights
              6. Extracting edge list from graph object
              7. Creating bipartite network graph
              8. Generating projections of a bipartite network
            5. See also...
          5. Computing important network metrics
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Getting edge sequences
              2. Getting immediate and distant neighbors
              3. Adding vertices or nodes
              4. Adding edges
              5. Deleting isolates from a graph
              6. Creating subgraphs
        10. 10. Put Your Best Foot Forward – Document and Present Your Analysis
          1. Introduction
          2. Generating reports of your data analysis with R Markdown and knitr
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using the render function
              2. Adding output options
          3. Creating interactive web applications with shiny
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Adding images
              2. Adding HTML
              3. Adding tab sets
              4. Adding a dynamic UI
              5. Creating single file web application
          4. Creating PDF presentations of your analysis with R Presentation
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using hyperlinks
              2. Controlling the display
              3. Enhancing the look of the presentation
        11. 11. Work Smarter, Not Harder – Efficient and Elegant R Code
          1. Introduction
          2. Exploiting vectorized operations
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Processing entire rows or columns using the apply function
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Using apply on a three-dimensional array
          4. Applying a function to all elements of a collection with lapply and sapply
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Dynamic output
              2. One caution
          5. Applying functions to subsets of a vector
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Applying a function on groups from a data frame
          6. Using the split-apply-combine strategy with plyr
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Adding a new column using transform
              2. Using summarize along with the plyr function
              3. Concatenating the list of data frames into a big data frame
          7. Slicing, dicing, and combining data with data tables
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Adding multiple aggregated columns
              2. Counting groups
              3. Deleting a column
              4. Joining data tables
              5. Using symbols
        12. 12. Where in the World? – Geospatial Analysis
          1. Introduction
          2. Downloading and plotting a Google map of an area
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Saving the downloaded map as an image file
              2. Getting a satellite image
          3. Overlaying data on the downloaded Google map
            1. Getting ready
            2. How to do it...
            3. How it works...
          4. Importing ESRI shape files into R
            1. Getting ready
            2. How to do it...
            3. How it works...
          5. Using the sp package to plot geographic data
            1. Getting ready
            2. How to do it...
            3. How it works...
          6. Getting maps from the maps package
            1. Getting ready
            2. How to do it...
            3. How it works...
          7. Creating spatial data frames from regular data frames containing spatial and other data
            1. Getting ready
            2. How to do it...
            3. How it works...
          8. Creating spatial data frames by combining regular data frames with spatial objects
            1. Getting ready
            2. How to do it...
            3. How it works...
          9. Adding variables to an existing spatial data frame
            1. Getting ready
            2. How to do it...
            3. How it works...
        13. 13. Playing Nice – Connecting to Other Systems
          1. Introduction
          2. Using Java objects in R
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Checking JVM properties
              2. Displaying available methods
          3. Using JRI to call R functions from Java
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          4. Using Rserve to call R functions from Java
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Retrieving an array from R
          5. Executing R scripts from Java
            1. Getting ready
            2. How to do it...
            3. How it works...
          6. Using the xlsx package to connect to Excel
            1. Getting ready
            2. How to do it...
            3. How it works...
          7. Reading data from relational databases – MySQL
            1. Getting ready
            2. How to do it...
              1. Using RODBC
              2. Using RMySQL
              3. Using RJDBC
            3. How it works...
              1. Using RODBC
              2. Using RMySQL
              3. Using RJDBC
            4. There's more...
              1. Fetching all rows
              2. When the SQL query is long
          8. Reading data from NoSQL databases – MongoDB
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
              1. Validating your JSON
      7. 2. Module 2
        1. 1. Basic and Interactive Plots
          1. Introduction
          2. Introducing a scatter plot
            1. Getting ready
            2. How to do it…
            3. How it works…
          3. Scatter plots with texts, labels, and lines
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          4. Connecting points in a scatter plot
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          5. Generating an interactive scatter plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          6. A simple bar plot
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          7. An interactive bar plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          8. A simple line plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          9. Line plot to tell an effective story
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          10. Generating an interactive Gantt/timeline chart in R
            1. Getting ready
            2. How to do it…
            3. See also
          11. Merging histograms
            1. How to do it…
            2. How it works…
          12. Making an interactive bubble plot
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          13. Constructing a waterfall plot in R
            1. Getting ready
            2. How to do it…
        2. 2. Heat Maps and Dendrograms
          1. Introduction
          2. Constructing a simple dendrogram
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more...
            5. See also
          3. Creating dendrograms with colors and labels
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
          4. Creating a heat map
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          5. Generating a heat map with customized colors
            1. Getting ready
            2. How to do it…
            3. How it works…
          6. Generating an integrated dendrogram and a heat map
            1. How to do it…
            2. There's more…
            3. See also
          7. Creating a three-dimensional heat map and a stereo map
            1. Getting ready
            2. How to do it…
            3. See also
          8. Constructing a tree map in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
        3. 3. Maps
          1. Introduction
          2. Introducing regional maps
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          3. Introducing choropleth maps
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          4. A guide to contour maps
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          5. Constructing maps with bubbles
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          6. Integrating text with maps
            1. Getting ready
            2. How to do it…
            3. See also
          7. Introducing shapefiles
            1. Getting ready
            2. How to do it…
            3. See also
          8. Creating cartograms
            1. Getting ready
            2. How to do it…
            3. See also
        4. 4. The Pie Chart and Its Alternatives
          1. Introduction
          2. Generating a simple pie chart
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          3. Constructing pie charts with labels
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
          4. Creating donut plots and interactive plots
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          5. Generating a slope chart
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          6. Constructing a fan plot
            1. Getting ready
            2. How to do it…
            3. How it works…
        5. 5. Adding the Third Dimension
          1. Introduction
          2. Constructing a 3D scatter plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          3. Generating a 3D scatter plot with text
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          4. A simple 3D pie chart
            1. Getting ready
            2. How to do it…
            3. How it works…
          5. A simple 3D histogram
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
          6. Generating a 3D contour plot
            1. Getting ready
            2. How to do it…
            3. How it works…
          7. Integrating a 3D contour and a surface plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          8. Animating a 3D surface plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
        6. 6. Data in Higher Dimensions
          1. Introduction
          2. Constructing a sunflower plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          3. Creating a hexbin plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          4. Generating interactive calendar maps
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          5. Creating Chernoff faces in R
            1. Getting ready
            2. How to do it…
            3. How it works…
          6. Constructing a coxcomb plot in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          7. Constructing network plots
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          8. Constructing a radial plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          9. Generating a very basic pyramid plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
        7. 7. Visualizing Continuous Data
          1. Introduction
          2. Generating a candlestick plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          3. Generating interactive candlestick plots
            1. Getting ready
            2. How to do it…
            3. How it works…
          4. Generating a decomposed time series
            1. How to do it…
            2. How it works…
            3. There's more…
            4. See also
          5. Plotting a regression line
            1. How to do it…
            2. How it works…
            3. See also
          6. Constructing a box and whiskers plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          7. Generating a violin plot
            1. Getting ready
            2. How to do it…
          8. Generating a quantile-quantile plot (QQ plot)
            1. Getting ready
            2. How to do it…
            3. See also
          9. Generating a density plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          10. Generating a simple correlation plot
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
        8. 8. Visualizing Text and XKCD-style Plots
          1. Introduction
          2. Generating a word cloud
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          3. Constructing a word cloud from a document
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          4. Generating a comparison cloud
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          5. Constructing a correlation plot and a phrase tree
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          6. Generating plots with custom fonts
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          7. Generating an XKCD-style plot
            1. Getting ready
            2. How to do it…
            3. See also
        9. 9. Creating Applications in R
          1. Introduction
          2. Creating animated plots in R
            1. Getting ready
            2. How to do it…
            3. How it works…
          3. Creating a presentation in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. There's more…
            5. See also
          4. A basic introduction to API and XML
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          5. Constructing a bar plot using XML in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
          6. Creating a very simple shiny app in R
            1. Getting ready
            2. How to do it…
            3. How it works…
            4. See also
      8. 3. Module 3
        1. 1. Data Exploration with RMS Titanic
          1. Introduction
          2. Reading a Titanic dataset from a CSV file
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Converting types on character variables
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          4. Detecting missing values
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          5. Imputing missing values
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          6. Exploring and visualizing data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
            5. See also
          7. Predicting passenger survival with a decision tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          8. Validating the power of prediction with a confusion matrix
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          9. Assessing performance with the ROC curve
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        2. 2. R and Statistics
          1. Introduction
          2. Understanding data sampling in R
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Operating a probability distribution in R
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          4. Working with univariate descriptive statistics in R
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          5. Performing correlations and multivariate analysis
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Operating linear regression and multivariate analysis
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Conducting an exact binomial test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Performing Student's t-test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Performing the Kolmogorov-Smirnov test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Understanding the Wilcoxon Rank Sum and Signed Rank test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Working with Pearson's Chi-squared test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          12. Conducting a one-way ANOVA
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          13. Performing a two-way ANOVA
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        3. 3. Understanding Regression Analysis
          1. Introduction
          2. Fitting a linear regression model with lm
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Summarizing linear model fits
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Using linear regression to predict unknown values
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Generating a diagnostic plot of a fitted model
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          6. Fitting a polynomial regression model with lm
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          7. Fitting a robust linear regression model with rlm
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          8. Studying a case of linear regression on SLID data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Applying the Gaussian model for generalized linear regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Applying the Poisson model for generalized linear regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Applying the Binomial model for generalized linear regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          12. Fitting a generalized additive model to data
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          13. Visualizing a generalized additive model
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          14. Diagnosing a generalized additive model
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
        4. 4. Classification (I) – Tree, Lazy, and Probabilistic
          1. Introduction
          2. Preparing the training and testing datasets
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Building a classification model with recursive partitioning trees
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Visualizing a recursive partitioning tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Measuring the prediction performance of a recursive partitioning tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Pruning a recursive partitioning tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Building a classification model with a conditional inference tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Visualizing a conditional inference tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Measuring the prediction performance of a conditional inference tree
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Classifying data with the k-nearest neighbor classifier
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Classifying data with logistic regression
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          12. Classifying data with the Naïve Bayes classifier
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        5. 5. Classification (II) – Neural Network and SVM
          1. Introduction
          2. Classifying data with a support vector machine
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Choosing the cost of a support vector machine
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Visualizing an SVM fit
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Predicting labels based on a model trained by a support vector machine
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          6. Tuning a support vector machine
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Training a neural network with neuralnet
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Visualizing a neural network trained by neuralnet
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Predicting labels based on a model trained by neuralnet
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Training a neural network with nnet
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Predicting labels based on a model trained by nnet
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        6. 6. Model Evaluation
          1. Introduction
          2. Estimating model performance with k-fold cross-validation
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Performing cross-validation with the e1071 package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Performing cross-validation with the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Ranking the variable importance with the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          6. Ranking the variable importance with the rminer package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Finding highly correlated features with the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Selecting features using the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Measuring the performance of the regression model
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          10. Measuring prediction performance with a confusion matrix
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Measuring prediction performance using ROCR
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          12. Comparing an ROC curve using the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          13. Measuring performance differences between models with the caret package
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        7. 7. Ensemble Learning
          1. Introduction
          2. Classifying data with the bagging method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Performing cross-validation with the bagging method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Classifying data with the boosting method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          5. Performing cross-validation with the boosting method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Classifying data with gradient boosting
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          7. Calculating the margins of a classifier
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Calculating the error evolution of the ensemble method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Classifying data with random forest
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          10. Estimating the prediction errors of different classifiers
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        8. 8. Clustering
          1. Introduction
          2. Clustering data with hierarchical clustering
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          3. Cutting trees into clusters
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          4. Clustering data with the k-means method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Drawing a bivariate cluster plot
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          6. Comparing clustering methods
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Extracting silhouette information from clustering
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Obtaining the optimum number of clusters for k-means
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Clustering data with the density-based method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Clustering data with the model-based method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Visualizing a dissimilarity matrix
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          12. Validating clusters externally
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        9. 9. Association Analysis and Sequence Mining
          1. Introduction
          2. Transforming data into transactions
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Displaying transactions and associations
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Mining associations with the Apriori rule
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Pruning redundant rules
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Visualizing association rules
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Mining frequent itemsets with Eclat
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Creating transactions with temporal information
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Mining frequent sequential patterns with cSPADE
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        10. 10. Dimension Reduction
          1. Introduction
          2. Performing feature selection with FSelector
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Performing dimension reduction with PCA
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          4. Determining the number of principal components using the scree test
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          5. Determining the number of principal components using the Kaiser method
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Visualizing multivariate data using biplot
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          7. Performing dimension reduction with MDS
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          8. Reducing dimensions with SVD
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Compressing images with SVD
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Performing nonlinear dimension reduction with ISOMAP
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. There's more...
          11. Performing nonlinear dimension reduction with Local Linear Embedding
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        11. 11. Big Data Analysis (R and Hadoop)
          1. Introduction
          2. Preparing the RHadoop environment
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          3. Installing rmr2
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          4. Installing rhdfs
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          5. Operating HDFS with rhdfs
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          6. Implementing a word count problem with RHadoop
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          7. Comparing the performance between an R MapReduce program and a standard R program
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          8. Testing and debugging the rmr2 program
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          9. Installing plyrmr
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          10. Manipulating data with plyrmr
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          11. Conducting machine learning with RHadoop
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
          12. Configuring RHadoop clusters on Amazon EMR
            1. Getting ready
            2. How to do it...
            3. How it works...
            4. See also
        12. A. Resources for R and Machine Learning
        13. B. Dataset – Survival of Passengers on the Titanic
      9. A. Bibliography
      10. Index