R for Data Science Solutions

Video Description

Over 100 hands-on tasks to help you effectively solve real-world data problems using the most popular R packages and techniques

About This Video

  • Gain insight into how data scientists collect, process, analyze, and visualize data using some of the most popular R packages

  • Understand how to apply useful data analysis techniques in R for real-world applications

  • An easy-to-follow guide that makes a data scientist's life easier by addressing the problems faced while performing data analysis

In Detail

    R is both a data analysis environment and a programming language. Data scientists, statisticians, and analysts use R for statistical analysis, data visualization, and predictive modeling. R is open source and allows integration with other applications and systems. Compared with other data analysis platforms, R offers an extensive set of data products, and its data visualization features make problems in the data easier to spot.

    The first section in this course deals with how to create R functions to avoid the unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. An example of data manipulation is provided, illustrating how to use the ‘dplyr’ and ‘data.table’ packages to efficiently process larger data structures. We also focus on ‘ggplot2’ and show you how to create advanced figures for data exploration.

    In addition, you will learn how to build an interactive report using the ‘ggvis’ package. Later sections offer insight into time series analysis and cover machine learning in detail, including data classification, regression, clustering, association rule mining, and dimension reduction.

    By the end of this course, you will understand how to resolve issues and will be able to comfortably offer solutions to problems encountered while performing data analysis.

    Table of Contents

    1. Chapter 1 : Functions in R
      1. R Functions and Arguments 00:06:25
      2. Understanding Environments 00:02:59
      3. Working with Lexical Scoping 00:02:49
      4. Understanding Closure 00:02:17
      5. Performing Lazy Evaluation 00:01:56
      6. Creating Infix Operators 00:02:51
      7. Using the Replacement Function 00:02:17
      8. Handling Errors in a Function 00:04:31
      9. The Debugging Function 00:04:05
    2. Chapter 2 : Data Extracting, Transforming, and Loading
      1. Downloading Open Data 00:02:15
      2. Reading and Writing CSV Files 00:01:13
      3. Scanning Text Files 00:02:21
      4. Working with Excel Files 00:01:56
      5. Reading Data from Databases 00:04:04
      6. Scraping Web Data 00:05:17
    3. Chapter 3 : Data Pre-Processing and Preparation
      1. Renaming the Data Variable 00:02:27
      2. Converting Data Types 00:02:36
      3. Working with Date Format 00:02:55
      4. Adding New Records 00:02:09
      5. Filtering Data 00:03:29
      6. Dropping Data 00:01:42
      7. Merging and Sorting Data 00:04:00
      8. Reshaping Data 00:02:42
      9. Detecting Missing Data 00:03:15
      10. Imputing Missing Data 00:04:03
    4. Chapter 4 : Data Manipulation
      1. Enhancing a data.frame with a data.table 00:04:50
      2. Managing Data with data.table 00:04:14
      3. Performing Fast Aggregation with data.table 00:02:10
      4. Merging Large Datasets with a data.table 00:02:41
      5. Subsetting and Slicing Data with dplyr 00:02:09
      6. Sampling Data with dplyr 00:01:26
      7. Selecting Columns with dplyr 00:02:40
      8. Chaining Operations in dplyr 00:02:10
      9. Arranging Rows with dplyr 00:01:22
      10. Eliminating Duplicated Rows with dplyr 00:01:40
      11. Adding New Columns with dplyr 00:01:14
      12. Summarizing Data with dplyr 00:01:54
      13. Merging Data with dplyr 00:02:11
    5. Chapter 5 : Visualizing Data with ggplot2
      1. Creating Basic Plots with ggplot2 00:04:15
      2. Changing Aesthetics Mapping 00:03:09
      3. Introducing Geometric Objects 00:03:13
      4. Performing Transformations 00:03:27
      5. Adjusting Scales 00:02:16
      6. Faceting 00:02:07
      7. Adjusting Themes 00:01:33
      8. Combining Plots 00:02:04
      9. Creating Maps 00:04:39
    6. Chapter 6 : Making Interactive Reports
      1. Creating R Markdown Reports 00:02:47
      2. Learning the Markdown Syntax 00:03:14
      3. Embedding R Code Chunks 00:02:19
      4. Creating Interactive Graphics with ggvis 00:02:39
      5. Understanding Basic Syntax and Grammar 00:01:57
      6. Controlling Axes and Legends and Using Scales 00:02:55
      7. Adding Interactivity to a ggvis Plot 00:03:41
      8. Creating an R Shiny Document 00:02:16
      9. Publishing an R Shiny Report 00:02:29
    7. Chapter 7 : Simulation from Probability Distributions
      1. Generating Random Samples 00:02:52
      2. Understanding Uniform Distributions 00:01:39
      3. Generating Binomial Random Variates 00:02:30
      4. Generating Poisson Random Variates 00:02:06
      5. Sampling from a Normal Distribution 00:04:08
      6. Sampling from a Chi-Squared Distribution 00:02:00
      7. Understanding Student's t-Distribution 00:02:11
      8. Sampling from a Dataset 00:01:52
      9. Simulating the Stochastic Process 00:02:29
    8. Chapter 8 : Statistical Inference in R
      1. Getting Confidence Intervals 00:05:54
      2. Performing Z-tests 00:03:12
      3. Performing Student's t-Tests 00:02:15
      4. Conducting Exact Binomial Tests 00:02:09
      5. Performing Kolmogorov-Smirnov Tests 00:02:17
      6. Working with the Pearson's Chi-Squared Tests 00:01:40
      7. Understanding the Wilcoxon Rank Sum and Signed Rank Tests 00:01:48
      8. Conducting One-way ANOVA 00:02:39
      9. Performing Two-way ANOVA 00:03:02
    9. Chapter 9 : Rule and Pattern Mining with R
      1. Transforming Data into Transactions 00:05:12
      2. Displaying Transactions and Associations 00:03:03
      3. Mining Associations with the Apriori Rule 00:04:19
      4. Pruning Redundant Rules 00:02:15
      5. Visualizing Association Rules 00:02:36
      6. Mining Frequent Itemsets with Eclat 00:03:08
      7. Creating Transactions with Temporal Information 00:02:53
      8. Mining Frequent Sequential Patterns with cSPADE 00:02:42
    10. Chapter 10 : Time Series Mining with R
      1. Creating Time Series Data 00:05:12
      2. Plotting a Time Series Object 00:02:26
      3. Decomposing Time Series 00:02:11
      4. Smoothing Time Series 00:05:21
      5. Forecasting Time Series 00:02:31
      6. Selecting an ARIMA Model 00:03:19
      7. Creating an ARIMA Model 00:02:20
      8. Forecasting with an ARIMA Model 00:02:11
      9. Predicting Stock Prices with an ARIMA Model 00:04:24
    11. Chapter 11 : Supervised Machine Learning
      1. Fitting a Linear Regression Model with lm 00:05:35
      2. Summarizing Linear Model Fits 00:02:54
      3. Using Linear Regression to Predict Unknown Values 00:03:57
      4. Measuring the Performance of the Regression Model 00:03:23
      5. Performing a Multiple Regression Analysis 00:04:17
      6. Selecting the Best-Fitted Regression Model with Stepwise Regression 00:02:42
      7. Applying the Gaussian Model for Generalized Linear Regression 00:02:19
      8. Performing a Logistic Regression Analysis 00:04:31
      9. Building a Classification Model with Recursive Partitioning Trees 00:03:59
      10. Visualizing Recursive Partitioning Tree 00:02:14
      11. Measuring Model Performance with a Confusion Matrix 00:01:38
      12. Measuring Prediction Performance Using ROCR 00:03:46
    12. Chapter 12 : Unsupervised Machine Learning
      1. Clustering Data with Hierarchical Clustering 00:06:10
      2. Cutting Tree into Clusters 00:01:45
      3. Clustering Data with the k-means Method 00:02:09
      4. Clustering Data with the Density-Based Method 00:03:12
      5. Extracting Silhouette Information from Clustering 00:01:50
      6. Comparing Clustering Methods 00:02:12
      7. Recognizing Digits Using the Density-Based Clustering Method 00:01:52
      8. Grouping Similar Text Documents with k-means Clustering Method 00:02:15
      9. Performing Dimension Reduction with Principal Component Analysis (PCA) 00:02:51
      10. Determining the Number of Principal Components Using a Scree Plot 00:01:51
      11. Determining the Number of Principal Components Using the Kaiser Method 00:01:20
      12. Visualizing Multivariate Data Using a biplot 00:02:54