Learning Path: R Programming

Video Description

Learn and master the industry-standard language for data science

In Detail

Data volumes keep growing, and processing that data and making sense of it has become an urgent need. Analysts and statisticians are the ones who get this job done. Processing data tactfully and efficiently is an art, but an art becomes a reality only with the right tools and the knowledge to use them well. So it is with data science. R is a powerful language that provides all the tools required to build probabilistic models, perform data science, and build machine learning algorithms. If you're a data scientist looking to explore this language, this Learning Path is for you. You'll be introduced to RStudio and the basics of R. Then, you'll be taken through a number of topics such as handling dates with the lubridate package, handling strings with the stringr package, and making statistical inferences. Finally, the focus will be on machine learning concepts in depth and applying them in the real world with R.

Prerequisites: This is for absolute beginners. No prior knowledge of R is required.

Resources: Code downloads and errata:

  • Introduction to R Programming

  • Mastering R Programming

Path Products

This path navigates across the following products (in sequential order):

  • Introduction to R Programming (3h 46m)

  • Mastering R Programming (5h 12m)

Table of Contents

    1. Chapter 1: Introduction to R Programming
      1. The Course Overview 00:04:54
      2. Installing R 00:03:46
      3. Installing RStudio 00:04:36
      4. Installing Packages 00:04:50
      5. Data Types and Data Structures 00:03:05
      6. Vectors 00:05:44
      7. Random Numbers, Rounding, and Binning 00:04:00
      8. Missing Values 00:02:47
      9. The which() Operator 00:03:11
      10. Lists 00:04:35
      11. Set Operations 00:02:09
      12. Sampling and Sorting 00:02:52
      13. Check Conditions 00:02:17
      14. For Loops 00:02:34
      15. Dataframes 00:08:30
      16. Importing and Exporting Data 00:06:30
      17. Matrices and Frequency Tables 00:03:41
      18. Merging Dataframes 00:02:26
      19. Aggregation 00:02:48
      20. Melting and Cross Tabulations with dcast() 00:03:58
      21. Dates 00:05:35
      22. String Manipulation 00:05:14
      23. Functions 00:05:34
      24. Debugging and Error Handling 00:04:30
      25. Fast Loops with apply() 00:04:27
      26. Fast Loops with sapply(), lapply() and vapply() 00:02:00
      27. Creating and Customizing an R Plot 00:07:03
      28. Drawing Plots with 2 Y Axes 00:02:23
      29. Multiplots and Custom Layouts 00:03:08
      30. Creating Basic Graph Types 00:04:47
      31. Univariate Analysis 00:06:16
      32. Normal Distribution, Central Limit Theorem, and Confidence Intervals 00:05:32
      33. Correlation and Covariance 00:03:03
      34. Chi-sq Statistic 00:04:42
      35. ANOVA 00:04:54
      36. Statistical Tests 00:05:14
      37. Project 1 – Data Munging and Summarizing 00:11:31
      38. Project 2 – Visualization with Base Graphics 00:05:42
      39. Project 3 – Statistical Inference 00:03:50
      40. Pipes with Magrittr 00:05:21
      41. The 7 Data Manipulation Verbs 00:05:19
      42. Aggregation and Special Functions 00:03:36
      43. Two Table Verbs 00:02:43
      44. Working With Databases 00:05:30
      45. Understanding Basics, Filter, and Select 00:07:34
      46. Understanding Syntax, Creating and Updating Columns 00:04:06
      47. Aggregating Data, .N, and .I 00:04:21
      48. data.table 00:04:17
      49. Fast Loops with set(), Keys, and Joins 00:09:13
    2. Chapter 2: Mastering R Programming
      1. The Course Overview 00:07:45
      2. Performing Univariate Analysis 00:05:22
      3. Bivariate Analysis – Correlation, Chi-Sq Test, and ANOVA 00:05:43
      4. Detecting and Treating Outliers 00:03:21
      5. Treating Missing Values with `mice` 00:03:59
      6. Building Linear Regressors 00:07:35
      7. Interpreting Regression Results and Interaction Terms 00:05:19
      8. Performing Residual Analysis and Extracting Extreme Observations With Cook's Distance 00:03:25
      9. Extracting Better Models with Best Subsets, Stepwise Regression, and ANOVA 00:04:39
      10. Validating Model Performance on New Data with k-Fold Cross Validation 00:02:29
      11. Building Non-Linear Regressors with Splines and GAMs 00:05:20
      12. Building Logistic Regressors, Evaluation Metrics, and ROC Curve 00:12:38
      13. Understanding the Concept and Building Naive Bayes Classifier 00:09:24
      14. Building k-Nearest Neighbors Classifier 00:07:01
      15. Building Tree Based Models Using RPart, cTree, and C5.0 00:06:33
      16. Building Predictive Models with the caret Package 00:08:11
      17. Selecting Important Features with RFE, varImp, and Boruta 00:05:19
      18. Building Classifiers with Support Vector Machines 00:08:04
      19. Understanding Bagging and Building Random Forest Classifier 00:05:07
      20. Implementing Stochastic Gradient Boosting with GBM 00:05:18
      21. Regularization with Ridge, Lasso, and Elasticnet 00:08:53
      22. Building Classifiers and Regressors with XGBoost 00:10:10
      23. Dimensionality Reduction with Principal Component Analysis 00:05:05
      24. Clustering with k-means and Principal Components 00:03:16
      25. Determining Optimum Number of Clusters 00:05:25
      26. Understanding and Implementing Hierarchical Clustering 00:02:36
      27. Clustering with Affinity Propagation 00:05:25
      28. Building Recommendation Engines 00:09:01
      29. Understanding the Components of a Time Series, and the xts Package 00:05:42
      30. Stationarity, De-Trend, and De-Seasonalize 00:04:07
      31. Understanding the Significance of Lags, ACF, PACF, and CCF 00:03:49
      32. Forecasting with Moving Average and Exponential Smoothing 00:02:25
      33. Forecasting with Double Exponential and Holt Winters 00:03:23
      34. Forecasting with ARIMA Modelling 00:05:26
      35. Scraping Web Pages and Processing Texts 00:09:24
      36. Corpus, TDM, TF-IDF, and Word Cloud 00:09:07
      37. Cosine Similarity and Latent Semantic Analysis 00:07:20
      38. Extracting Topics with Latent Dirichlet Allocation 00:05:07
      39. Sentiment Scoring with tidytext and Syuzhet 00:04:23
      40. Classifying Texts with RTextTools 00:03:57
      41. Building a Basic ggplot2 and Customizing the Aesthetics and Themes 00:07:18
      42. Manipulating Legend, Adding Text, and Annotation 00:03:31
      43. Drawing Multiple Plots with Faceting and Changing Layouts 00:03:18
      44. Creating Bar Charts, Boxplots, Time Series, and Ribbon Plots 00:05:25
      45. ggplot2 Extensions and ggplotly 00:03:11
      46. Implementing Best Practices to Speed Up R Code 00:05:47
      47. Implementing Parallel Computing with doParallel and foreach 00:04:22
      48. Writing Readable and Fast R Code with Pipes and dplyr 00:05:40
      49. Writing Super Fast R Code with Minimal Keystrokes Using data.table 00:06:38
      50. Interfacing C++ in R with Rcpp 00:11:09
      51. Understanding the Structure of an R Package 00:05:02
      52. Build, Document, and Host an R Package on GitHub 00:07:10
      53. Performing Important Checks Before Submitting to CRAN 00:04:06
      54. Submitting an R Package to CRAN 00:03:11