This video series covers Exploratory Data Analysis (EDA) using R. It is intended for beginners who have basic programming knowledge in any language and want to learn R for data analysis. The series starts with setting up the software environment and understanding the basic syntax of R, and then progresses to importing, wrangling, and analyzing data with packages such as dplyr and tidyr. We explore EDA through univariate, bivariate, and multivariate analysis, and create and interpret charts using the ggplot2 package. There are ten clips in this series:
- Introducing Exploratory Data Analysis. This clip explains what exploratory data analysis is and gives an overview of the series.
- Installing R and RStudio. This clip covers installing and running the R language and the RStudio integrated development environment (IDE) on a Windows PC, as well as using the different panes and tabs in RStudio.
- Writing R Scripts and Markdown Files. This clip gives an overview of R scripts and R Markdown Notebooks, including writing markdown text, writing and running code in an R Markdown Notebook, and knitting a Notebook to an .html file.
- Mastering R Syntax. This clip gives an overview of R syntax, including the use of strings, integers, and variables. We apply built-in functions such as paste, sum, diff, and mean, and discuss for and while loops, if and else statements, and the %in% operator. We conclude by writing our own functions in R.
- Using Vectors, Matrices, and Factors. This clip discusses three important data structures in R: vectors, matrices, and factors. For vectors, we discuss creating, altering, and querying them, and applying math functions to them. For matrices, we discuss creating them, changing their dimensions, and querying and slicing them. For factors, we discuss creating nominal and ordinal factors, changing their levels and other attributes, and querying them.
- Creating Data Frames in R. We cover creating data frames from vectors, as well as importing data from external .csv files. Then we discuss querying, slicing, merging, and subsetting data frames, as well as adding and removing rows and columns.
- Analyzing In-built Data Sets. This clip covers analyzing R’s built-in datasets. We explore loading a built-in dataset and inspecting its dimensions, structure, and other attributes. We also cover the basics of querying rows, columns, and values from a data frame.
- Analyzing External Data with Dplyr. This clip covers cleaning and summarizing data from external files. We introduce the dplyr package for data wrangling, explore its filter, select, arrange, summarize, and mutate functions, and explain the syntax for chaining them together.
- Leveraging Tidyr. This clip introduces the tidyr package, which helps reshape and tidy data frames. We cover its unite, separate, spread, and gather functions.
- Analyzing External Data with Ggplot2. This clip applies the techniques from the earlier clips to an exploratory analysis of a real external dataset. We introduce the ggplot2 package for creating charts, and the grid and gridExtra packages for arranging charts in grids. We create an R Markdown Notebook containing univariate, bivariate, and multivariate analyses of the data, and then knit the notebook to an .html file.