Live Online Training

Beginning R Programming

Jared Lander

This online training is your entry into R, where you will build a solid foundation in the first principles of the language. We dive right into basics such as assigning variables and reading data, then move on to data manipulation and plotting.

What you'll learn and how you can apply it

  • Assigning Variables
  • Reading Data
      • CSV
      • Excel
      • Databases
      • JSON
  • Writing Functions
  • Working with Strings
  • Data Manipulation with dplyr
      • select
      • filter
      • mutate
      • group_by
      • summarize
      • joins
  • Iterating Over Lists with purrr
  • Transforming Data with tidyr
  • Plotting with ggplot2
      • Scatterplots
      • Histograms
      • Violin Plots
      • Faceting

This training course is for you because...

  • You're a beginner data scientist
  • You're a recovering Excel jockey
  • You're new to R

Prerequisites

  • Some prior experience with R, but not much
  • Some familiarity with basic programming concepts
      • Variables
      • Functions

Materials, downloads, or Supplemental Content needed in advance:

  • R
  • RStudio
  • The following R packages
      • dplyr
      • tidyr
      • purrr
      • readr
      • readxl
      • odbc
      • RSQLite
      • ggplot2

Sample Diamonds Database from https://data.world/landeranalytics/diamonds

The files at https://data.world/landeranalytics/rforeveryone

Resources:

R for Everyone: Advanced Analytics and Graphics, Second Edition (book)

R Programming for Data Analysts (Learning Path)

Shiny R (video)

About your instructor

  • Jared P. Lander is the Chief Data Scientist of Lander Analytics, a data science and artificial intelligence consulting and training firm based in New York City; the organizer of the New York Open Statistical Programming Meetup (the world’s largest R meetup) and the New York R Conference; the author of R for Everyone; and an adjunct professor at Columbia University. With an M.A. in statistics from Columbia University and a B.S. in mathematics from Muhlenberg College, he has experience in both academic research and industry. Very active in the data community, Jared is a frequent speaker at conferences, universities, and meetups around the world. His writings on statistics can be found at jaredlander.com, and his work has been featured in publications such as Forbes and the Wall Street Journal.

Schedule

The time frames are only estimates and may vary depending on how the class progresses.

Day 1

Getting Started with R

Reading Data (20 min)

  • CSV
  • Excel
  • SQL Databases

Writing Functions (10 min)

  • Structure
  • Arguments
  • Return values

Group Manipulation with dplyr (60 min)

  • Piping
  • select
  • filter
  • slice
  • mutate
  • group_by
  • summarize
  • join
  • count

Functional Iteration with purrr (20 min)

  • map
  • map_int
  • map_dbl
  • map_chr
  • map_lgl

Reshaping Data with tidyr (20 min)

  • gather
  • spread

ggplot2 (60 min)

  • Scatterplot
  • Histogram
  • Colors
  • Shapes
  • Sizes
  • Violin Plots

Day 2

Modeling in R

T-tests (15 min)

Simple Linear Regression (20 min)

Dealing with Qualitative Inputs (10 min)

The Formula Interface (20 min)

Multiple Regression (30 min)

Visualizing Models (5 min)

  • coefplot

Generalized Linear Models (30 min)

  • Logistic Regression
  • Poisson Regression

Model Diagnostics (20 min)

  • MSE
  • AIC
  • BIC