Sams Teach Yourself R in 24 Hours

Book Description

In just 24 lessons of one hour or less, Sams Teach Yourself R in 24 Hours helps you learn all the R skills you need to solve a wide spectrum of real-world data analysis problems. You’ll master the entire data analysis workflow, learning to build code that’s efficient, reproducible, and suitable for sharing with others.


This book’s straightforward, step-by-step approach teaches you how to import, manipulate, summarize, model, and plot data with R; formalize your analytical code; and build powerful R packages using current best practices.

Practical, hands-on examples show you how to apply what you learn.
Quizzes and exercises help you test your knowledge and stretch your skills.

Learn How To

  • Install, configure, and explore the R environment, including RStudio

  • Use basic R syntax, objects, and packages

  • Create and manage data structures, including vectors, matrices, and arrays

  • Understand lists and data frames

  • Work with dates, times, and factors

  • Use common R functions, and learn to write your own

  • Import and export data and connect to databases and spreadsheets

  • Use the popular tidyr, dplyr, and data.table packages

  • Write more efficient R code with profiling, vectorization, and initialization

  • Plot data and extend your graphical capabilities with ggplot2 and Lattice graphics

  • Develop common types of models

  • Construct high-quality packages, both simple and complex

  • Write R classes: S3, S4, and Reference Classes

  • Use R to generate dynamic reports

  • Build web applications with Shiny

Register your book at informit.com/register for convenient access to updates and corrections as they become available.

This book’s source code can be found at http://www.mango-solutions.com/wp/teach-yourself-r-in-24-hours-book/.

Table of Contents

    1. About This E-Book
    2. Title Page
    3. Copyright Page
    4. Contents at a Glance
    5. Table of Contents
    6. Preface
      1. Who Should Read This Book?
        1. What Should You Expect from This Book?
        2. How Is This Book Organized?
      2. About the Sample Code
      3. Contacting the Authors
    7. About the Authors
    8. Dedications
    9. Acknowledgments
    10. We Want to Hear from You!
    11. Reader Services
    12. Hour 1. The R Community
      1. A Concise History of R
        1. The Birth of S
        2. The Birth of R
      2. The R Community
        1. Mailing Lists
        2. R Manuals
        3. Online Resources
        4. The R Consortium
        5. User Events
      3. R Development
        1. Versions of R
      4. Summary
      5. Q&A
      6. Workshop
        1. Quiz
        2. Answers
      7. Activities
    13. Hour 2. The R Environment
      1. Integrated Development Environments
        1. The R GUI
        2. The RStudio IDE
        3. Other Development Environments
      2. R Syntax
        1. The Console
        2. Scripting
      3. R Objects
        1. R Packages
        2. The Search Path
        3. Listing Objects
        4. The R Workspace
      4. Using R Packages
        1. Finding the Right Package
        2. Installing an R Package
        3. Loading an R Package
      5. Internal Help
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    14. Hour 3. Single-Mode Data Structures
      1. The R Data Types
        1. The mode Function
      2. Vectors, Matrices, and Arrays
      3. Vectors
        1. Creating Vectors
        2. Vector Attributes
        3. Subscripting Vectors
      4. Matrices
        1. Creating Matrices
        2. Matrix Attributes
        3. Subscripting Matrices
        4. Subscripting Matrices: Blanks, Positives, and Negatives
        5. Dropping Dimensions
        6. Subscripting Matrices: Logical Values
        7. Subscripting Matrices: Character Values
      5. Arrays
        1. Creating Arrays
        2. Array Attributes
        3. Subscripting Arrays
      6. Relationship Between Single-Mode Data Objects
      7. Summary
      8. Q&A
      9. Workshop
        1. Quiz
        2. Answers
      10. Activities
    15. Hour 4. Multi-Mode Data Structures
      1. Multi-Mode Structures
      2. Lists
        1. What Is a List?
        2. Creating an Empty List
        3. Creating a Non-Empty List
        4. Creating a List with Element Names
        5. Creating a List: A Summary
        6. List Attributes
        7. Subscripting Lists
        8. Subsetting the List
        9. Reference List Elements
        10. Adding List Elements
        11. A Summary of List Syntax
        12. Motivation for Lists
        13. Value
      3. Data Frames
        1. Creating a Data Frame
        2. Querying Data Frame Attributes
        3. Selecting Columns from the Data Frame
        4. Selecting Columns from the Data Frame
        5. Subscripting Columns
        6. Referencing as a Matrix
        7. Summary of Subscripting Data Frames
      4. Exploring Your Data
        1. The Top and Bottom of Your Data
        2. Viewing Your Data
        3. Summarizing Your Data
        4. Visualizing Your Data
      5. Summary
      6. Q&A
      7. Workshop
        1. Quiz
        2. Answers
      8. Activities
    16. Hour 5. Dates, Times, and Factors
      1. Working with Dates and Times
        1. Creating Date Objects
        2. Creating Objects That Include Times
        3. Manipulating Dates and Times
      2. The lubridate Package
      3. Working with Categorical Data
        1. Creating Factors
        2. Manipulating Factor Levels
        3. Creating Factors from Continuous Data
      4. Summary
      5. Q&A
      6. Workshop
        1. Quiz
        2. Answers
      7. Activities
    17. Hour 6. Common R Utility Functions
      1. Using R Functions
      2. Functions for Numeric Data
        1. Mathematical Functions and Operators
        2. Statistical Summary Functions
        3. Simulation and Statistical Distributions
      3. Logical Data
      4. Missing Data
      5. Character Data
        1. Simple Character Manipulation
        2. Searching and Replacing
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    18. Hour 7. Writing Functions: Part I
      1. The Motivation for Functions
        1. A Closer Look at an R Function
      2. Creating a Simple Function
        1. Naming a Function
        2. Defining Function Arguments
        3. Function Scoping Rules
        4. Return Objects
      3. The If/Else Structure
        1. A Simple R Example
        2. Nested Statements
        3. Using One Condition
        4. Multiple Test Values
        5. Summarizing to a Single Logical
        6. Switching with Logical Input
        7. Reversing Logical Values
        8. Mixing Conditions
        9. Control And/Or Statements
        10. Returning Early
        11. A Worked Example
      4. Summary
      5. Q&A
      6. Workshop
        1. Quiz
        2. Answers
      7. Activities
    19. Hour 8. Writing Functions: Part II
      1. Errors and Warnings
        1. Error Messages
        2. Warning Messages
      2. Checking Inputs
      3. The Ellipsis
        1. Using the Ellipsis
        2. Passing Graphical Parameters Using the Ellipsis
      4. Checking Multivalue Inputs
      5. Using Input Definition
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    20. Hour 9. Loops and Summaries
      1. Repetitive Tasks
        1. What Is a Loop?
        2. The for Function
        3. The while Function
      2. The “apply” Family of Functions
        1. The Set of “apply” Functions
      3. The apply Function
        1. The “Margin”
        2. A Simple apply Example
        3. Using Multiple Margins
        4. Using apply with Higher Dimension Structures
        5. Passing Extra Arguments to the “applied” Function
        6. Using apply with Our Own Functions
        7. Passing Extra Arguments to Our Functions
        8. Applying to Data Frames
      4. The lapply Function
        1. The split Function
        2. Splitting Data Frames
        3. Using lapply with Vectors
        4. The Order of “apply” Inputs
        5. Using lapply with Data Frames
      5. The sapply Function
        1. Returns from sapply
        2. Why Not Just Stick with sapply?
      6. The tapply Function
        1. Multiple Grouping Variables
        2. Multiple Returns
        3. Return Values from tapply
      7. Summary
      8. Q&A
      9. Workshop
        1. Quiz
        2. Answers
      10. Activities
    21. Hour 10. Importing and Exporting
      1. Working with Text Files
        1. Reading in Text Files
        2. Reading in CSV Files
        3. Exporting Text Files
        4. Faster Imports and Exports
        5. Efficient Data Storage
        6. Proprietary and Other Formats
      2. Relational Databases
        1. RODBC
        2. DBI
      3. Working with Microsoft Excel
        1. Connecting to R from Excel
      4. Summary
      5. Q&A
      6. Workshop
        1. Quiz
        2. Answers
      7. Activities
    22. Hour 11. Data Manipulation and Transformation
      1. Sorting
        1. Sorting Data Frames
        2. Descending Sorts
      2. Appending
      3. Merging
        1. A Merge Example
        2. Missing Data
      4. Duplicate Values
      5. Restructuring
        1. Restructuring with reshape
        2. Melting
        3. Casting
        4. Restructuring with tidyr
      6. Data Aggregation
        1. Using a “for” Loop
        2. Using an “apply” Function
        3. The aggregate Function
        4. Using aggregate with a Formula
        5. Using aggregate by Specifying Columns
        6. Calculating Differences from Baseline
      7. Summary
      8. Q&A
      9. Workshop
        1. Quiz
        2. Answers
      10. Activities
    23. Hour 12. Efficient Data Handling in R
      1. dplyr: A New Way of Handling Data
        1. Creating a dplyr (tbl_df) Object
        2. Sorting
        3. Subscripting
        4. Adding New Columns
        5. Merging
        6. Aggregation
        7. The Pipe Operator
      2. Efficient Data Handling with data.table
        1. Creating a data.table
        2. Setting a Key
        3. Subscripting
        4. Adding New Columns and Rows
        5. Merging
        6. Aggregation
        7. Too Large for data.table
      3. Summary
      4. Q&A
      5. Workshop
        1. Quiz
        2. Answers
      6. Activities
    24. Hour 13. Graphics
      1. Graphics Devices and Colors
        1. Devices
        2. Colors
      2. High-Level Graphics Functions
        1. Univariate Graphics
        2. The plot Function
        3. Aesthetics
      3. Low-Level Graphics Functions
        1. Points and Lines
        2. Text
        3. Legends
        4. Other Low-Level Functions
      4. Graphical Parameters
      5. Controlling the Layout
        1. Grid Layouts
        2. The layout Function
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    25. Hour 14. The ggplot2 Package for Graphics
      1. The Philosophy of ggplot2
      2. Quick Plots and Basic Control
        1. Using qplot
        2. Titles and Axes
        3. Working with Layers
        4. Plots as Objects
      3. Changing Plot Types
        1. Plot Types
        2. Combining Plot Types
      4. Aesthetics
        1. Control of Aesthetics
        2. Scales and the Legend
        3. Working with Grouped Data
      5. Paneling (a.k.a. Faceting)
        1. Using facet_grid
        2. Using facet_wrap
        3. Faceting from qplot
      6. Custom Plots
        1. Working with ggplot
        2. Coordinate Systems
      7. Themes and Layout
        1. Tweaking Individual Plots
        2. Global Themes
        3. Legend Layout
      8. The ggvis Evolution
      9. Summary
      10. Q&A
      11. Workshop
        1. Quiz
        2. Answers
      12. Activities
    26. Hour 15. Lattice Graphics
      1. The History of Trellis Graphics
      2. The Lattice Package
      3. Creating a Simple Lattice Graph
        1. Lattice Graph Types
        2. Plotting Subsets of Data
      4. Graph Options
        1. Titles and Axes
        2. Plot Types and Formatting
      5. Multiple Variables
      6. Groups of Data
      7. Using Panels
        1. Controlling the Strip Headers
        2. Multiple “By” Variables
        3. Panel Functions
      8. Controlling Styles
        1. Previewing the Styles
        2. Creating a Theme
        3. Using a Theme
      9. Summary
      10. Q&A
      11. Workshop
        1. Quiz
        2. Answers
      12. Activities
    27. Hour 16. Introduction to R Models and Object Orientation
      1. Statistical Models in R
      2. Simple Linear Models
        1. Fitting the Model
      3. Assessing a Model in R
        1. Model Summaries
        2. Model Diagnostic Plots
        3. Extracting Model Elements
        4. Models as List Objects
        5. Adding Model Lines to Plots
        6. Making Model Predictions
      4. Multiple Linear Regression
        1. Updating Models
        2. Comparing Nested Models
      5. Interaction Terms
        1. Assess Addition of Interaction Term
      6. Factor Independent Variables
        1. Including Factors
      7. Variable Transformations
      8. R and Object Orientation
        1. Object Orientation
        2. Linear Model Methods
      9. Summary
      10. Q&A
      11. Workshop
        1. Quiz
        2. Answers
      12. Activities
    28. Hour 17. Common R Models
      1. Generalized Linear Models
        1. GLM Definition
        2. Fitting a GLM
        3. Fitting Gaussian Models
        4. The glm Object
        5. Logistic Regression
        6. Poisson Regression
        7. GLM Extensions
      2. Nonlinear Models
        1. Nonlinear Regression
        2. Nonlinear Model Extensions
      3. Survival Analysis
        1. The ovarian Data Frame
        2. Censoring
        3. Estimating the Survival Function
        4. Proportional Hazards
        5. Survival Model Extensions
      4. Time Series Analysis
        1. Time Series Objects
        2. Decomposing Time Series
        3. Smoothing
        4. Autocorrelations
        5. Fitting ARIMA Models
      5. Summary
      6. Q&A
      7. Workshop
        1. Quiz
        2. Answers
      8. Activities
    29. Hour 18. Code Efficiency
      1. Determining Efficiency
        1. Profiling Code
        2. Benchmarking
      2. Initialization
      3. Vectorization
        1. What Is Vectorization?
        2. How Code Can Be Vectorized
      4. Using Alternative Functions
      5. Managing Memory Usage
      6. Integrating with C++
        1. When to Think about C++ and Rcpp
        2. A Basic Function
        3. Using R Functions in C++
      7. Summary
      8. Q&A
      9. Workshop
        1. Quiz
        2. Answers
      10. Activities
    30. Hour 19. Package Building
      1. Why Build an R Package?
      2. The Structure of an R Package
        1. Creating the Package Structure
        2. The DESCRIPTION File
        3. The NAMESPACE File
        4. The R Directory
        5. The man Directory
      3. Code Quality
      4. Automated Documentation with roxygen2
        1. Function Headers
        2. Documenting the Package
        3. Creating and Updating the Help Pages
      5. Building a Package with devtools
        1. Checking
        2. Building
        3. Installing
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    31. Hour 20. Advanced Package Building
      1. Extending R Packages
      2. Developing a Test Framework
        1. An Introduction to testthat
        2. Incorporating Tests into a Package
      3. Including Data in Packages
      4. Including a User Guide
        1. Including a Vignette in a Package
        2. Writing a Vignette
      5. Code Using Rcpp
      6. Summary
      7. Q&A
      8. Workshop
        1. Quiz
        2. Answers
      9. Activities
    32. Hour 21. Writing R Classes
      1. What Is a Class?
        1. Object Orientation in R
        2. Why Bother with Object Orientation?
        3. Why Use S3?
      2. Creating a New S3 Class
        1. A More Formal Approach to Creating Classes
      3. Generic Functions and Methods
        1. Defining Methods for Arithmetic Operators
        2. Lists vs. Attributes
        3. Creating New Generics
      4. Inheritance in S3
      5. Documenting S3
      6. Limitations of S3
      7. Summary
      8. Q&A
      9. Workshop
        1. Quiz
        2. Answers
      10. Activities
    33. Hour 22. Formal Class Systems
      1. S4
        1. Working with S4 Classes
        2. Defining an S4 Class
        3. Methods
        4. Defining New Generics
        5. Multiple Dispatch
        6. Inheritance
        7. Documenting S4
      2. Reference Classes
        1. Creating a New Reference Class
        2. Defining Methods
        3. Copying Reference Class Objects
        4. Documenting Reference Classes
      3. R6 Classes
        1. Public and Private Members
        2. An R6 Example
      4. Other Class Systems
      5. Summary
      6. Q&A
      7. Workshop
        1. Quiz
        2. Answers
      8. Activities
    34. Hour 23. Dynamic Reporting
      1. What Is Dynamic Reporting?
      2. An Introduction to knitr
      3. Simple Reports with RMarkdown
        1. A Basic RMarkdown Document
        2. Building an HTML File
        3. Including R Code and Output
      4. Reporting with LaTeX
        1. A Basic LaTeX Document
        2. Including Code in a LaTeX Document
      5. Summary
      6. Q&A
      7. Workshop
        1. Quiz
        2. Answers
      8. Activities
    35. Hour 24. Building Web Applications with Shiny
      1. A Simple Shiny Application
        1. Structure of a Shiny Application
        2. The ui Component
        3. The server Component
      2. Reactive Functions
        1. Why Do We Need Reactive Functions?
        2. Creating a Simple Reactive Function
      3. Interactive Documents
      4. Sharing Shiny Applications
      5. Summary
      6. Q&A
      7. Workshop
        1. Quiz
        2. Answers
      8. Activities
    36. Appendix: Installation
      1. Installing R
        1. Installing R on Windows
        2. Installing R on Mac OS X
        3. Installing R on Linux
      2. Installing Rtools for Windows
      3. Installing the RStudio IDE
    37. Index
    38. Code Snippets