Advanced R

Book Description

An Essential Reference for Intermediate and Advanced R Programmers

Advanced R presents useful tools and techniques for attacking many types of R programming problems, helping you avoid mistakes and dead ends. With more than ten years of experience programming in R, the author illustrates the elegance, beauty, and flexibility at the heart of R.

The book develops the necessary skills to produce quality code that can be used in a variety of circumstances. You will learn:

  • The fundamentals of R, including standard data types and functions
  • Functional programming as a useful framework for solving wide classes of problems
  • The positives and negatives of metaprogramming
  • How to write fast, memory-efficient code

This book not only helps current R users become R programmers but also shows existing programmers what’s special about R. Intermediate R programmers can dive deeper into R and learn new strategies for solving diverse problems, while programmers coming from other languages can learn the details of R and understand why R works the way it does.

Table of Contents

  1. Preliminaries
  2. Dedication
  3. Chapman & Hall/CRC: The R Series
  4. Chapter 1 - Introduction
    1. 1.1 Who should read this book
    2. 1.2 What you will get out of this book
    3. 1.3 Meta-techniques
    4. 1.4 Recommended reading
    5. 1.5 Getting help
    6. 1.6 Acknowledgments
    7. 1.7 Conventions
    8. 1.8 Colophon
  5. Section I - Foundations
    1. Chapter 2 - Data structures
        1. Quiz
        2. Outline
      1. 2.1 Vectors
        1. 2.1.1 Atomic vectors
          1. 2.1.1.1 Types and tests
          2. 2.1.1.2 Coercion
        2. 2.1.2 Lists
        3. 2.1.3 Exercises
      2. 2.2 Attributes
          1. 2.2.0.1 Names
        1. 2.2.1 Factors
        2. 2.2.2 Exercises
      3. 2.3 Matrices and arrays
        1. 2.3.1 Exercises
      4. 2.4 Data frames
        1. 2.4.1 Creation
        2. 2.4.2 Testing and coercion
        3. 2.4.3 Combining data frames
        4. 2.4.4 Special columns
        5. 2.4.5 Exercises
      5. 2.5 Answers
    2. Chapter 3 - Subsetting
          1. Quiz
          2. Outline
      1. 3.1 Data types
        1. 3.1.1 Atomic vectors
        2. 3.1.2 Lists
        3. 3.1.3 Matrices and arrays
        4. 3.1.4 Data frames
        5. 3.1.5 S3 objects
        6. 3.1.6 S4 objects
        7. 3.1.7 Exercises
      2. 3.2 Subsetting operators
        1. 3.2.1 Simplifying vs. preserving subsetting
        2. 3.2.2 $
        3. 3.2.3 Missing/out of bounds indices
        4. 3.2.4 Exercises
      3. 3.3 Subsetting and assignment
      4. 3.4 Applications
        1. 3.4.1 Lookup tables (character subsetting)
        2. 3.4.2 Matching and merging by hand (integer subsetting)
        3. 3.4.3 Random samples/bootstrap (integer subsetting)
        4. 3.4.4 Ordering (integer subsetting)
        5. 3.4.5 Expanding aggregated counts (integer subsetting)
        6. 3.4.6 Removing columns from data frames (character subsetting)
        7. 3.4.7 Selecting rows based on a condition (logical subsetting)
        8. 3.4.8 Boolean algebra vs. sets (logical & integer subsetting)
        9. 3.4.9 Exercises
      5. 3.5 Answers
    3. Chapter 4 - Vocabulary
      1. 4.1 The basics
      2. 4.2 Common data structures
      3. 4.3 Statistics
      4. 4.4 Working with R
      5. 4.5 I/O
    4. Chapter 5 - Style guide
      1. 5.1 Notation and naming
        1. 5.1.1 File names
        2. 5.1.2 Object names
      2. 5.2 Syntax
        1. 5.2.1 Spacing
        2. 5.2.2 Curly braces
        3. 5.2.3 Line length
        4. 5.2.4 Indentation
        5. 5.2.5 Assignment
      3. 5.3 Organisation
        1. 5.3.1 Commenting guidelines
    5. Chapter 6 - Functions
          1. Quiz
          2. Outline
          3. Prerequisites
      1. 6.1 Function components
        1. 6.1.1 Primitive functions
        2. 6.1.2 Exercises
      2. 6.2 Lexical scoping
        1. 6.2.1 Name masking
        2. 6.2.2 Functions vs. variables
        3. 6.2.3 A fresh start
        4. 6.2.4 Dynamic lookup
        5. 6.2.5 Exercises
      3. 6.3 Every operation is a function call
      4. 6.4 Function arguments
        1. 6.4.1 Calling functions
        2. 6.4.2 Calling a function given a list of arguments
        3. 6.4.3 Default and missing arguments
        4. 6.4.4 Lazy evaluation
        5. 6.4.5 ...
        6. 6.4.6 Exercises
      5. 6.5 Special calls
        1. 6.5.1 Infix functions
        2. 6.5.2 Replacement functions
        3. 6.5.3 Exercises
      6. 6.6 Return values
        1. 6.6.1 On exit
        2. 6.6.2 Exercises
      7. 6.7 Quiz answers
    6. Chapter 7 - OO field guide
          1. Prerequisites
          2. Quiz
          3. Outline
      1. 7.1 Base types
      2. 7.2 S3
        1. 7.2.1 Recognising objects, generic functions, and methods
        2. 7.2.2 Defining classes and creating objects
        3. 7.2.3 Creating new methods and generics
        4. 7.2.4 Method dispatch
        5. 7.2.5 Exercises
      3. 7.3 S4
        1. 7.3.1 Recognising objects, generic functions, and methods
        2. 7.3.2 Defining classes and creating objects
        3. 7.3.3 Creating new methods and generics
        4. 7.3.4 Method dispatch
        5. 7.3.5 Exercises
      4. 7.4 RC
        1. 7.4.1 Defining classes and creating objects
        2. 7.4.2 Recognising objects and methods
        3. 7.4.3 Method dispatch
        4. 7.4.4 Exercises
      5. 7.5 Picking a system
      6. 7.6 Quiz answers
    7. Chapter 8 - Environments
          1. Quiz
          2. Outline
          3. Prerequisites
      1. 8.1 Environment basics
        1. 8.1.1 Exercises
      2. 8.2 Recursing over environments
        1. 8.2.1 Exercises
      3. 8.3 Function environments
        1. 8.3.1 The enclosing environment
        2. 8.3.2 Binding environments
        3. 8.3.3 Execution environments
        4. 8.3.4 Calling environments
        5. 8.3.5 Exercises
      4. 8.4 Binding names to values
        1. 8.4.1 Exercises
      5. 8.5 Explicit environments
        1. 8.5.1 Avoiding copies
        2. 8.5.2 Package state
        3. 8.5.3 As a hashmap
      6. 8.6 Quiz answers
    8. Chapter 9 - Debugging, condition handling, and defensive programming
          1. Quiz
          2. Outline
      1. 9.1 Debugging techniques
      2. 9.2 Debugging tools
        1. 9.2.1 Determining the sequence of calls
        2. 9.2.2 Browsing on error
        3. 9.2.3 Browsing arbitrary code
        4. 9.2.4 The call stack: traceback(), where, and recover()
        5. 9.2.5 Other types of failure
      3. 9.3 Condition handling
        1. 9.3.1 Ignore errors with try
        2. 9.3.2 Handle conditions with tryCatch()
        3. 9.3.3 withCallingHandlers()
        4. 9.3.4 Custom signal classes
        5. 9.3.5 Exercises
      4. 9.4 Defensive programming
        1. 9.4.1 Exercises
      5. 9.5 Quiz answers
  6. Section II - Functional programming
    1. Chapter 10 - Functional programming
          1. Outline
          2. Prerequisites
      1. 10.1 Motivation
      2. 10.2 Anonymous functions
        1. 10.2.1 Exercises
      3. 10.3 Closures
        1. 10.3.1 Function factories
        2. 10.3.2 Mutable state
        3. 10.3.3 Exercises
      4. 10.4 Lists of functions
        1. 10.4.1 Moving lists of functions to the global environment
        2. 10.4.2 Exercises
      5. 10.5 Case study: numerical integration
        1. 10.5.1 Exercises
    2. Chapter 11 - Functionals
          1. Outline
          2. Prerequisites
      1. 11.1 My first functional: lapply()
        1. 11.1.1 Looping patterns
        2. 11.1.2 Exercises
      2. 11.2 For loop functionals: friends of lapply()
        1. 11.2.1 Vector output: sapply and vapply
        2. 11.2.2 Multiple inputs: Map (and mapply)
        3. 11.2.3 Rolling computations
        4. 11.2.4 Parallelisation
        5. 11.2.5 Exercises
      3. 11.3 Manipulating matrices and data frames
        1. 11.3.1 Matrix and array operations
        2. 11.3.2 Group apply
        3. 11.3.3 The plyr package
        4. 11.3.4 Exercises
      4. 11.4 Manipulating lists
        1. 11.4.1 Reduce()
        2. 11.4.2 Predicate functionals
        3. 11.4.3 Exercises
      5. 11.5 Mathematical functionals
        1. 11.5.1 Exercises
      6. 11.6 Loops that should be left as is
        1. 11.6.1 Modifying in place
        2. 11.6.2 Recursive relationships
        3. 11.6.3 While loops
      7. 11.7 A family of functions
        1. 11.7.1 Exercises
    3. Chapter 12 - Function operators
          1. Outline
          2. Prerequisites
      1. 12.1 Behavioural FOs
        1. 12.1.1 Memoisation
        2. 12.1.2 Capturing function invocations
        3. 12.1.3 Laziness
        4. 12.1.4 Exercises
      2. 12.2 Output FOs
        1. 12.2.1 Minor modifications
        2. 12.2.2 Changing what a function does
        3. 12.2.3 Exercises
      3. 12.3 Input FOs
        1. 12.3.1 Prefilling function arguments: partial function application
        2. 12.3.2 Changing input types
        3. 12.3.3 Exercises
      4. 12.4 Combining FOs
        1. 12.4.1 Function composition
        2. 12.4.2 Logical predicates and boolean algebra
        3. 12.4.3 Exercises
  7. Section III - Computing on the language
    1. Chapter 13 - Non-standard evaluation
          1. Outline
          2. Prerequisites
      1. 13.1 Capturing expressions
        1. 13.1.1 Exercises
      2. 13.2 Non-standard evaluation in subset
        1. 13.2.1 Exercises
      3. 13.3 Scoping issues
        1. 13.3.1 Exercises
      4. 13.4 Calling from another function
        1. 13.4.1 Exercises
      5. 13.5 Substitute
        1. 13.5.1 Adding an escape hatch to substitute
        2. 13.5.2 Capturing unevaluated ...
        3. 13.5.3 Exercises
      6. 13.6 The downsides of non-standard evaluation
        1. 13.6.1 Exercises
    2. Chapter 14 - Expressions
          1. Outline
          2. Prerequisites
      1. 14.1 Structure of expressions
        1. 14.1.1 Exercises
      2. 14.2 Names
        1. 14.2.1 Exercises
      3. 14.3 Calls
        1. 14.3.1 Modifying a call
        2. 14.3.2 Creating a call from its components
        3. 14.3.3 Exercises
      4. 14.4 Capturing the current call
        1. 14.4.1 Exercises
      5. 14.5 Pairlists
        1. 14.5.1 Exercises
      6. 14.6 Parsing and deparsing
        1. 14.6.1 Exercises
      7. 14.7 Walking the AST with recursive functions
        1. 14.7.1 Finding F and T
        2. 14.7.2 Finding all variables created by assignment
        3. 14.7.3 Modifying the call tree
        4. 14.7.4 Exercises
    3. Chapter 15 - Domain specific languages
          1. Prerequisites
      1. 15.1 HTML
        1. 15.1.1 Goal
        2. 15.1.2 Escaping
        3. 15.1.3 Basic tag functions
        4. 15.1.4 Tag functions
        5. 15.1.5 Processing all tags
        6. 15.1.6 Exercises
      2. 15.2 LaTeX
        1. 15.2.1 LaTeX mathematics
        2. 15.2.2 Goal
        3. 15.2.3 to_math
        4. 15.2.4 Known symbols
        5. 15.2.5 Unknown symbols
        6. 15.2.6 Known functions
        7. 15.2.7 Unknown functions
        8. 15.2.8 Exercises
  8. Section IV - Performance
    1. Chapter 16 - Performance
      1. 16.1 Why is R slow?
      2. 16.2 Microbenchmarking
        1. 16.2.1 Exercises
      3. 16.3 Language performance
        1. 16.3.1 Extreme dynamism
        2. 16.3.2 Name lookup with mutable environments
        3. 16.3.3 Lazy evaluation overhead
        4. 16.3.4 Exercises
      4. 16.4 Implementation performance
        1. 16.4.1 Extracting a single value from a data frame
        2. 16.4.2 ifelse(), pmin(), and pmax()
        3. 16.4.3 Exercises
      5. 16.5 Alternative R implementations
    2. Chapter 17 - Optimising code
          1. Outline
          2. Prerequisites
      1. 17.1 Measuring performance
        1. 17.1.1 Limitations
      2. 17.2 Improving performance
      3. 17.3 Code organisation
      4. 17.4 Has someone already solved the problem?
        1. 17.4.1 Exercises
      5. 17.5 Do as little as possible
        1. 17.5.1 Exercises
      6. 17.6 Vectorise
        1. 17.6.1 Exercises
      7. 17.7 Avoid copies
      8. 17.8 Byte code compilation
      9. 17.9 Case study: t-test
      10. 17.10 Parallelise
      11. 17.11 Other techniques
    3. Chapter 18 - Memory
          1. Outline
          2. Prerequisites
          3. Sources
      1. 18.1 Object size
        1. 18.1.1 Exercises
      2. 18.2 Memory usage and garbage collection
      3. 18.3 Memory profiling with lineprof
        1. 18.3.1 Exercises
      4. 18.4 Modification in place
        1. 18.4.1 Loops
        2. 18.4.2 Exercises
    4. Chapter 19 - High performance functions with Rcpp
          1. Outline
      1. 19.1 Getting started with C++
        1. 19.1.1 No inputs, scalar output
        2. 19.1.2 Scalar input, scalar output
        3. 19.1.3 Vector input, scalar output
        4. 19.1.4 Vector input, vector output
        5. 19.1.5 Matrix input, vector output
        6. 19.1.6 Using sourceCpp
        7. 19.1.7 Exercises
      2. 19.2 Attributes and other classes
        1. 19.2.1 Lists and data frames
        2. 19.2.2 Functions
        3. 19.2.3 Other types
      3. 19.3 Missing values
        1. 19.3.1 Scalars
          1. 19.3.1.1 Integers
          2. 19.3.1.2 Doubles
        2. 19.3.2 Strings
        3. 19.3.3 Boolean
        4. 19.3.4 Vectors
        5. 19.3.5 Exercises
      4. 19.4 Rcpp sugar
        1. 19.4.1 Arithmetic and logical operators
        2. 19.4.2 Logical summary functions
        3. 19.4.3 Vector views
        4. 19.4.4 Other useful functions
      5. 19.5 The STL
        1. 19.5.1 Using iterators
        2. 19.5.2 Algorithms
        3. 19.5.3 Data structures
        4. 19.5.4 Vectors
        5. 19.5.5 Sets
        6. 19.5.6 Map
        7. 19.5.7 Exercises
      6. 19.6 Case studies
        1. 19.6.1 Gibbs sampler
        2. 19.6.2 R vectorisation vs. C++ vectorisation
      7. 19.7 Using Rcpp in a package
      8. 19.8 Learning more
      9. 19.9 Acknowledgments
    5. Chapter 20 - R's C interface
          1. Outline
          2. Prerequisites
      1. 20.1 Calling C functions from R
      2. 20.2 C data structures
      3. 20.3 Creating and modifying vectors
        1. 20.3.1 Creating vectors and garbage collection
        2. 20.3.2 Missing and non-finite values
        3. 20.3.3 Accessing vector data
        4. 20.3.4 Character vectors and lists
        5. 20.3.5 Modifying inputs
        6. 20.3.6 Coercing scalars
        7. 20.3.7 Long vectors
      4. 20.4 Pairlists
      5. 20.5 Input validation
      6. 20.6 Finding the C source code for a function