2 REVIEW OF REGRESSION AND MORE ABOUT R

2.1 GOALS OF THIS CHAPTER

The purpose of this chapter is threefold: (i) to review many basic notions from simple regression (the linear regression model, ordinary least squares [OLS], and the central limit theorem, which in this context means basic inference); (ii) to introduce some more advanced features of R (matrix commands, curve fitting, plotting, and “inquiry” functions); and (iii) to introduce the idea of simulating data.

Real data is very important in statistics, but so is simulated data. Simulated data has known characteristics, allowing the student/programmer to examine the performance of algorithms, plots, and formulas in best- and worst-case scenarios. Simulating data based on formulas and models allows the student/programmer to operationalize those formulas and models, often leading to a more complete understanding of what the formula or model is “saying.” The ability to simulate data allows the student/programmer to quickly check conjectures and produce useful examples and counterexamples. It is the opinion of the author that the ability to effortlessly and routinely simulate data is a skill all statisticians should have.

2.2 THE SIMPLE(ST) REGRESSION MODEL

2.2.1 Ordinary Least Squares

Imagine data produced by the following simple model: $y_k = \beta_0 + \beta_1 x_k + \epsilon_k$.

The errors, $\epsilon_k$, are normally distributed with mean zero, have equal variance (spread), and are independent.
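The model and its error assumptions can be made concrete by simulating from them. The following is a minimal sketch in R, not code from the text: the sample size, the "true" coefficients, the error standard deviation, and the range of the predictor are all assumed values chosen only for illustration. It generates data from the model, fits it with `lm()`, and then recomputes the same estimates from the usual closed-form OLS formulas.

```r
# Simulate data from y_k = beta0 + beta1 * x_k + eps_k and fit it by OLS.
# All numeric values below are illustrative assumptions, not from the text.
set.seed(1)                             # make the simulation reproducible
n     <- 100                            # sample size (assumed)
beta0 <- 2                              # true intercept (assumed)
beta1 <- 0.5                            # true slope (assumed)
sigma <- 1                              # common error standard deviation (assumed)

x   <- runif(n, min = 0, max = 10)      # predictor values (assumed range)
eps <- rnorm(n, mean = 0, sd = sigma)   # independent normal errors, mean zero
y   <- beta0 + beta1 * x + eps          # responses generated by the model

fit <- lm(y ~ x)                        # ordinary least squares fit
print(coef(fit))                        # estimates of beta0 and beta1

# The same estimates from the closed-form OLS formulas
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)
print(c(b0, b1))
```

Because the true coefficients are known here, the estimates can be compared directly with the values used to generate the data; changing `n` or `sigma` shows how the quality of the fit responds.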

Note: A key difference between a traditional statistical problem and a time series problem ...
