
Basic Data Analysis for Time Series with R by DeWayne Derryberry


APPENDIX B AIC IS PRESS!

B.1 INTRODUCTION

The model selection problem in linear regression involves choosing a best model (or at least some best models) from a group of candidate models. One approach, picking the model that most closely fits the data, would appear to involve picking the model with the lowest sum of squared errors (SSE), but there is an obvious flaw. There is an over-fitting bias in the way SSE is estimated: the criterion used to fit a model (maximum likelihood) minimizes SSE. In particular, if explanatory variables are sequentially added to a regression model, SSE can never increase and will almost always shrink, suggesting the absurd conclusion that the most complex model is always the one that should be chosen. Put another way, using the same data to both estimate the parameters and subsequently assess fit can produce ever more optimistic estimates of model fit with each additional parameter.
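The over-fitting bias can be seen in a small simulation. The following is a minimal sketch, not taken from the book: the simulated data, the variable names (`x1` to `x5`), and the sample size are all assumptions chosen for illustration. Only `x1` actually drives the response; the other predictors are pure noise, yet SSE still falls as each one is added.

```r
# Simulated data (an assumption for illustration): y depends only on x1;
# x2..x5 are pure noise and carry no information about y.
set.seed(1)
n <- 30
X <- as.data.frame(matrix(rnorm(n * 5), n, 5))
names(X) <- paste0("x", 1:5)
y <- 2 + 3 * X$x1 + rnorm(n)
dat <- cbind(y = y, X)

# Fit the nested models y ~ x1, y ~ x1 + x2, ..., y ~ x1 + ... + x5
# and record the in-sample SSE of each.
sse <- sapply(1:5, function(k) {
  f <- reformulate(paste0("x", 1:k), response = "y")
  fit <- lm(f, data = dat)
  sum(resid(fit)^2)
})

# SSE is non-increasing in the number of predictors,
# even though x2..x5 are unrelated to y.
print(sse)
```

Judged by SSE alone, the five-variable model would be preferred, although four of its predictors are noise by construction.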

One approach to the model selection problem involves constructively re-stating it as that of picking the model with the best SSE once a correction is made for over-fitting bias. A measure much like SSE that reflects how well the current model might fit fresh data is desired.

B.2 PRESS

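Allen (1971, 1974) presents a useful approach to correcting for over-fitting bias in SSE based on a cross-validation idea: the prediction sum of squares (PRESS), in which each observation is predicted from a model fitted without that observation. Computation of PRESS ...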
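As a rough illustration of the cross-validation idea, the sketch below computes PRESS in two ways, assuming the standard leave-one-out definition: each observation is predicted from a model fitted to the remaining observations, and the squared prediction errors are summed. The data and variable names are hypothetical, and this is not necessarily the computation the text goes on to describe. The second calculation uses the well-known identity that a deleted residual equals the ordinary residual divided by one minus its leverage, so no refitting is needed.

```r
# Hypothetical simulated data: y depends on x1 only; x2 is noise.
set.seed(2)
n <- 25
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

fit <- lm(y ~ x1 + x2, data = dat)

# Brute force: refit n times, each time leaving out one observation,
# then predict the held-out observation and square the error.
press_loo <- sum(sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x1 + x2, data = dat[-i, ])
  pred_i <- predict(fit_i, newdata = dat[i, , drop = FALSE])
  (dat$y[i] - pred_i)^2
}))

# Shortcut: deleted residual = ordinary residual / (1 - leverage),
# where the leverages are the diagonal of the hat matrix.
press_hat <- sum((resid(fit) / (1 - hatvalues(fit)))^2)

# In-sample SSE for comparison; PRESS is larger because each
# residual is inflated by the factor 1 / (1 - leverage).
sse <- sum(resid(fit)^2)
print(c(SSE = sse, PRESS_loo = press_loo, PRESS_hat = press_hat))
```

The two PRESS values agree up to floating-point error, and both exceed SSE, which is the sense in which PRESS corrects the optimism of the in-sample fit.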
