CHAPTER 11

VALIDATION OF REGRESSION MODELS

11.1 INTRODUCTION

Regression models are used extensively for prediction or estimation, data description, parameter estimation, and control. Frequently the user of the regression model is a different individual from the model developer. Before the model is released to the user, some assessment of its validity should be made. We distinguish between model adequacy checking and model validation. Model adequacy checking includes residual analysis, testing for lack of fit, searching for high-leverage or overly influential observations, and other internal analyses that investigate the fit of the regression model to the available data. Model validation, however, is directed toward determining whether the model will function successfully in its intended operating environment.

Since the fit of the model to the available data forms the basis for many of the techniques used in the model development process (such as variable selection), it is tempting to conclude that a model that fits the data well will also be successful in the final application. This is not necessarily so. For example, a model may have been developed primarily for predicting new observations. There is no assurance that the equation that provides the best fit to existing data will be a successful predictor. Influential factors that were unknown during the model-building stage may significantly affect the new observations, rendering the predictions almost useless. Furthermore, the correlative ...
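The gap between fitting existing data and predicting new observations can be seen in a small numerical sketch. The example below is illustrative only: the response curve, the noise pattern, the two ranges of x, and the helper names (true_response, fit_line, r_squared) are all assumptions invented for this sketch, not data or code from the text. It fits a straight line by least squares to observations taken over a narrow range, then scores the same equation on new observations from a range that was not available during model building, using 1 - SS_res / SS_tot as one common summary of agreement.

```python
# Synthetic illustration only: the "true" curve, the noise pattern, and
# both x ranges are assumptions made for this sketch.

def true_response(x):
    """Hypothetical mildly curved relationship between x and y."""
    return 2.0 + 3.0 * x + 1.5 * x ** 2


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def r_squared(x, y, b0, b1):
    """1 - SS_res / SS_tot, evaluated on whatever data are passed in."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Data available during model building: x between 0 and 1.
x_fit = [i / 10 for i in range(11)]
y_fit = [true_response(x) + 0.05 * (-1) ** i for i, x in enumerate(x_fit)]

# New observations from the operating environment: x between 2 and 3.
x_new = [2.0 + i / 10 for i in range(11)]
y_new = [true_response(x) + 0.05 * (-1) ** i for i, x in enumerate(x_new)]

b0, b1 = fit_line(x_fit, y_fit)
r2_fit = r_squared(x_fit, y_fit, b0, b1)  # agreement with the fitting data
r2_new = r_squared(x_new, y_new, b0, b1)  # agreement with the new data

print(f"R^2 on fitting data: {r2_fit:.3f}")
print(f"R^2 on new data:     {r2_new:.3f}")
```

In this sketch the straight line agrees closely with the data used to fit it, yet the value computed on the new observations is far lower (negative here, meaning the line predicts worse than the mean of the new responses would). Nothing in the fitting data alone signals this, which is why an internal check of fit is not a substitute for validation.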
