Chapter 11 Regression and Analysis of Variance

11.1 Introduction

In this chapter, we study aspects of statistical relationships between two or more random variables. For example, in a computer system the throughput Y and the degree of multiprogramming X might well be related to each other. One indicator of the association (interdependence) between two random variables is their correlation coefficient $\rho(X, Y)$ and its estimator $\hat{\rho}(X, Y)$. Correlation analysis will be considered in Section 11.6.
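As a concrete illustration of estimating a correlation coefficient from paired observations, the following minimal sketch computes the usual sample (Pearson) correlation coefficient. The function name `sample_correlation` and the measurement values are hypothetical, chosen only for illustration; they are not data from the text.

```python
import math


def sample_correlation(xs, ys):
    """Return the sample (Pearson) correlation coefficient of paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the sample means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements (illustration only): degree of
# multiprogramming X and observed throughput Y.
degree = [1, 2, 3, 4, 5]
throughput = [2.1, 3.9, 6.2, 7.8, 10.1]
r = sample_correlation(degree, throughput)
```

A value of `r` close to +1 or -1 suggests a strong linear association in the sample, while a value near 0 suggests little linear association; it says nothing by itself about nonlinear dependence.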

A related problem is that of predicting a value of system throughput y at a given degree of multiprogramming x. In other words, we are interested here in studying the dependence of Y on X. The problem then is to find a regression line or a regression curve that describes the dependence of Y on X. Conversely, we may also study the inverse regression problem, that is, the dependence of X on Y. In the remainder of this section we consider regression when the needed parameters of the population distribution are known exactly. More commonly, though, we are required to obtain a regression curve that best approximates the dependence on the basis of sampled information. This topic will be covered in Sections 11.3 and 11.4.
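To make the known-parameter case concrete, here is a minimal sketch of a regression line built directly from population moments. It assumes the regression of Y on X is linear (as it is, for example, when X and Y are jointly normal), in which case the line passes through the two means with slope equal to the correlation coefficient times the ratio of standard deviations. The function names and the numeric values are hypothetical, used only for illustration.

```python
def regression_line_from_moments(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Return (slope, intercept) of the linear regression of Y on X.

    Assumes a linear regression function and known population parameters:
    means mu_x, mu_y, standard deviations sigma_x, sigma_y, correlation rho.
    """
    if sigma_x <= 0 or sigma_y < 0:
        raise ValueError("sigma_x must be positive and sigma_y non-negative")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    slope = rho * sigma_y / sigma_x
    intercept = mu_y - slope * mu_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted value of Y at X = x on the regression line."""
    return intercept + slope * x


# Hypothetical population parameters (illustration only).
slope, intercept = regression_line_from_moments(
    mu_x=3.0, mu_y=6.0, sigma_x=1.5, sigma_y=3.0, rho=0.9
)
y_at_mean = predict(3.0, slope, intercept)
```

Swapping the roles of the two variables (and their parameters) gives the inverse regression line of X on Y, which in general is a different line unless the correlation is perfect.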

Another related problem is that of least-squares curve fitting. Suppose that we have two variables (not necessarily ...
