12.10 Correlation Analysis

Correlation analysis can be approached in two distinct ways, depending on which variables are treated as random:

Case A: Suppose X and Y are both random variables. Then the purpose of correlation analysis is to determine the degree of “covariability” between X and Y.
Case B: If only Y is taken to be a random variable and Y is regressed on X, with the values of non-random X taken to be fixed (as in the regression model), then the purpose of correlation analysis is to measure the “goodness of fit” of the sample linear regression equation to the scatter of observations on X and Y.
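As a rough illustration of Case B, the sketch below fits a least-squares line to a handful of observations and reports the coefficient of determination, one common measure of goodness of fit. The text above does not name a specific measure at this point, so the choice of R² is an assumption here, and the function name `r_squared` and the data values are hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line of ys on xs.

    xs are treated as fixed (non-random) values, as in Case B.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sst = sum((y - y_bar) ** 2 for y in ys)  # total variation in Y
    if sxx == 0 or sst == 0:
        raise ValueError("undefined when X or Y shows no variation")
    b1 = sxy / sxx            # sample slope
    b0 = y_bar - b1 * x_bar   # sample intercept
    # Variation in Y left unexplained by the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / sst


# Hypothetical data: fixed X settings and the observed Y values
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r2 = r_squared(xs, ys)
print(round(r2, 4))
```

A value near 1 indicates that the fitted line accounts for most of the variation in Y, while a value near 0 indicates a poor fit.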

Let us consider these two situations in turn.

12.10.1 Case A: X and Y Random Variables

In this instance, we need to determine the direction as well as the strength (i.e., the degree of closeness) of the relationship between the random variables X and Y, where X and Y follow a “joint bivariate distribution.” We do this by first drawing a sample of points (Xi, Yi), i = 1, ..., n, from that distribution. Once we compute the sample correlation coefficient, we can determine whether or not it serves as a “good” estimate of the underlying degree of covariation within the population.
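The computation of the sample correlation coefficient can be sketched as follows. This is a minimal illustration that assumes the usual Pearson product-moment form; the function name `sample_correlation` and the data values are hypothetical and not taken from the text.

```python
import math


def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squares of deviations from the sample means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample of n = 5 points (Xi, Yi)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r = sample_correlation(xs, ys)
print(round(r, 4))
```

The sign of the result gives the direction of the relationship, and its magnitude, which lies between 0 and 1, gives the strength.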

To this end, let X and Y be random variables that follow a joint bivariate distribution. Let E(X) and E(Y) denote the means of X and Y, respectively; let S(X) and S(Y) denote the standard deviations of X and Y, respectively; and let COV(X,Y) denote the covariance between X and Y. Then the population ...
