A prediction problem begins with the observation of $n$ independent vectors $(x_j, y_j)$, $j = 1, \ldots, n$, the “training set”, where $x_j$ is an $N$-vector of predictors and $y_j$ a real-valued response. The goal is to use the training set to construct an effective prediction rule $r(x)$: having observed a new vector $x$ but not its response $y$, we hope that $r(x)$ will accurately predict $y$. An insurance company, for instance, might collect predictors (age, gender, smoking habits) and be interested in predicting imminent heart attacks.
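As a minimal sketch of this setup, the Python snippet below builds a rule r(x) from a synthetic training set and applies it to a new predictor vector. Ordinary least squares is used here purely as an illustrative way to construct the rule; it is an assumption of this example, not a method prescribed by the text. The function name `fit_prediction_rule`, the coefficients in `true_beta`, and the simulated data are likewise hypothetical.

```python
import numpy as np


def fit_prediction_rule(X, y):
    """Fit r(x) by ordinary least squares (an illustrative choice of rule)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients

    def r(x):
        x = np.asarray(x, dtype=float)
        return float(beta[0] + x @ beta[1:])

    return r


# Synthetic training set: n independent pairs (x_j, y_j), each x_j an N-vector.
rng = np.random.default_rng(0)
n, N = 200, 3
X = rng.normal(size=(n, N))
true_beta = np.array([1.0, -2.0, 0.5])            # hypothetical coefficients
y = 0.3 + X @ true_beta + rng.normal(scale=0.1, size=n)

r = fit_prediction_rule(X, y)

# A new predictor vector x is observed, but its response y is not.
x_new = np.array([0.2, -0.1, 1.0])
y_hat = r(x_new)
```

Because the simulated responses really are linear in the predictors, `y_hat` should land close to the noise-free value `0.3 + x_new @ true_beta`. With real data no such guarantee holds, and the quality of r(x) has to be assessed on cases that were not used to fit it.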
Classical prediction methods depend on Fisher’s linear discriminant function. Here the response variable is dichotomous,