3.12. Sampling on the Dependent Variable

The logit model has a unique sampling property that is extremely useful in a variety of situations. In the analysis of linear models, sampling on the dependent variable is widely known to be potentially dangerous. In fact, much of the literature on selection bias in linear models has to do with fixing the problems that arise from such sampling (Heckman 1979). The logit model does not share this problem, however. You can do disproportionate stratified random sampling on the dependent variable without biasing the slope coefficient estimates; only the intercept is affected.
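To see why this works in the simplest case, recall that for a single dichotomous predictor the logit slope is the log of the odds ratio in the 2×2 table. Sampling on the dependent variable multiplies each outcome category's counts by its own sampling fraction, and those fractions cancel out of the odds ratio. The following Python sketch illustrates the cancellation. The counts, the sampling fractions, and the function names are all hypothetical, chosen only for illustration, and the sketch works with expected counts rather than drawing an actual random sample.

```python
import math

def odds_ratio(y1_x1, y1_x0, y0_x1, y0_x0):
    """Odds ratio of a 2x2 table; cells are named by outcome y and predictor x."""
    return (y1_x1 * y0_x0) / (y1_x0 * y0_x1)

def sample_on_outcome(table, f1, f0):
    """Expected counts after keeping a fraction f1 of the y=1 cases
    and a fraction f0 of the y=0 cases."""
    y1_x1, y1_x0, y0_x1, y0_x0 = table
    return (y1_x1 * f1, y1_x0 * f1, y0_x1 * f0, y0_x0 * f0)

# Hypothetical full-population counts: (y=1,x=1), (y=1,x=0), (y=0,x=1), (y=0,x=0)
full = (400, 200, 50, 100)

# Hypothetical design: keep 10% of the y=1 cases and all of the y=0 cases
sampled = sample_on_outcome(full, f1=0.1, f0=1.0)

# The slope for a dichotomous x is the log odds ratio
slope_full = math.log(odds_ratio(*full))
slope_sampled = math.log(odds_ratio(*sampled))

# The intercept is the log odds of y=1 when x=0
intercept_full = math.log(full[1] / full[3])
intercept_sampled = math.log(sampled[1] / sampled[3])
intercept_shift = intercept_sampled - intercept_full

print(slope_full, slope_sampled)   # the two slopes agree
print(intercept_shift)             # equals log(f1 / f0)
```

In this sketch the slope is unchanged by the sampling, while the intercept moves by the log of the ratio of the two sampling fractions. That is why a known sampling design lets you correct the intercept afterward.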

Here’s a simple example. Table 3.3 shows a hypothetical table for employment status by high school graduation. The odds ratio for this table is (570 × 52)/(360 × 22) = 3.74. If we estimated ...

