SELECT A SAMPLE FREE FROM SURVIVORSHIP BIAS

Because all backtest research is performed on a data set that looks back in time, the full history of an observation will not be available if the observation did not survive to the present. The sample that researchers can work with is therefore a set of observations that have been preselected through time by some common denominator. A sample of a sample would not pose a problem if the subset were selected randomly. But this is not the case for most samples, which suffer from survivorship bias. The bias becomes relevant when the common bond that allowed an observation to survive is related to the pattern for which we are looking. In that case, a finding of a statistically significant pattern may merely reflect the underlying common bond that was used to construct the testing sample.

One typical area of interest that is severely affected by survivorship bias is performance comparison. A sample built only from the portfolios currently outstanding excludes the portfolios that did not survive through time because of poor performance. By design, the sample contains only the better-performing portfolios. How, then, can the true factors that caused the bad performance ever be identified?
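
The effect on a performance comparison can be illustrated with a small simulation. The sketch below is not taken from the text: the return distribution, the liquidation rule (a portfolio is closed once its value falls below 70 percent of its starting value), and all function and parameter names are assumptions chosen only for illustration. It compares the average annual return computed from every portfolio that ever existed with the average computed from the survivors alone.

```python
import random
import statistics


def simulate_portfolio(rng, n_years, mean, vol, floor):
    """Return (reported_returns, survived) for one simulated portfolio.

    Hypothetical closure rule (an assumption, not from the text): the
    portfolio is liquidated at the first year-end at which its value
    falls below `floor` times its starting value.
    """
    value = 1.0
    reported = []
    for _ in range(n_years):
        r = rng.gauss(mean, vol)  # assumed i.i.d. normal annual returns
        reported.append(r)
        value *= 1.0 + r
        if value < floor:
            return reported, False  # liquidated: history stops here
    return reported, True


def compare_samples(n_portfolios=2000, n_years=10, mean=0.05, vol=0.15,
                    floor=0.7, seed=0):
    """Compare pooled average returns: full sample versus survivors only."""
    rng = random.Random(seed)
    portfolios = [simulate_portfolio(rng, n_years, mean, vol, floor)
                  for _ in range(n_portfolios)]

    # Full sample: every reported return, including those of dead portfolios.
    all_returns = [r for reported, _ in portfolios for r in reported]
    # Survivor sample: what a "live only" data set would still contain.
    live_returns = [r for reported, survived in portfolios if survived
                    for r in reported]

    return {
        "n_portfolios": n_portfolios,
        "n_survivors": sum(1 for _, survived in portfolios if survived),
        "full_mean": statistics.fmean(all_returns),
        "survivor_mean": statistics.fmean(live_returns),
    }


result = compare_samples()
print(f"survivors: {result['n_survivors']} of {result['n_portfolios']}")
print(f"mean annual return, full sample:    {result['full_mean']:.4f}")
print(f"mean annual return, survivors only: {result['survivor_mean']:.4f}")
```

Under these assumptions, every portfolio draws its returns from the same distribution, so any gap between the two averages comes purely from conditioning on survival rather than from differences in skill.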

Commercial data vendors do not help on this issue. Because of cost considerations, most data sets are provided only on a live basis. That is, for a sample observation that no longer exists, the common practice is to delete its entire history from the data set. To simulate the true historical situation, ...
