Is there a tractable way of estimating the level of data snooping bias in investment strategy studies and thereby correcting for it? In their April 2018 paper entitled “Detection of False Investment Strategies Using Unsupervised Learning Methods”, Marcos Lopez de Prado and Michael Lewis summarize and validate an approach for estimating snooping bias derived from backtesting multiple strategies on the same data and using that estimate to correct for the bias. The approach involves estimating the overall scope and dispersion of multiple backtests based on correlation clusters within known backtests. Focusing on Sharpe ratio as the key performance metric, they validate their approach via Monte Carlo simulations. Based on derivations and simulations, they conclude that:
- The probability that the Sharpe ratio observed in a single investment strategy backtest exceeds a target Sharpe ratio increases with the observed Sharpe ratio, sample duration and return distribution skewness, and decreases with return distribution kurtosis (fatter tails).
- Data snooping bias increases with the number of investment strategies backtested on a dataset, and there is a mathematical estimate of the expected maximum Sharpe ratio for a specific number of backtests.
- For example, for 1,000 backtests on the same data, the expected maximum Sharpe ratio is about 3.3 when the true Sharpe ratio is zero. Monte Carlo simulation confirms this mathematical estimate.
- Unless the actual maximum Sharpe ratio for a set of backtests exceeds the expected maximum, the strategy associated with the actual maximum is likely a false positive.
- While the actual number of backtests underlying a claimed anomaly may be unknown or underestimated, recasting a known set of backtests into correlation clusters gives a reasonable estimate of the effective number of independent backtests. Monte Carlo simulation confirms this approach to estimating the level of bias for use in accounting for data snooping.
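The first three findings above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes the probabilistic Sharpe ratio form from earlier work by Bailey and Lopez de Prado and the usual extreme-value approximation for the expected maximum of independent normal draws, with Sharpe ratios stated per period (not annualized) and unit variance across trial Sharpe ratios by default. The summary does not reproduce the paper's formulas, so the function names, parameters and defaults here are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()         # standard normal distribution


def probabilistic_sharpe_ratio(observed_sr, target_sr, n_obs, skew=0.0, kurt=3.0):
    """Probability associated with an observed per-period Sharpe ratio versus a target.

    Assumed form (not taken from the summary above):
        z = (SR_obs - SR_target) * sqrt(T - 1)
            / sqrt(1 - skew * SR_obs + (kurt - 1) / 4 * SR_obs**2)
    where kurt is raw kurtosis (3.0 for a normal distribution).
    """
    if n_obs < 2:
        raise ValueError("n_obs must be at least 2")
    variance_term = 1.0 - skew * observed_sr + (kurt - 1.0) / 4.0 * observed_sr ** 2
    if variance_term <= 0.0:
        raise ValueError("inputs imply a non-positive variance term")
    z = (observed_sr - target_sr) * math.sqrt(n_obs - 1) / math.sqrt(variance_term)
    return STD_NORMAL.cdf(z)


def expected_max_sharpe(n_trials, sr_variance=1.0):
    """Approximate expected maximum Sharpe ratio across n_trials independent
    backtests when the true Sharpe ratio is zero.

    sr_variance is the assumed variance of Sharpe ratios across trials.
    """
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    a = STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def simulated_max_sharpe(n_trials, n_runs=500, seed=0):
    """Monte Carlo check: average maximum of n_trials standard normal draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n_trials))
    return total / n_runs


if __name__ == "__main__":
    print("Approximate expected maximum, 1,000 trials:", expected_max_sharpe(1000))
    print("Simulated average maximum, 1,000 trials:", simulated_max_sharpe(1000))
    print("Single-backtest probability:", probabilistic_sharpe_ratio(0.1, 0.0, 250))
```

Under these assumptions, `expected_max_sharpe(1000)` falls between 3.2 and 3.3, in line with the figure cited above, and the simulated average should land close to it. Replacing the raw backtest count with the effective number of independent backtests estimated from correlation clusters would be the natural next step, but that clustering step is not sketched here.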
In summary, the proposed method for estimating the level of, and correcting for, data snooping bias in investment strategy backtests is tractable and reasonably accurate.
The authors assert that finance journals should stop accepting papers that do not account for multiple testing and report the probability that a claimed financial discovery is a false positive.
See “A Practical Solution to the Multiple-Testing Crisis in Financial Research” for detailed illustration of the proposed method.
Cautions regarding conclusions include:
- While tractable mathematically and readily translated to computer code, the methods described are fairly complex from the perspective of individual investors seeking to discover private strategies.
- The approach described requires maintaining records for a sizable number of strategy backtests to generate inputs for the mathematical formulas.
See also “Fixing Empirical Finance”, “Sharper Sharpe Ratio?” and “Measuring Investment Strategy Snooping Bias”.