Snooping for Fun and No Profit
October 27, 2014 - Big Ideas
How much distortion can data snooping inject into expected investment strategy performance? In their October 2014 paper entitled “Statistical Overfitting and Backtest Performance”, David Bailey, Stephanie Ger, Marcos Lopez de Prado, Alexander Sim and Kesheng Wu note that powerful computers let researchers test an extremely large number of model variations on a given set of data, thereby inducing extreme overfitting. In finance, this snooping often takes the form of refining a trading strategy to optimize its performance within a set of historical market data. The authors introduce a way to explore snooping effects via an online simulator that finds the optimal (maximum Sharpe ratio) variant of a simple trading strategy by testing all possible integer values for strategy parameters as applied to a set of randomly generated daily “returns.” Each month, the strategy trades a single asset by (1) choosing a day of the month to enter either a long or a short position and (2) exiting after a specified number of days or when a stop-loss condition is met. The randomly generated “returns” are drawn from a Gaussian (normal) distribution with zero mean. The simulator lets a user specify a maximum holding period, a maximum percentage stop loss, sample length (number of days), sample volatility (number of standard deviations) and sample starting point (random number generator seed). After identifying optimal parameter values on “backtest” data, the simulator runs the optimal strategy variant on a second set of randomly generated returns to show the effect of backtest overfitting. Using this simulator, they conclude that:
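As an aside on mechanics, the procedure described above (search every parameter combination on one random sample, then rerun the winner on a fresh sample) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator. The function names, the 22-day "month", the additive (uncompounded) profit calculation, the small stop-loss grid and the square-root-of-12 annualization are all assumptions made for the example.

```python
import random
import statistics
from itertools import product

DAYS_PER_MONTH = 22  # assumption: trading days in one "month"


def gaussian_returns(n_days, daily_vol, seed):
    """Zero-mean Gaussian daily 'returns' from a seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, daily_vol) for _ in range(n_days)]


def trade_pnls(returns, entry_day, holding, stop_loss, side):
    """One P&L per full month: enter on entry_day, then exit after
    `holding` days, at the stop loss, or at month end, whichever is first."""
    pnls = []
    for start in range(0, len(returns) - DAYS_PER_MONTH + 1, DAYS_PER_MONTH):
        month = returns[start:start + DAYS_PER_MONTH]
        pnl = 0.0
        for r in month[entry_day:entry_day + holding]:
            pnl += side * r  # side is +1 (long) or -1 (short); returns are summed
            if pnl <= -stop_loss:
                break
        pnls.append(pnl)
    return pnls


def sharpe(pnls):
    """Annualized Sharpe ratio of monthly P&Ls (zero risk-free rate assumed)."""
    sd = statistics.stdev(pnls)
    return 0.0 if sd == 0 else statistics.mean(pnls) / sd * 12 ** 0.5


def best_variant(returns, max_holding, stop_losses):
    """Exhaustive search for the parameter set with the highest in-sample Sharpe."""
    grid = product(range(DAYS_PER_MONTH), range(1, max_holding + 1),
                   stop_losses, (1, -1))
    return max(grid, key=lambda p: sharpe(trade_pnls(returns, *p)))


# "Backtest" sample and an independent second sample from the same distribution.
backtest = gaussian_returns(1000, 0.01, seed=1)
fresh = gaussian_returns(1000, 0.01, seed=2)
params = best_variant(backtest, max_holding=10, stop_losses=(0.01, 0.02, 0.05))
print("optimal (entry_day, holding, stop_loss, side):", params)
print("in-sample Sharpe:    ", sharpe(trade_pnls(backtest, *params)))
print("out-of-sample Sharpe:", sharpe(trade_pnls(fresh, *params)))
```

Because both samples are pure noise with zero mean, any in-sample Sharpe ratio the search finds reflects selection among many variants rather than a real edge. Changing the seeds or widening the grid shows how the gap between the two printed figures behaves.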