Want your machine to excel in investing? In his January 2018 paper entitled “The 10 Reasons Most Machine Learning Funds Fail”, Marcos Lopez de Prado examines common errors made by machine learning experts when tackling financial data and proposes correctives. Based on more than two decades of experience, he concludes that:
- Organize as an assembly line of experts (with specialists in data curation/processing, computing infrastructure, software development, feature analysis, execution simulation and backtesting). Unlike siloed portfolio managers, teams of specialists, with quality measured independently for each task, yield true discoveries at a predictable rate.
- Just 20 walk-forward (“out-of-sample”) backtests repeated on the same data will likely generate at least one useless investment strategy that nonetheless passes a naive 95% confidence test. De-emphasize backtesting and instead isolate and analyze important investment strategy features. Look for ways to amplify the signals of important features, and eliminate features that merely add noise.
- Markets do not process information at a constant time rate, and time-sampled series often exhibit undesirable statistical properties (autocorrelation and fat tails). Sample data in units of information content such as number of trades, volume of trades or dollar value traded.
- Distributions of returns are reliably stationary (have a stable mean) but carry no memory of the price path. Prices carry path memory but are non-stationary. Relying on memoryless returns invites overfitting to spurious patterns (false discoveries). Use fractional differentiation to construct series that are stationary yet retain memory, a compromise between returns and prices.
- Do not use testing methods that ignore price paths, thereby missing conditions that would stop out positions (such as margin calls or risk tolerance breaches). Identify and include profit-taking limits, stop-loss limits and loss-of-patience (strategy expiration) limits.
- Simplify strategy analysis by tackling trade direction (long or short) and trade size (risk management) separately, in sequence. The sizing model then learns from the weaknesses of the direction model and filters out its false positives.
- Do not assume that series of test observations are independent and identically distributed, because strategy parameters often rely on overlapping data. Instead, weight each observation as a function of absolute log returns uniquely attributable to it.
- Do not assume that training (in-sample) and testing (out-of-sample) are separable by a simple date. Purge from the training set all observations whose strategy parameter measurements overlap with those in the testing set.
- Simple walk-forward testing specifies a single training subsample and a single test subsample, either of which may be unrepresentative, and is susceptible to overfitting. Introduce robustness by simulating many scenarios based on different training/testing splits within the sample, subjecting every part of the full series to multiple tests.
- Do not ignore backtest overfitting, which generates false positives (strategies that are lucky within the training subsample). To mitigate, count the number of walk-forward trials and use a deflated Sharpe ratio (DSR) for backtest evaluation. In addition to return average and standard deviation, DSR considers non-normality of returns, length of the training sample, number of independent trials and intensity of data snooping.
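A quick check of the multiple-testing arithmetic behind the 20-backtests point above, a toy calculation that assumes the trials are independent:

```python
# Probability that at least one of n independent backtests on the same data
# "passes" at a naive 95% confidence level purely by chance.
def prob_false_discovery(n_trials: int, confidence: float = 0.95) -> float:
    return 1.0 - confidence ** n_trials

print(round(prob_false_discovery(20), 3))  # 0.642: roughly a 64% chance of
                                           # at least one false discovery
```

Even with only partially overlapping trials, repeated testing on the same data rapidly erodes the meaning of a 95% confidence level.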
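The information-driven sampling point can be sketched as "dollar bars," where a bar closes once a fixed dollar value has traded rather than after a fixed time interval; the threshold and tick data below are illustrative:

```python
def dollar_bars(prices, volumes, bar_value):
    """Group ticks into bars, closing a bar once the accumulated
    price * volume (dollar value traded) reaches `bar_value`."""
    bars, accum, open_idx = [], 0.0, 0
    for i, (p, v) in enumerate(zip(prices, volumes)):
        accum += p * v
        if accum >= bar_value:
            chunk = prices[open_idx:i + 1]
            bars.append({"open": chunk[0], "high": max(chunk),
                         "low": min(chunk), "close": chunk[-1]})
            accum, open_idx = 0.0, i + 1
    return bars

ticks_p = [100, 101, 99, 102, 103, 101, 100, 104]
ticks_v = [10, 30, 5, 50, 20, 40, 10, 60]
print(len(dollar_bars(ticks_p, ticks_v, 5000)))  # 3 bars
```

Busy periods produce many bars and quiet periods few, so each bar carries comparable information content.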
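The fractional-differentiation compromise can be illustrated by the binomial weights of the differencing operator (1 - B)^d, a minimal sketch:

```python
def fracdiff_weights(d: float, n: int):
    """Weights of the fractional differencing operator (1 - B)^d,
    from its binomial expansion, truncated at n lags."""
    w = [1.0]
    for k in range(1, n):
        w.append(-w[-1] * (d - k + 1) / k)
    return w

# d = 0 reproduces the price series (weights [1, 0, 0, ...]);
# d = 1 reproduces plain returns (weights [1, -1, 0, ...]);
# a fractional d keeps a slowly decaying memory of past prices.
print(fracdiff_weights(0.5, 4))  # [1.0, -0.5, -0.125, -0.0625]
```

The slowly decaying weights are what let a fractionally differentiated series stay (approximately) stationary while retaining path memory.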
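The point about profit-taking, stop-loss and loss-of-patience limits can be sketched in the spirit of the author's triple-barrier labeling; the thresholds and price path below are illustrative:

```python
def triple_barrier_label(path, entry, pt, sl, max_hold):
    """Label a trade by whichever barrier the price path touches first:
    +1 profit-taking, -1 stop-loss, 0 expiration (loss of patience)."""
    for price in path[:max_hold]:
        ret = price / entry - 1.0
        if ret >= pt:       # profit-taking limit hit
            return 1
        if ret <= -sl:      # stop-loss limit hit
            return -1
    return 0                # strategy expiration, neither barrier hit

path = [100.5, 101.2, 99.8, 103.5, 104.0]
print(triple_barrier_label(path, entry=100.0, pt=0.03, sl=0.02,
                           max_hold=5))  # 1 (profit barrier hit first)
```

A backtest that evaluates only end-of-horizon returns would miss the stop-outs this path-aware label captures.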
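The observation-weighting idea can be sketched by splitting each bar's absolute log return among the labels whose lifespans cover it, a simplified version of the author's sample-weighting scheme (the data are illustrative):

```python
from collections import Counter

def attribution_weights(log_returns, spans):
    """Weight each labeled observation by the absolute log returns
    uniquely attributable to it: each bar's |return| is divided among
    the overlapping observation lifespans that cover that bar."""
    conc = Counter()                      # concurrency: spans covering bar t
    for t0, t1 in spans:
        for t in range(t0, t1 + 1):
            conc[t] += 1
    return [sum(abs(log_returns[t]) / conc[t] for t in range(t0, t1 + 1))
            for t0, t1 in spans]

rets = [0.01, -0.02, 0.015, -0.01, 0.03]
spans = [(0, 2), (1, 3), (3, 4)]          # overlapping label lifespans
print([round(w, 4) for w in attribution_weights(rets, spans)])
```

Because each bar's return is split across concurrent labels, heavily overlapping observations get proportionally less weight, countering the false independence assumption.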
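Purging can be sketched as dropping every training observation whose label lifespan overlaps the test window (the spans below are illustrative):

```python
def purged_train_indices(spans, test_start, test_end):
    """Keep only training observations whose label lifespan [t0, t1]
    lies entirely outside the test window (purging sketch)."""
    return [i for i, (t0, t1) in enumerate(spans)
            if t1 < test_start or t0 > test_end]

spans = [(0, 4), (3, 7), (6, 10), (9, 13), (12, 16)]
# test window covers bars 6..10; observations 1, 2 and 3 overlap it
print(purged_train_indices(spans, test_start=6, test_end=10))  # [0, 4]
```

A simple date cutoff would leave observations 1 and 3 in the training set even though their labels depend on test-period prices.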
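The multiple-scenario idea can be sketched combinatorially: partition the sample into groups and let every combination of groups serve as the test set once (the group counts below are illustrative):

```python
from itertools import combinations

def combinatorial_splits(n_groups: int, n_test: int):
    """Enumerate train/test splits in which every combination of
    `n_test` groups is a test set once, so each part of the series
    is tested in multiple scenarios rather than a single pass."""
    groups = list(range(n_groups))
    for test in combinations(groups, n_test):
        yield [g for g in groups if g not in test], list(test)

splits = list(combinatorial_splits(5, 2))
print(len(splits))  # 10 scenarios; each group appears in 4 test sets
```

Contrast this with simple walk-forward testing, where each observation is tested at most once against a single training history.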
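The deflated Sharpe ratio can be sketched from Bailey and Lopez de Prado's formulas; this is a simplified implementation, and the inputs (a daily Sharpe ratio of 0.15 over 1,250 observations, trial variance 0.005) are illustrative:

```python
import math
from statistics import NormalDist

def deflated_sharpe_ratio(sr, n_obs, n_trials, var_sr, skew=0.0, kurt=3.0):
    """Probability that the observed Sharpe ratio `sr` beats the expected
    maximum Sharpe ratio of `n_trials` skill-less strategies.
    var_sr is the variance of Sharpe ratios across the trials."""
    nd = NormalDist()
    gamma = 0.5772156649  # Euler-Mascheroni constant
    # expected maximum Sharpe ratio under the null of no skill
    sr0 = math.sqrt(var_sr) * ((1 - gamma) * nd.inv_cdf(1 - 1 / n_trials)
                               + gamma * nd.inv_cdf(1 - 1 / (n_trials * math.e)))
    # probabilistic Sharpe ratio against the hurdle sr0, adjusting for
    # sample length, skewness and kurtosis of returns
    z = ((sr - sr0) * math.sqrt(n_obs - 1)
         / math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2))
    return nd.cdf(z)

# the same backtest Sharpe ratio is far less convincing once
# 100 trials, rather than 10, are accounted for
print(round(deflated_sharpe_ratio(0.15, 1250, 10, 0.005), 3))
print(round(deflated_sharpe_ratio(0.15, 1250, 100, 0.005), 3))
```

As the number of trials grows, the hurdle rate rises and the same observed Sharpe ratio deflates toward insignificance, which is exactly the data-snooping penalty the bullet describes.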
In summary, investment strategy developers (especially when amplifying testing via machine learning) should be experts in avoiding practices that foster false discoveries.
Cautions regarding conclusions include:
- The paper is a statement of beliefs. The author does not prove that his recommendations are reliably sufficient for discovering persistently successful investment strategies.
- Implementing the recommendations may involve costs to be debited from the performance of strategies so discovered.
See also “Seven Habits of Highly Ineffective Quants”, based on a prior presentation chart version of the source paper.