Are large language models such as GPT-4 as effective as professional human analysts at interpreting numerical financial statements? In their May 2024 paper entitled “Financial Statement Analysis with Large Language Models”, Alex Kim, Maximilian Muhn and Valeri Nikolaev investigate whether GPT-4 can analyze standardized, anonymized financial statements to forecast the direction and magnitude (large, moderate or small) of changes in future firm earnings and to state its level of confidence in each forecast. They withhold the management discussions that accompany financial statements, so that they evaluate the ability of GPT-4 to analyze numerical data only. They anonymize statements by omitting firm names and replacing years with labels (t, t − 1, …) so that GPT-4 cannot use its training data to find actual future earnings. They consider both a simple query and a series of prompts designed to make GPT-4 think like an ideal human analyst by focusing on changes in certain financial statement items, computing financial ratios and generating economic interpretations of these ratios. They compare GPT-4 forecasts to: (1) consensus (median) human analyst earnings forecasts issued during the month after financial statement release; and, (2) forecasts from other benchmarks, including a highly focused state-of-the-art artificial neural network (ANN) model. To test the economic value of forecasts, each year on June 30 they form portfolios using GPT-4 forecasts based on annual financial statements from the preceding calendar year end, as follows:
- Sort stocks based on GPT-4 earnings forecasts.
- Select stocks expected to have moderate/large increases or decreases in earnings and separately re-sort these two groups based on forecast confidence.
- Form an equal-weighted or value-weighted long (short) portfolio of the tenth, or decile, of these stocks with the highest confidence in earnings increases (decreases).
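
The portfolio formation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the `Forecast` record, its field names, the `form_portfolios` and `top_decile_weights` helpers, and the choice to take the top tenth within each of the two groups (keeping at least one stock) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """One model forecast for one stock (hypothetical record layout)."""
    ticker: str
    direction: str     # "increase" or "decrease"
    magnitude: str     # "large", "moderate" or "small"
    confidence: float  # higher means more confident
    market_cap: float  # used only for value weighting


def top_decile_weights(group, value_weighted=False):
    """Return {ticker: weight} for the most confident tenth of `group`."""
    if not group:
        return {}
    # Re-sort the group by forecast confidence, most confident first.
    ranked = sorted(group, key=lambda f: f.confidence, reverse=True)
    # Assumption: round the decile size down, but keep at least one stock.
    n_keep = max(1, len(ranked) // 10)
    chosen = ranked[:n_keep]
    if value_weighted:
        total_cap = sum(f.market_cap for f in chosen)
        return {f.ticker: f.market_cap / total_cap for f in chosen}
    return {f.ticker: 1.0 / n_keep for f in chosen}


def form_portfolios(forecasts, value_weighted=False):
    """Build (long, short) weight dictionaries from one year's forecasts."""
    # Keep only stocks with moderate or large expected earnings changes.
    sizable = [f for f in forecasts if f.magnitude in ("moderate", "large")]
    ups = [f for f in sizable if f.direction == "increase"]
    downs = [f for f in sizable if f.direction == "decrease"]
    # Long the most confident increases, short the most confident decreases.
    return (top_decile_weights(ups, value_weighted),
            top_decile_weights(downs, value_weighted))
```

Calling `form_portfolios(forecasts)` once per year with that year's forecasts returns the long and short weights; passing `value_weighted=True` weights each leg by market capitalization instead of equally.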
Using financial statements for 15,401 firms during 1968 through 2023 (with 2022 and 2023 out-of-sample with respect to the GPT-4 training period), annual returns of associated stocks and consensus human analyst earnings forecasts for 3,152 firms during 1983 through 2021, they find that: