Objective research to aid investing decisions

Investing Expertise

Can analysts, experts and gurus really give you an investing/trading edge? Should you track the advice of as many as possible? Are there ways to tell good ones from bad ones? Recent research indicates that the average “expert” has little to offer individual investors/traders. Finding exceptional advisers is no easier than identifying outperforming stocks. Indiscriminately seeking the output of as many experts as possible is a waste of time. Learning what makes a good expert accurate is worthwhile.

Extracting Sentiment Probabilities from LLMs

Generative large language models (LLMs), such as ChatGPT, are best known for conversational summation of complex information. Their use in financial forecasting focuses on discrete news sentiment signals of positive (1), neutral (0) or negative (-1). Is there a way to extract more granularity in LLM sentiment estimates? In their October 2024 paper entitled “Cut the Chit-Chat: A New Framework for the Application of Generative Language Models for Portfolio Construction”, Francesco Fabozzi and Ionut Florescu present Logit Extraction as a way to replace discrete LLM sentiment labels with continuous sentiment probabilities and apply results to ranking stocks for portfolio construction. Logit Extraction exploits the inner workings of LLMs to quantify sentiment strength. They test it on four LLMs: Mistral, Llama, ChatGPT-3.5 and ChatGPT-4. Their benchmark model is the specialized, quantitative FinBERT. They compare the abilities of each LLM to those of FinBERT in replicating human-assigned sentiment labels and generating long-short portfolio risk-adjusted returns, with and without Logit Extraction. Inputs are initial-release headlines from news alerts covering a single company published from 30 minutes before market open on the previous day to 30 minutes before market open on the day of trading during January 2010 through October 2020. They aggregate headlines published on non-trading days for long-short trading the next trading day. Portfolio trades occur at the open each trading day and are limited to stocks in the news the day before (an average of 46). Using the specified 216,837 news headlines and associated daily returns across 263 unique firms, they find that: Keep Reading
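
To make the idea of continuous sentiment probabilities concrete, here is a minimal sketch of one common way to turn label logits into probabilities and a ranking. It is not the paper's exact procedure, which the summary above does not detail. The logit values, the tickers AAA/BBB/CCC, the function names and the score convention P(positive) minus P(negative) are all assumptions for illustration.

```python
import math

def sentiment_probabilities(logits):
    """Softmax over the logits of the three sentiment labels."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def sentiment_score(logits):
    """Continuous score in [-1, 1]: P(positive) - P(negative) (assumed convention)."""
    probs = sentiment_probabilities(logits)
    return probs["positive"] - probs["negative"]

# Hypothetical label logits for one headline per stock (illustrative numbers only).
headline_logits = {
    "AAA": {"positive": 2.0, "neutral": 0.5, "negative": -1.0},
    "BBB": {"positive": 0.1, "neutral": 0.3, "negative": 0.0},
    "CCC": {"positive": -0.5, "neutral": 0.2, "negative": 1.5},
}

scores = {ticker: sentiment_score(l) for ticker, l in headline_logits.items()}

# Rank stocks from most positive to most negative sentiment score.
ranked = sorted(scores, key=scores.get, reverse=True)
long_leg, short_leg = ranked[0], ranked[-1]
```

A discrete labeler would assign AAA a 1 and CCC a -1 regardless of conviction, while the continuous score preserves how strongly each label is favored, which is what allows ranking among stocks with the same label.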

Performance of Barron’s Annual Top 10 Stocks

Each year in December, Barron’s publishes its list of the best 10 stocks for the next year. Do these picks on average beat the market? To investigate, we scrape the web to find these lists for years 2011 through 2024, calculate the associated calendar year total return for each stock and calculate the average return for the 10 stocks for each year. We use SPDR S&P 500 ETF Trust (SPY) as a benchmark for these averages. We source most stock prices from Yahoo!Finance, but also use Historical Stock Price.com for a few stocks no longer tracked by Yahoo!Finance. Using year-end dividend-adjusted stock prices for the specified stock-years during 2010 through 2024, we find that: Keep Reading
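
The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the tickers PICK1 through PICK3, all prices and the function names are hypothetical, and only three picks are shown rather than ten.

```python
def total_return(start_price, end_price):
    """Calendar-year total return from year-end dividend-adjusted prices."""
    return end_price / start_price - 1.0

def average_pick_return(prices):
    """Equal-weighted average return across the picks for one year."""
    returns = [total_return(start, end) for start, end in prices.values()]
    return sum(returns) / len(returns)

# Hypothetical (prior year-end, current year-end) adjusted prices.
picks = {
    "PICK1": (100.0, 112.0),
    "PICK2": (50.0, 45.0),
    "PICK3": (20.0, 25.0),
}
benchmark_prices = (400.0, 440.0)  # hypothetical benchmark ETF prices

average_return = average_pick_return(picks)
excess_return = average_return - total_return(*benchmark_prices)
```

Using dividend-adjusted prices at both year ends is what makes the simple price ratio a total return; this is also why prices from the end of 2010 are needed to compute returns for 2011.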

Usefulness of AI Chatbots to Individual Investors

Can a generative artificial intelligence (AI) model, such as ChatGPT 4o, materially aid investors in understanding the implications of earnings conference call transcripts? In their December 2024 paper entitled “AI, Investment Decisions, and Inequality”, Alex Kim, David Kim, Maximilian Muhn, Valeri Nikolaev and Eric So conduct two surveys to explore how generative AI shapes investment decision-making based on anonymous earnings conference call transcripts of publicly traded firms. For the first survey, they: (1) divide participants into sophisticated and unsophisticated groups based on responses to initial questions; and, (2) ask ChatGPT 4o to generate one summary for individuals with little financial knowledge and another summary for individuals with college-level financial knowledge and stock investing experience. They then randomly assign each participant to receive raw conference call transcripts (the control), summaries for sophisticated investors or summaries for unsophisticated investors. They next present each participant with summaries for two distinct but similar firms, one at a time and ask each participant to:

  1. Rate on a scale of -5 to 5 the likelihood that firm earnings will decrease or increase next year, and confidence in the estimate on a scale from 0 to 1.
  2. Evaluate on a scale of -5 to 5 the overall sentiment as negative or positive, and confidence in the evaluation on a scale from 0 to 1.
  3. Allocate a hypothetical $1,000 to the two stocks presented or to cash for either one day or one year.
  4. Write a brief rationale for the asset allocation decision.

They record how much time each participant spends on each task.

For the second survey, they provide some participants with an AI chatbot pre-loaded with earnings call transcripts and some with only the raw transcripts (the control). They study interactions of participants with the chatbot and measure subsequent performances on investment tasks.

Their pool of end-of-fiscal-year earnings conference call transcripts spans 2010 through 2022 for 200 NYSE/NASDAQ stocks assigned to 100 economically similar pairs. Using the selected transcripts and associated 1-day and 1-year stock returns, they find that: Keep Reading

LLM Prompt Snooping Bias?

Data snooping bias entails the capture of noise in a dataset that is lucky with respect to a research goal, such as a high Sharpe ratio for an investment/trading strategy. Snooping may involve discovery via multiple tests of a lucky subsample in a time series, a lucky parameter value in a model or a lucky alternative model. Small, noisy samples are especially susceptible to snooping. A researcher may inherit snooping bias by using prior biased research as a starting point for further exploration. In any case, snooped research findings degrade or disappear out of sample.
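
A small simulation illustrates how multiple tests on noise can produce a lucky result. This is a sketch under assumed parameters (200 candidate strategies, 252 trading days, 1% daily volatility, zero true mean return); the function names are hypothetical.

```python
import random
import statistics

def annualized_sharpe(daily_returns):
    """Annualized Sharpe ratio, assuming 252 trading days and a zero risk-free rate."""
    mean = statistics.mean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return mean / stdev * (252 ** 0.5)

def best_of_n_sharpe(n_strategies, n_days, rng):
    """Simulate pure-noise strategies; return the best and the average Sharpe ratio."""
    sharpes = []
    for _ in range(n_strategies):
        # Every strategy has zero true expected return by construction.
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return max(sharpes), statistics.mean(sharpes)

rng = random.Random(0)  # fixed seed for repeatability
best, average = best_of_n_sharpe(200, 252, rng)
```

The average Sharpe ratio across the candidates stays near zero, but the best one looks attractive even though none has any real edge. Reporting only the winner is the snooping.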

There is an emerging body of research in financial markets based on exploitation of large language model (LLM) capabilities. This research entails prompt engineering, wherein a researcher develops instructions for an LLM to achieve a goal. In presenting research based on LLM outputs, the researcher may describe in detail the sequence of prompts used to elicit these outputs. However, the researcher may previously have tried many variations of these prompts to improve LLM outputs with respect to the research goal. To the degree that LLM “thinking” is opaque, the level of bias derived from this prompt tuning (snooping) is mysterious.

In summary, investors should be skeptical regarding LLM-based research findings due to the potential for prompt snooping.

Complete Finance Research by LLMs?

Can large language models (LLMs) create financial research? In their December 2024 paper entitled “AI-Powered (Finance) Scholarship”, Robert Novy-Marx and Mihail Velikov describe a process for automatically generating academic finance papers using LLMs and demonstrate its efficacy by producing hundreds of complete papers on stock return predictability. Specifically, they:

  1. Identify 31,460 potential stock return predictors from accounting variables and their differences.
  2. Screen these potential signals for redundancy, data robustness and stock selection breadth to identify 17,074 candidates for validation.
  3. Validate these signals via decile and quintile portfolio sorts and controls for multiple known stock return factors to select 183 promising (alpha) signals.
  4. Apply a series of anomaly evaluation tools to isolate 96 economically meaningful and statistically reliable signals and generate a standardized report for each that details signal performance, including a comparison to over 200 other known anomalies.
  5. Use state-of-the-art LLMs to generate three conventional academic papers for each effective signal (288 papers in total), including:
    • Creative signal names.
    • Abstract.
    • Introductions with motivation, hypothesis development, results summary and contribution.
    • Data and conclusion.
    • Citations to other relevant papers.

Based on results from this process, they conclude that: Keep Reading

LLMs as Quant Tools

How can investors best apply the available array of large language models (LLMs) in quantitative strategy development? In his December 2024 paper entitled “The LLM Quant Revolution: From ChatGPT to Wall Street”, William Mann summarizes the use of LLMs in quantitative finance, focusing on: the current state of LLM technology in financial applications; comparison of leading models; implementation frameworks for production systems and risk management; and, quality control considerations. He reports results of a comprehensive review of the best LLM to use for each of seven phases of investment strategy development. Based on the body of research on use of LLMs in finance and this comprehensive review, he concludes that: Keep Reading

Machine Learning Model Design Choice Zoo?

Are the human choices in studies that apply machine learning models to forecast stock returns critical to findings? In other words, is there a confounding machine learning design choices zoo? In their November 2024 paper entitled “Design Choices, Machine Learning, and the Cross-section of Stock Returns”, Minghui Chen, Matthias Hanauer and Tobias Kalsbach analyze effects of varying seven key machine learning design choices: (1) machine learning model used, (2) target variable/evaluation metric, (3) target variable transformation (continuous or discrete dummy), (4) whether to use anomaly inputs from pre-publication subperiods or not, (5) whether to compress correlated features, (6) whether to use a rolling or expanding training window and (7) whether to include micro stocks in the training sample. They examine all possible combinations of these choices, resulting in 1,056 machine learning models. For each machine learning model each month, they:

  1. Rank stocks on each of 207 potential return predictors and map rankings into [-1, 1] intervals. In case of missing inputs, they set the ranking value to 0.
  2. Apply rankings to predict a next-month target variable (return in excess of the risk-free rate, market-adjusted return or 1-factor model risk-adjusted return) for each stock with market capitalization above a 20% NYSE threshold during January 1987 through December 2021.
  3. Reform a hedge portfolio that is long (short) the value-weighted tenth, or decile, of stocks with the highest (lowest) predicted target variable and compute next-month portfolio return.

Using monthly data as available for all listed U.S. common stocks during January 1957 through December 2021, they find that: Keep Reading
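
The ranking and portfolio steps listed above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, ties are broken by input order for simplicity and the prediction model itself is omitted.

```python
def rank_to_interval(values):
    """Map cross-sectional ranks into [-1, 1]; missing (None) inputs are set to 0."""
    present = sorted((v, i) for i, v in enumerate(values) if v is not None)
    n = len(present)
    scaled = [0.0] * len(values)  # missing inputs keep the neutral value 0
    for rank, (_, i) in enumerate(present):
        scaled[i] = 2.0 * rank / (n - 1) - 1.0 if n > 1 else 0.0
    return scaled

def long_short_return(predictions, next_returns, market_caps, fraction=0.1):
    """Return of a hedge portfolio long the top and short the bottom fraction
    of stocks by prediction, value-weighted within each leg."""
    order = sorted(range(len(predictions)), key=lambda i: predictions[i])
    k = max(1, int(len(order) * fraction))

    def value_weighted(indices):
        total_cap = sum(market_caps[i] for i in indices)
        return sum(market_caps[i] * next_returns[i] for i in indices) / total_cap

    return value_weighted(order[-k:]) - value_weighted(order[:k])

# Illustrative inputs: one predictor with a missing value, and ten stocks.
scaled = rank_to_interval([3.0, None, 1.0, 2.0])
predictions = [i / 10 for i in range(10)]
next_returns = [0.01 * i for i in range(10)]
market_caps = [1.0] * 10
hedge_return = long_short_return(predictions, next_returns, market_caps)
```

Setting missing rankings to 0 places those stocks at the cross-sectional midpoint, so a missing predictor neither helps nor hurts a stock's predicted rank.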

Review of Effects of GenAI on Firm Values and Finance Research

How should investors think about potential shocks to firm valuations and financial markets research from generative artificial intelligence (GenAI)? In their October 2024 paper entitled “AI and Finance”, Andrea Eisfeldt and Gregor Schubert review the literature on the effects of GenAI on (1) firm valuations and (2) financial research methods. They also offer an introduction to available GenAI research tools and advice on using these tools. Based on the body of research, they conclude that:

Keep Reading

Success Factors for Day Traders?

Despite access to elaborate trading platforms and real-time data, the large majority of speculative traders incur substantial losses (see, for example, the chart below). In his August 2024 paper entitled “The Myth of Profitable Day Trading: What Separates the Winners from the Losers?”, Franklin Gallegos-Erazo identifies factors that distinguish the few successful traders from the many who fail, including risk management, emotional control and strategies employed. Based on results of past studies, he concludes that: Keep Reading

Measuring Professional Investor Decision-making Skill

Is detailed decision-making prowess a better metric than past performance for comparing portfolio managers? In their October 2024 paper entitled “Actions Speak Louder Than (Past) Performance: The Relationship Between Professional Investors’ Decision-Making Skill and Portfolio Returns”, Isaac Kelleher-Unger, Clare Levy and Chris Woodcock examine the link between professional investor decision-making and overall performance for long-only stock portfolios involving at least 80 decisions per year. Specifically, they analyze daily positions for each stock to quantify seven decision outcomes: stock-picking, entry timing, scaling in, size adjusting, weighting, scaling out and exit timing. They then aggregate effects of all decisions at the portfolio level relative to prospectus benchmarks or, where none is stated, to a relevant index. They measure added values of decision types as follows (see the figure below):

  1. Stock picking – positive or negative overall return to the position while owned.
  2. Entry timing – proximity of initial entry price to its low from 21 trading days before through 21 trading days after purchase.
  3. Scaling in – comparison of return to a buy-and-hold strategy at average price of the stock from initial entry to first sell trade.
  4. Adding/trimming/no-trade – comparison of return to buy-and-hold at the median quantity from first sell trade to the first sell trade after the last add trade.
  5. Scaling out – comparison of return to a buy-and-hold strategy at average price of the stock from the first sell trade after the last add trade to the total exit.
  6. Position weighting – comparison of return to that for a hypothetical equal-weighted portfolio.
  7. Exit timing – proximity of final exit price to its high from 21 trading days before through 21 trading days after exit.

They then combine hit rate (fraction of decisions with positive value-add) and payoff ratio (ratio of value-add to value-loss across all decisions) for each investor to compute a Behavioral Alpha (BA) Score, and relate BA Score to current and future portfolio performance. Using proprietary daily holdings of 123 long-only stock portfolios managed by professional investors during 2013 through 2023, they find that:

Keep Reading
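
The two ingredients of the BA Score described above can be sketched as follows. The summary does not state how the authors combine them, so this sketch computes only hit rate and payoff ratio. The value-add figures and function names are hypothetical, and treating payoff ratio as total value-add divided by total value-loss is an assumption.

```python
def hit_rate(value_adds):
    """Fraction of decisions with positive value-add."""
    return sum(1 for v in value_adds if v > 0) / len(value_adds)

def payoff_ratio(value_adds):
    """Total value-add divided by total value-loss across decisions
    (assumed interpretation; infinite if there are no losing decisions)."""
    gains = sum(v for v in value_adds if v > 0)
    losses = -sum(v for v in value_adds if v < 0)
    return gains / losses if losses > 0 else float("inf")

# Hypothetical value-add per decision for one investor.
decisions = [0.02, -0.01, 0.03, -0.02, 0.01]
investor_hit_rate = hit_rate(decisions)
investor_payoff = payoff_ratio(decisions)
```

The two measures capture different things: an investor can be right less than half the time and still add value if gains on good decisions are large relative to losses on bad ones.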
