Jason Kelly has occasionally requested further explanation and reconsideration regarding our evaluation of his forecasting record. His requests and our responses follow:
On 9/2/09, Jason Kelly wrote:
I’d like to request a reconsideration of my May 5 call, shown at your site as, “We’re pulling money off the table…” and recently scored a minus. I seem to have been caught between your time frames. The 5-day, 21-day, and 63-day shown are, respectively, 0.5%, 4.3%, and 10.9%.
Looking at the S&P 500’s chart, however, shows the beginning of May to have been almost precisely the top of the easy part of the March rally. It’s where the strongest upward momentum faded out to be replaced by choppiness all the way to early July, when another surge came. That next surge came a full seven weeks after my call, which is pretty long, and during which the market fell on four occasions below where I said to take out money:
- mid May -2.3%
- late May -1.9%
- late June -1.2%
- early July -2.8%
Given that Jack Schannep’s Feb. 19 call “suggesting a further bailing out of the stock market to just a 33% invested position” was scored a plus for a 21-day return of -1.3% followed by a 63-day 16.6% and a 126-day 27.1%, it seems that the more dangerous times following my suggestion to take money off the table also warrant a plus for the call. There were, after all, four good opportunities to put it back to work at lower prices.
Our response:
The following chart depicts the calls in question. The red circle denotes your 5/5/09 comment “We’re pulling money off the table.” The blue circle denotes Jack Schannep’s “suggesting a further bailing out of the stock market to just a 33% invested position…” During the seven-week interval you cite, there are several occasions above where you said to take money out:
- early May +2.8%
- early to mid June +3.9% to +4.7% over a two-week period
- late June +2.6%
The seven-week interval is modestly choppy with greater upside volatility than downside volatility. A more precise comment, such as “We’re pulling money off the table to re-enter on any downside move of at least 2%…” would have been convincing. A recommendation to exit in early June and re-enter in early July would have been impressive. Based on the data, the grade for the 5/5/09 comment stands.
In general, forecast grading includes looking at the record of daily index closes for confirmation of the set-interval performance. Hence the consideration given to Jack Schannep’s exit call in front of a double-digit drop during late February. This grade is a difficult one, indicative of limitations of the Guru Grades approach, because there is no public record indicating whether he exploited this timely exit by re-entering at a lower point.
In general, the combination of forecast vagueness and stock market volatility makes grading especially difficult.
On 3/17/09, Jason Kelly wrote:
I noticed that your 5-day performance after my 2/24/09 call for a rally tracks from the close of the 24th. That’s incorrect because my article was published on the 24th Japan time, which was the 23rd U.S. time. The call tracked from the close of the 23rd shows a 5-day result of -5.7% from S&P 500 743 to 701, instead of the current -9.9% from 773 to 696.
Our response:
The timestamp on your 2/24/09 commentary is 7:38 PM Japan time, corresponding to 13 or 14 hours earlier for the U.S. east coast (early morning ET on 2/24/09).
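The timestamp conversion above can be checked with a short sketch. It assumes fixed offsets (Japan Standard Time at UTC+9 and U.S. Eastern Standard Time at UTC-5, which applies in February); the function name is hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for this sketch: Japan Standard Time is UTC+9,
# and U.S. Eastern Standard Time (in effect in February) is UTC-5.
JST = timezone(timedelta(hours=9), "JST")
EST = timezone(timedelta(hours=-5), "EST")


def japan_to_us_eastern(published_jst: datetime) -> datetime:
    """Convert a naive Japan-time timestamp to U.S. Eastern Standard Time."""
    return published_jst.replace(tzinfo=JST).astimezone(EST)


published = datetime(2009, 2, 24, 19, 38)  # 7:38 PM Japan time on 2/24/09
eastern = japan_to_us_eastern(published)
print(eastern.strftime("%Y-%m-%d %H:%M %Z"))  # 2009-02-24 05:38 EST
```

Under these assumptions the commentary falls on 2/24/09 in U.S. Eastern time, before the market open.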
Guru Grades uses close-to-close levels of the S&P 500 index throughout for intervals after the publication date (for your case “over the 5, 21, 63 and 254 trading days after the publication date for each item.”)
There is some looseness in this methodology because it does not take into account the publication timestamp of a commentary and the intraday (or opening) level of the index for that timestamp. The extreme case of looseness would be the different treatment of two forecasts published just before and just after midnight (ET). In fact, many forecasters do not use timestamps. The grading of individual forecasts that focus on the short term takes into consideration this looseness in methodology.
Revising the stated methodology to evaluate based on timestamp as well as date would require a large effort to review retroactively all those sources using timestamps. Such a revision would probably not have any effect on grading of forecasts.
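As a rough illustration of the close-to-close arithmetic discussed in this exchange, the following sketch recomputes the two 5-day figures from the rounded index levels quoted above. The function name is hypothetical, and small differences from the quoted percentages would be expected because the quoted levels are rounded.

```python
def close_to_close_return(start_close: float, end_close: float) -> float:
    """Percent change from a starting close to an ending close."""
    return (end_close / start_close - 1.0) * 100.0


# Rounded S&P 500 levels quoted in the request above.
from_2_23 = close_to_close_return(743, 701)  # tracked from the 2/23/09 close
from_2_24 = close_to_close_return(773, 696)  # tracked from the 2/24/09 close

print(round(from_2_23, 1))  # -5.7
print(round(from_2_24, 1))  # -10.0
```

The gap of roughly four percentage points between the two results shows how much a one-day shift in the starting close can matter for a short interval during a volatile period.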
On 3/19/09, Jason Kelly wrote:
For this one case, then, would you mind changing the date of my commentary to 2/23 to properly track it? If anybody inquires, which I doubt will happen, it’s easy to explain why the date is different. It’s also honest. The commentary was issued on that date partly in reaction to the big drop on that Monday.
Our response:
As noted in the prior response, you issued your commentary on 2/24/09 ET, so it would not be honest to present it with a 2/23/09 date. Tracking S&P 500 index returns after this forecast from the close on 2/23/09 would conflict with the stated methodology of Guru Grades. As also previously noted, the grading of individual forecasts that focus on the short term takes into consideration the looseness in the timestamp-free methodology. The more specific the forecast is (e.g., “the stock market will rise dramatically on 2/24/09”), the smaller the role of judgment in grading the forecast.
On 5/15/08, Jason Kelly wrote:
I was surprised to see my Feb. 6 comment rated a minus on your site.
Here’s the excerpt you show: “I’m expecting lower prices before we see higher…”
The article mentioned that I’d thought the rise from January was a bear market bounce, that it was, and that we were still in a down market. Your tracking shows a 21-day return of -2.5%. Here’s a more representative excerpt from the article:
“The Kelly Letter has been setting limit prices low to catch bargains on spikes down, and we still have buying to do. That means I’m expecting lower prices before we see higher, but I’m not worried that we’ll never see higher. Know what you want to buy, set a cheap price, and let the market wander its path.”
At the March lows, the market had fallen 4% from the time of the comment. The Kelly Letter added positions to the portfolio at prices much lower than they were on Feb. 6. The call to hold back for better prices, reiterated in the Mar. 7 commentary, was prescient. Moreover, a call to buy in February and early March should be seen as a mistake ahead of the second big dip so far this year. Conversely, my calls to hold back were helpful.
Would you consider changing the Feb. 6 minus to a plus?
Our response:
We reviewed your 2/6/08 commentary in the following context, which is more analog (continuous) than the set intervals presented in the summary table:
The market went higher before it went lower, contrary to your wrap-up statement. However, it was correct to wait for lower for more buying. We changed the grade to a “0”, meaning right and wrong elements in the forecast.
On 5/4/08, Jason Kelly wrote:
I feel that the excerpt you show for my Mar. 16 commentary doesn’t reflect the article. It was a call to buy, not sit on the sidelines. But the excerpt you show is the part where I explain why I wouldn’t go all in yet because I was waiting for some lower prices. Nonetheless, the article was a qualified call to buy, not a qualified call to sit on the sidelines.
Here’s what you currently show:
“I would feel much more comfortable buying now if we’d seen a selling climax to an even lower level on huge volume, extreme readings on the MACD and relative strength, and a few more big bank failings. It seems that all of that could lie dead ahead, and that the difference between buying now and buying in a couple of weeks could be significant.”
Here’s what I consider to be the main message:
“… the ratio of the VIX to the yield on the 10-Year Treasury Note “is currently at levels seen only during extreme crisis or panic market environments.”
SentimenTrader examined other times when the ratio has been at current levels and found that the S&P 500 rose an average of 2.0% in the next 5 days, 3.7% in the next 10 days, 4.2% in the next month, and 10.6% in the next 3 months.
The percentage of times that each time frame registered a gain instead of a loss was impressive, too: 85% of 5-day periods, 92% of 10-day periods, 77% of 1-month periods, and 92% of 3-month periods.
Here’s what all that tells me:
> Don’t short, don’t hedge.
> Don’t sell what you already own.
> Look to buy during this time.”
From the above, would you consider replacing your current excerpt with the part from “Here’s what that tells me”?
Our response:
We have expanded the excerpt to include the “Here’s what that tells me” bullets plus the original excerpt and additional context.
There is equivocation in the commentary. You qualify “Look to buy during this time” with a cautionary and more specific “the difference between buying now and buying in a couple of weeks could be significant.” In other words, “During this time” appears to encompass at least a couple of weeks delay for assurance against more Bear Stearns-like episodes.
The most cogent statements in terms of reader actions are: (1) “I can’t say to just wade right in and buy with all you’ve got…” and (2) “Buy in thirds if you’re nervous, in halves if you’re not.” Put differently, the latter advice says keep two-thirds or half your cash in reserve for a better buying opportunity.
On 4/18/08, Jason Kelly wrote:
I was surprised to see my 1/7/08 forecast given a minus on your Guru Grades page. It is a short excerpt from my 2008 forecast that reads:
“The market is in for more volatility until the Fed’s rate cuts soak into the economy. The good news is that the Fed started cutting four months ago. The bad news is that it hasn’t cut enough yet. It will catch up. I expect that we’ll see some fine progress this first quarter.
Stocks are reasonably priced. The earnings yield of the S&P 500 is 40% higher than the yield on the 10-Year Treasury. In all six other periods when stocks were this cheap compared to bonds, the market was higher one year later.”
I said we would have more volatility until rate cuts soak in. That’s been true. I said we’d make fine progress on finally getting the rate cuts we need in the first quarter. The Fed funds rate was lowered 2% in Q1. That’s fine progress. Finally, as you can see from the second paragraph, the excerpt was part of a bigger look at where we should expect stocks to be in the longer term. While the fine progress in Q1 was about interest rates, the benefit to stock investors wasn’t expected to be immediate.
Would you consider expanding the excerpt on your site, and awarding it a plus?
Our response:
After re-reading the commentary, we accept the interpretation that “fine progress” is about the Federal Funds Rate rather than the stock market. We do not grade stock market gurus on predictions for the Federal Funds Rate or other economic indicators.
We have replaced the forecast excerpt with the second paragraph as the principal stock market forecast and will reserve judgment until a year from the commentary date.
On 10/1/07, Jason Kelly requested:
…you missed a couple of the bullish notes posted over the past few weeks…I’d like to get the following tracked…
From 9/25 – The End Is (Not) Near
“…I continue to believe that we’re not entering a recession, that the housing “meltdown” just means a great time to be buying a housing stock, and that any weakness in October is another chance for those slow on the uptake to get their money into the market.”
From 9/19 – Did You Miss The Rising Market?
“What I would understand as quickly as possible, however, is that the market is poised for a solid performance in the medium term. If you’re still stuck on last month’s headlines about sub-prime and shaky credit markets, you’re looking in the wrong direction on your calendar. Flip forward, not back. It won’t be long until this silly little correction isn’t even talked about, and it won’t rate anywhere near the top of the issues successfully faced down by the stock market.”
Finally, I would like to point out some fine-line walking needed to see the correctness of one call still ungraded. On 8/29, you quote me thusly: “We’re getting the sale I expected…but it’s not over yet. …the wait goes on.” Then, the 5-day and 21-day result columns show +1.0% and +4.3% respectively. At face value, it looks as if the call was wrong. However, it was right. Notice that on 9/10, I switched from bearish to bullish. From 8/29 to 9/10, the S&P 500 fell 0.82% from 1463.76 to 1451.70. At first blush, this seems trivial, but the main takeaway is that I was searching for the bottom for subscribers, and I found it by waiting that slight extra time hinted at in the 8/29 call. Note that 9/10 was indeed the market bottom in that condensed time frame. From there, it’s gone only up, making the 8/29 call to wait a little longer coupled with the 9/10 call to dive in a near-perfect one-two punch. Because the success of the 8/29 call falls between the pre-set time frames of your system, I’m afraid it may appear to have been a mistake to those looking only at the performance columns.
Our response:
Guru Grades has not reacted to some of your recent articles because of a methodological problem that arises when forecast sampling frequency exceeds forecast testing frequency. When the sampling frequency is larger, the testing intervals overlap and are therefore not independent. In other words, consecutive forecasts use the same test data over and over. This problem makes the sample look larger than warranted by the forecast content and skews the overall accuracy rate toward (away from) periods when an expert issues forecasts at an unusually high (low) frequency. The risk of skew is greatest when an expert’s forecasting frequency varies considerably, as does yours.
The availability of new information that might make an expert adjust the current forecast is a counterbalance to this frequency mismatch problem. An expert might write frequently during a period when new information is causing changes in the forecast.
The minimum Guru Grades test interval is five trading days, partly based on the assumption that market conditions do not shift fast enough to warrant forecast changes at any higher frequency and partly based on the publishing frequencies of the more prolific experts. When experts increase forecast frequencies to greater than once a week, based on the above reasoning, Guru Grades generally reverts to sampling rather than testing every forecast. Exceptions are possible based on forecast content.
Recently you have been publishing articles more frequently than once a week, mostly reiterating a positive intermediate-term outlook. Based on the above considerations, Guru Grades uses the articles most focused on the stock market at a frequency not exceeding once a week. Given an intermediate-term forecast (three to six months?), even a weekly sampling frequency may be inappropriately high. We are therefore declining your request to include additional forecast commentary.
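The overlap problem and the once-a-week sampling described above can be sketched as follows. This is a minimal illustration, not the Guru Grades procedure: the function names and publication days are hypothetical, and keeping the earliest forecast in each window is an assumption made here for simplicity (the actual selection described above is based on article content).

```python
def overlapping_pairs(forecast_days, horizon=5):
    """Count consecutive forecasts whose `horizon`-day test windows overlap.

    `forecast_days` holds trading-day indices of publication dates.
    """
    days = sorted(forecast_days)
    return sum(1 for a, b in zip(days, days[1:]) if b - a < horizon)


def sample_forecasts(forecast_days, min_gap=5):
    """Keep forecasts spaced at least `min_gap` trading days apart.

    Assumption: the earliest forecast in each window is the one kept.
    """
    kept = []
    for day in sorted(forecast_days):
        if not kept or day - kept[-1] >= min_gap:
            kept.append(day)
    return kept


days = [0, 2, 3, 7, 8, 15]  # hypothetical publication days (trading-day index)
sampled = sample_forecasts(days)

print(overlapping_pairs(days))     # 4
print(sampled)                     # [0, 7, 15]
print(overlapping_pairs(sampled))  # 0
```

In this toy series, four of the five consecutive pairs share test data before sampling, and none do afterward.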
Regarding your concern that your 8/29/07 commentary might be misperceived or misgraded based on the Guru Grades methodology… When forecast commentary relates to a near-term turning point, we check the daily performance of the S&P 500 index over subsequent days to judge the accuracy of the forecast. Any readers of cxoadvisory.com motivated enough to review detailed forecasts would probably take the same approach.
Note that the degree of judgment required in sampling forecast commentary at Guru Grades declines with both the regularity and specificity of the commentary. Also, the degree of judgment required in measuring the accuracy of a given forecast declines with the specificity of that forecast.
On 11/25/06, Jason Kelly requested that we reconsider the grade for a specific item in his record:
A reader pointed out to me that I received a minus for my 2/6/2005 forecast “…keep the bulk of your money in the reliable Dow. You’ll do fine.” Since then, the Dow has gained 16.5%, and all but the 63-day return on your scorecard are positive. On that evidence, would you consider changing that one to a plus?
Our response:
The excerpt your reader cites is admittedly vague with respect to benchmark (“fine”) and timing (“will be”) and therefore difficult to grade. We revisited your entire 2/6/05 commentary.
For context, its title is “The Bull Is Back,” and in the last paragraph you note: “I hope you’re enjoying the dramatic change of sentiment in the market as much as I am. Remember that stuff I said at the beginning of this article about the bad news needing to be brought up? Forget it.” This context seems fairly bullish and at least an implicit invitation to commit funds to stocks.
The following chart compares the cumulative return of the S&P 500 index (the DJIA would be very similar) and T-bills (representing the risk-free rate) from 2/7/05-2/6/06. It shows that much of 2005 offered no risk premium for stocks. Returns across March and April from the 2/7/05 starting point were especially poor.
Given the overall bullish context of the commentary and the performance of the market over the ensuing months, we are not inclined to change the grade but will expand the excerpt to capture additional context.
Over long enough horizons, the excerpt cited by your reader is almost a truism. Buy-and-hold of any reasonably diversified set of stocks has worked pretty well as an investment strategy, regardless of daily, weekly, monthly or yearly fluctuations. One purpose of Guru Grades is to measure whether experts (and ordinary investors by listening to them) can beat a buy-and-hold strategy.
Your reader is right to question the judgment of any source of investing information. We post full grading data so that readers can decide for themselves whether the grading judgments are correct and useful, find and read full commentaries, challenge the grading as your reader has, and ultimately adjust the grading according to their own judgments.
On 10/31/06, Jason Kelly requested that we consider adding two items to his record:
I’d like to get some of the bullishness from two of my recent articles (from 9/18/06 and 10/14/06) onto my as-yet ungraded list. I was calling for a 4th quarter rally from the beginning of the year, and got back to it in late-September. The market has done well since then, of course, and I’d like to get some pluses.
Would you mind re-reading them to see if you can find some actionable tidbits worth putting in my record?
Our response:
We re-read your 9/18/06 entry and included an excerpt in your commentary summary.
While the tone of your 10/14/06 entry is upbeat, it contains no overall market forecast. It opens with a retrospective, continues with a detailed retrospective and concludes with an outlook for the technology sector. We do not generally track sector forecasts.
On 5/25/06, Jason Kelly requested reconsideration of the grades on several items in his record:
My own tracking of my U.S. market calls based on your system…turned up mostly positive results. In your language, most looked to have been “essentially right” and therefore deserving of the plus mark.
Here’s what I have:
12/19: “You don’t have to worry about the stock market. Ignore the naysayers, everything is fine.” S&P 500 on 12/19 was 1260: 5-day -0.2%…21-day +0.1%…63-day +3.6% [+]
1/16: “It does seem prudent…to curb our enthusiasm for this great start to the new year…” S&P 500 on 1/13 (market closed on 1/16 for MLK day) was 1287.6: 5-day -1.9%…21-day -0.9%…63-day -0.2% [+]
2/26: “The problem is that street estimates for the second half of the year, while lower than late last year, are still not low enough. That leaves the market ripe for downside earnings surprises, and sinking stock prices. We’re not there, yet, and nothing terrible is imminent. I want to be clear that I’m not forecasting a crash next month.” S&P 500 on 2/24 was 1289.4: 5-day -0.2%…21-day +1%…March +0.3% [+]
4/10: “More Good Times Ahead” S&P 500 on 4/10 was 1296.6: 5-day +0.8%…21-day +2% [+]
5/1: “More Upside” S&P 500 on 5/1 was 1305.2: 5-day +1.5% [+] NOTE: it’s probably too early to grade this.
5/8: “There seems to be no need yet to rush for the exits ahead of a summer correction.” S&P 500 on 5/8 was 1324.7: 5-day -2.3% [-] NOTE: it’s probably too early to grade this
From the forecasts with enough time following them to provide at least the 5-day and 21-day results, I get four plus marks. If I expand to include the two most recent with only 5-day results, I get five plus marks and one minus mark.
In either scenario, my percentage should have increased from the January report. However, it dropped a percentage point.
I consider most of these to be relatively unimportant, small market comments. My main forecast for this year, which comes up repeatedly in my articles so far, is for a big correction in the August/September time frame, followed by a strong end to the year and a good beginning to 2007. Of course, it’s too soon to know how that will go.
Nonetheless, my smaller forecasts look to me to have been on-target.
Our response:
Here are your grades since the first of the year:
The recent change in your accuracy rate came from the grades for 3/26/06 and 5/8/06. The 3/26/06 item is somewhat vague, but essentially right. We would not have graded the 5/8/06 item until later except for your comment about “…no need to rush for the exits…”
The ungraded forecasts depend mostly on your 6-9 month outlook, so they remain ungraded. The 2/4/06 and 3/12/06 items are difficult to grade. The 3/12/06 item may remain ungraded; it is not really a forecast. The 2/4/06 item may also remain ungraded.
On 3/2/06, Jason Kelly wrote:
Thank you for your continued monitoring of my performance, along with other gurus that you follow on your site.
I noticed in the latest update that my accuracy dropped to 70%, putting me at #2 on the list. I checked my report card on your site and found no changes there to account for the lowered percentage. Further, my own tracking shows that I should have improved based on getting Japan right and mid-February’s sinking U.S. market right as well.
Please let me know the reason for the revised percentage. If there’s a mistake, I’d like to fix it. Accurate forecasting is important to me, of course, and I want to be sure that the listing accurately reflects my record.
Our response:
Readers may focus just on the table at Guru Grades, but there is also qualifying discussion there, including:
1st paragraph: The results encompass “reviews of the public U.S. stock market forecasts of various investing/trading experts.” Many experts forecast specific stocks, international markets and other asset classes, but we cannot track all these markets and prefer to focus on U.S. stock market forecasts for comparability.
2nd paragraph: “The statistics here are therefore more current than those available in past blog entries that assess the forecasting records of individual market experts…” I sweep through the forecasts of all experts tracked and do scoring updates about once a week. These updates may shift the accuracy rate for an expert by 0-2% depending on accumulated sample size. My sense is that readers would not be very interested in scores of updates each month to past blog entries, but I will consider whether I could do “silent” updates more often.
Third bullet under the table: “We make a few exceptions for commentaries that include no market direction forecast but do offer some other significant and easily testable market-related prediction or recommendation.” We limit these exceptions to specific forecasts from individuals for whom sample size is very small. Such exceptions dampen comparability among experts, and it is time-consuming to assess forecasts for specific stocks, markets and asset classes not routinely tracked.
Since you post weekly and often comment on the near-term future direction of the U.S. stock market, we update your accuracy rate perhaps once every two weeks. Your sample size is modest (about 50), so each new + or – affects your overall accuracy rate by 1-2%. Here are the forecast excerpts from your commentary since the beginning of 2006:
Scoring your 1/16 and 1/23 comments arguably involves judgment. We take the perspective of a somewhat nervous investor/trader looking for signals to increase/decrease equity exposure and assess whether a comment would make such an individual lean the right way or the wrong way.
We will not score some of your most recent forecast comments until later in the year based on the time frames you specify in them.
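The sensitivity of an accuracy rate to one new grade, noted above for a sample of about 50, can be illustrated with a short calculation. The counts below are hypothetical (35 pluses and 15 minuses, giving 70%), and the sketch assumes that only + and – grades enter the rate.

```python
def accuracy(pluses: int, minuses: int) -> float:
    """Accuracy rate in percent, counting only + and - grades (an assumption)."""
    return 100.0 * pluses / (pluses + minuses)


pluses, minuses = 35, 15  # hypothetical: 70% on a sample of 50
base = accuracy(pluses, minuses)
after_plus = accuracy(pluses + 1, minuses)
after_minus = accuracy(pluses, minuses + 1)

print(round(base, 1), round(after_plus - base, 1), round(after_minus - base, 1))
# 70.0 0.6 -1.4
```

Under these assumptions a single new grade moves the rate by roughly one percentage point, broadly consistent with the 1-2% cited above, and the shift shrinks as the sample grows.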
Jason Kelly follows up:
Thank you for the thorough reply. I’ll leave the judgment calls up to you. I would like to point out, though, that my 1/23/06 forecast was in regards to my specific portfolio holdings and as such was correct. For instance, Intel lowered guidance and dropped, as did McAfee and Dell. All three fell just as I suggested and we did indeed buy more shares at the lower prices. Would you consider changing that – to a +?
I see that the focus on the U.S. market leaves my Japan calls out of the results. That’s a bit disappointing as my recent calls on Japan have been spot-on. Still, I understand that your time is limited and you must focus on just the U.S. market.
I’m learning from this experience. I’ll be more clear in my future postings on the web site. One point of ambiguity for an outside auditor is that my public postings are copied from notes to subscribers. Because I don’t include specific holdings in the public postings, commentary on our portfolio can sometimes look like wider market commentary. To wit, the 1/23/06 post confusion. I don’t show holdings such as Intel, McAfee, and Dell publicly because I want people to subscribe to get that level of information. So, from here on, I’ll strive for better indicators of when I’m talking about the broad market and when I’m referring to Kelly Letter holdings and/or targets.
I’ll keep at it and, as always, strive to improve my performance. I’d like to get back on top. I’m shooting to achieve an 80% accuracy.
Our follow-up response:
The overall sentiment is negative in your 1/23/06 commentary. Instead of using the excerpt from the paragraph discussing your specific holdings, we will switch to one from the more general preceding paragraph, which states: “Now the reports are coming in, they’re showing an earnings slowdown, and prices are reacting predictably.” Your general reasoning on earnings leads to your expectation of lower stock prices soon. Also, as noted in the introductory discussion at Guru Grades, the methodology restricts reviews to publicly available material; it would be inconsistent to use information that is not available to a member of the public shopping for an expert.
As always, we invite readers to make their own judgments.
We appreciate Jason Kelly’s taking the time to monitor Guru Grades and his way of expressing any disagreement professionally.