Saturday, June 8, 2019

The most important chart for long-term investors

Time is the investor's best friend. The longer the investment horizon, the less investment returns depend on factors such as crashes and current valuation levels. Historically, the chance of losing money in the stock market over a 20-year period has been close to zero. This post expands on that fact and looks at how risky the U.S. stock market has actually been for long-term investors.

As usual, we'll use data from Robert Shiller to answer the questions. The data begins in 1871, long before the actual S&P 500 index was created. We'll only consider lump-sum investing, since dollar cost averaging is another story.

Let's first look at the inflation-adjusted returns for a U.S. investor, including reinvested dividends. Keep in mind that the U.S. stock market has been one of the best performing in the world, and future returns are likely to be lower because of high valuations and lower productivity and population growth. The upper and lower bands are the 95 percent prediction intervals, i.e. 95 percent of the time the investment return has been between these bands. The y-axis shows how many times your investment would have been multiplied. Notice that the axis is logarithmic.
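For readers who want to experiment with the idea, here is a minimal Python sketch of how such bands could be computed. It is not the R code behind the chart: it uses simple empirical quantiles rather than whatever interval method the original analysis used, the function names are made up for illustration, and the index is a synthetic random series standing in for a real total-return index.

```python
import random
import statistics

def horizon_multiples(total_return_index, years, periods_per_year=12):
    """Growth multiple of a lump sum held for `years`, for every possible start date."""
    step = years * periods_per_year
    return [
        total_return_index[i + step] / total_return_index[i]
        for i in range(len(total_return_index) - step)
    ]

def empirical_band(values, lower=0.025, upper=0.975):
    """Lower quantile, median and upper quantile of the observed outcomes."""
    ordered = sorted(values)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], statistics.median(ordered), ordered[hi_idx]

# Synthetic stand-in for a monthly real total-return index (NOT Shiller's data).
random.seed(1)
index = [1.0]
for _ in range(12 * 140):
    index.append(index[-1] * (1 + random.gauss(0.005, 0.04)))

for years in (1, 10, 20):
    low, median, high = empirical_band(horizon_multiples(index, years))
    print(f"{years:>2} years: {low:.2f}x .. {median:.2f}x .. {high:.2f}x")
```

Because every start month is used, the holding periods overlap heavily, so the bands describe what has happened historically rather than independent draws.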

This chart demonstrates how uncertain investing is. The range of outcomes is very large, but it doesn't necessarily tell the full truth. France once had a 66-year period where stocks didn't beat inflation; for Italy the longest streak was 73 years, and for Austria a painful 97 years. This is why global diversification is important. There has, however, been a 5 percent chance that the investment would have increased 64-fold in the U.S. for the same period. The risk works both ways.

Let's also look at the nominal, non-inflation-adjusted returns to see how inflation eats returns:

Inflation in the U.S. has been quite high, over three percent annually. Inflation of course affects different companies in different ways, but the net effect is that lower inflation does not necessarily lead to higher inflation-adjusted returns.
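To get a feel for how much compounding inflation takes away, a nominal growth multiple can be deflated by the cumulative rise in prices. The snippet below is only a back-of-the-envelope sketch: it assumes a constant inflation rate, and the 10-fold and 30-year figures are hypothetical, not taken from the charts.

```python
def real_multiple(nominal_multiple, annual_inflation, years):
    """Deflate a nominal growth multiple by compounded (constant) inflation."""
    return nominal_multiple / (1 + annual_inflation) ** years

# Hypothetical example: money that grows 10-fold in nominal terms over 30 years
# while prices rise 3 percent a year.
print(round(real_multiple(10.0, 0.03, 30), 2))
```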

Be sure to follow me on Twitter for updates about new blog posts!

The R code used in the analysis can be found here.

The original post had a problem with the calculation of dividends. The charts and code have now been updated, and the true returns were higher than in the original post. Sorry for the inconvenience.

Tuesday, January 29, 2019

Correlation analysis of cyclically adjusted valuation measures and subsequent returns

In this post we'll test three different cyclically adjusted valuation measures: CAPE (earnings), CAPD (dividends) and CAPB (book value). CAPE is calculated like the P/E ratio, but by dividing the current real price by the average of the last ten years' inflation-adjusted earnings. CAPD uses dividends instead of earnings, and CAPB uses book value. We'll test the optimal measurement (i.e. forward-looking return) period and formation period for all three valuation measures by calculating correlations with the future returns.
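The definition above translates into a few lines of code. This is a minimal Python sketch, not the R code used for the post; the function name and arguments are illustrative, and it assumes that both series are already inflation-adjusted and that the current period is included in the averaging window.

```python
def cyclically_adjusted_ratio(real_prices, real_fundamentals,
                              formation_years=10, periods_per_year=12):
    """Real price divided by the trailing average of a real fundamental.

    Pass earnings for CAPE, dividends for CAPD or book value for CAPB.
    Entries without a full formation window are returned as None.
    """
    window = formation_years * periods_per_year
    ratios = [None] * len(real_prices)
    for t in range(window - 1, len(real_prices)):
        average = sum(real_fundamentals[t - window + 1 : t + 1]) / window
        # A non-positive average would make the ratio meaningless.
        ratios[t] = real_prices[t] / average if average > 0 else None
    return ratios

# Toy annual data: constant price 100 and constant earnings 5 give a ratio of 20.
print(cyclically_adjusted_ratio([100] * 5, [5] * 5,
                                formation_years=2, periods_per_year=1))
```

Changing `formation_years` from 1 to 30 gives the P/E1 to P/E30 family discussed in this post.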

Typically CAPE, also known as P/E10, is calculated by using a 10-year formation period. The maximum length we'll use for both the formation period and the measurement period is 30 years, which means that the performance of, for example, P/E1-P/E30 will be tested by looking forward 1-30 years.

We'll use Shiller and Goyal data from the US, both of which begin in 1871. We'll plot the measurement period (in years) on the x-axis and r-squared on the y-axis, and we'll draw a distinct line for each of the formation periods.
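As a rough illustration of what one line in such a plot represents, the Python sketch below computes the r-squared between a valuation series and the subsequent annualized return for several measurement periods. The helper names and the toy data are assumptions made for this example; the actual analysis was done in R on the Shiller and Goyal data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def forward_annualized_returns(index, years, periods_per_year=1):
    """Annualized return over the next `years`, for every start date that allows it."""
    step = years * periods_per_year
    return [(index[t + step] / index[t]) ** (1 / years) - 1
            for t in range(len(index) - step)]

def r_squared_by_measurement_period(valuations, index, max_years, periods_per_year=1):
    """r-squared between valuation at time t and the return over the following years."""
    result = {}
    for years in range(1, max_years + 1):
        forward = forward_annualized_returns(index, years, periods_per_year)
        pairs = [(v, r) for v, r in zip(valuations, forward) if v is not None]
        xs, ys = zip(*pairs)
        result[years] = pearson(xs, ys) ** 2
    return result

# Toy annual data in which next year's return is an exact linear function of valuation.
valuations = [10, 15, 20, 25, 30, 35, 40, 45]
index = [1.0]
for v in valuations:
    index.append(index[-1] * (1 + 0.20 - 0.004 * v))

print(r_squared_by_measurement_period(valuations, index, max_years=3))
```

Running this once per formation period would give the full family of lines, one r-squared curve per valuation measure variant.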

As you can see from the plot above, a measurement period of about ten years, or maybe a little more, has worked the best for CAPE. As expected, valuation measures don't do a good job explaining short-term returns. However, this also applies to long-term returns, which gives the lines a bell-curved shape. The lines with shorter formation periods are lower than the rest, which means that short-term valuation measures such as normal trailing twelve-month P/E also don't work as well as the long-term valuation measures.

For CAPD, the correlation turns positive at long measurement periods, which is rather unwanted. The better performance of the long-term and the worse performance of the short-term valuation measures are more apparent with CAPD.

For CAPB, longer measurement periods of about twenty years seem to work the best. The r-squared is much larger than with CAPE or CAPD. Even the worst formation periods seem to work better in explaining future returns than the CAPE with the best formation period. This is consistent with Keimling's research (pdf, page 16), which suggests that normal P/B is almost as strong in predicting future returns as CAPE. The plot above shows that the cyclically-adjusted P/B is even stronger than CAPE in predicting future returns.

The r-squared of the CAPE is lower than what is often quoted because of the long time period of the data. As you can see from the plot below, the rolling 10-year correlation of CAPE and subsequent returns has been rising over time.

Another way of viewing these correlations is bringing them into Excel and color coding them. Notice that we are now using simple correlations instead of the r-squared. The x-axis shows the formation period of the valuation measure, and the y-axis shows the measurement period, i.e. how far into the future the valuation measure is used to predict.

The 10-year CAPE has surprisingly high explanatory power even for forecasting 1-year periods. The explanatory power starts declining noticeably from P/E8 to the left and P/E14 to the right.

For CAPD, the correlations are weaker, but the shape is about the same.

CAPB has the strongest correlations with future returns, but the shape is very different. Interestingly, the strongest explanatory power regarding future returns comes from the 22-24-year P/B and a measurement period of 5-15 years.

This post was partly inspired by the O’Shaughnessy Quarterly Investor Letter Q4 2018.

The R code used in the analysis can be found here.

Sunday, October 28, 2018

How quickly do stock market valuations revert back to their means?

Mean reversion is the assumption that things tend to revert back to their means in the long run. This is especially true for valuations and certain macroeconomic variables, but not so much for stock prices themselves. In this post we'll look at the mean reversion of different valuation measures by forming equal-sized baskets from each valuation decile and following how the valuations of each basket change as time goes on.
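The basket approach can be sketched roughly as follows. This is a simplified, single-series Python illustration with made-up names, not the R code behind the charts: every start date is sorted into a basket by its valuation at that time, and the average valuation of each basket is then followed forward.

```python
def decile_paths(valuations, horizon, n_baskets=10):
    """Average valuation of each starting basket 0..horizon periods after formation.

    Basket 0 holds the start dates with the lowest valuations, the last basket
    the highest. Assumes there are at least `n_baskets` usable start dates.
    """
    starts = range(len(valuations) - horizon)
    ranked = sorted(starts, key=lambda t: valuations[t])
    size = len(ranked) / n_baskets
    paths = []
    for b in range(n_baskets):
        members = ranked[int(b * size) : int((b + 1) * size)]
        paths.append([
            sum(valuations[t + k] for t in members) / len(members)
            for k in range(horizon + 1)
        ])
    return paths

# Tiny toy series with two baskets, followed one period forward.
for path in decile_paths([1, 2, 3, 4], horizon=1, n_baskets=2):
    print(path)
```

With several countries, the same idea would apply to the pooled country-date observations instead of a single series.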

This study (pdf) shows an interesting graph on page 23 about the mean reversion of the 10-year price-to-earnings ratio, also known as CAPE. In this post I'll replicate that study over a longer time frame of twenty years and extend it to international CAPE, P/E and P/B. Let's start with CAPE using Shiller data for the US stock market from 1926 to 2008:

With a longer time frame, over-reversion becomes visible, i.e. high valuations tend to eventually lead to low valuations and vice versa. The only exception is the decile with the highest valuation, which is explained by the housing bubble after the tech bubble. The valuations seem to revert back to their means in 11-12 years.

Let's look at the mean reversion of the same metric using Barclays data from 1982 to 2008 for 26 different countries or continents:

The mean reversion again happens in about 12 years, but the over-reversion seems to disappear. This might be caused by the US having different kinds of bubbles and busts than the rest of the world, or by the shorter time period. The dataset is many times larger and should give a clearer picture of the mean reversion than using only US data.

Next, we'll look at price-to-book:

It seems to take longer for the P/B to revert back to its mean, which is logical since CAPE uses historical 10-year earnings. There is, however, still some noticeable over-reversion.

Let's look at the price-to-earnings ratio next:

The P/E ratio seems to revert back to its mean a little quicker than the rest, in about 9-10 years. There is still some over-reversion.

In summary, different valuation measures tend to revert back to their means in about ten years, and over-revert after that.

Hope you enjoyed this short post.

The R code used in the analysis can be found here.

Monday, August 6, 2018

Mapping the stock market using self-organizing maps

Self-organizing maps are an unsupervised learning approach for visualizing multi-dimensional data in a two-dimensional plane. They are great for clustering and for finding correlations in the data. In this post we apply self-organizing maps to historical US stock market data to find interesting correlations and clusters. We'll use data from Shiller, Goyal and BLS to calculate the historical valuation levels, interest rates, inflation rates, unemployment rates and future ten-year total real returns from 1948 to 2008.
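If you haven't seen the algorithm before, the following bare-bones Python sketch shows the core idea of a self-organizing map. It is a simplified illustration and not the R implementation used for the plots: it uses a rectangular grid instead of hexagons, a linearly decaying learning rate and neighbourhood radius chosen for convenience, and synthetic two-dimensional data scaled to the range 0-1. All names are made up for the example.

```python
import math
import random

def best_matching_unit(weights, row):
    """Grid node whose weight vector is closest to the data row."""
    return min(weights,
               key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], row)))

def train_som(rows, width, height, epochs=200, seed=0):
    """Train a rectangular self-organizing map; returns {(x, y): weight vector}."""
    rng = random.Random(seed)
    dim = len(rows[0])
    weights = {(x, y): [rng.random() for _ in range(dim)]
               for x in range(width) for y in range(height)}
    total_steps = epochs * len(rows)
    start_radius = max(width, height) / 2
    step = 0
    for _ in range(epochs):
        for row in rng.sample(rows, len(rows)):  # visit rows in random order
            progress = step / total_steps
            rate = 0.5 * (1 - progress)                          # learning rate decays
            radius = max(start_radius * (1 - progress), 0.5)     # neighbourhood shrinks
            bx, by = best_matching_unit(weights, row)
            for (x, y), w in weights.items():
                grid_dist_sq = (x - bx) ** 2 + (y - by) ** 2
                influence = math.exp(-grid_dist_sq / (2 * radius ** 2))
                # Pull the node towards the row; nearby nodes move the most.
                for i in range(dim):
                    w[i] += rate * influence * (row[i] - w[i])
            step += 1
    return weights

# Synthetic data: two well-separated groups of points (NOT market data).
rng = random.Random(42)
group_a = [[0.1 + rng.uniform(-0.05, 0.05), 0.1 + rng.uniform(-0.05, 0.05)] for _ in range(10)]
group_b = [[0.9 + rng.uniform(-0.05, 0.05), 0.9 + rng.uniform(-0.05, 0.05)] for _ in range(10)]
som = train_som(group_a + group_b, width=3, height=3)
print(best_matching_unit(som, [0.1, 0.1]), best_matching_unit(som, [0.9, 0.9]))
```

In the market application each row would hold one month's scaled valuation, rate, inflation, unemployment and return figures, and each month would be placed in the node returned by `best_matching_unit`.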

You can see a clear correlation between the different valuation measures, and that low valuations have led to high returns. There's a slight negative correlation between the valuation measures and unemployment, i.e. valuations have been higher when unemployment has been lower. Charlie Bilello has a great article on the subject. There's also a positive correlation between unemployment and rates, which means that rates have typically been higher when unemployment has been higher.

Next, let's look at clusters formed using hierarchical clustering. We'll form four clusters on the same plane as used in the above analysis. Let's look at the results:

The balls inside each hexagon correspond to individual months. We are currently in the green cluster, which has typically led to low returns. Why have low unemployment, low rates and low inflation led to low returns? Aren't these things good for the stock market? I see two possible causes: these conditions tend to revert back to their means (which means worsening macroeconomic conditions), and investors tend to extrapolate past returns into the future (a great tweet on the subject by Michael Batnick). The second cause leads to high valuations, which are present in the green cluster.
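For those curious about how clusters like these can be formed, here is a small Python sketch of agglomerative hierarchical clustering. It is only an illustration of the general technique: the complete-linkage rule and Euclidean distance are my assumptions for the example, the points are toy data, and the original analysis was done in R.

```python
def agglomerative_clusters(points, n_clusters):
    """Repeatedly merge the two closest clusters until `n_clusters` remain.

    Cluster distance is complete linkage: the largest Euclidean distance
    between any member of one cluster and any member of the other.
    Returns a list of clusters, each a list of indices into `points`.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        return max(distance(a, b) for a in c1 for b in c2)

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy points: two tight pairs and one outlier.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
print(agglomerative_clusters(points, 3))
```

Applied to a self-organizing map, the points would be the weight vectors of the map's nodes, so that neighbouring hexagons with similar profiles end up in the same cluster.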

Which cluster is the best place to be in? I'd say the gray one, but the data seems to support the blue one as well. The good thing is that there are other countries that are in both of these clusters. Even though I recommend looking at valuations alone rather than macroeconomic indicators, a good place worth checking for all that macro stuff is

The R code used in the analysis is available here.

Sunday, July 22, 2018

How likely is a stock market crash?

In this post we'll look at the odds of a stock market crash from the viewpoint of valuation. We'll use my favorite valuation measure, the Shiller P/E or CAPE ratio, which is just like the regular P/E except that it's calculated using the earnings of the last ten years instead of just one year.

According to, the CAPE ratio is currently at 32.57, which is in the 97th percentile when compared to history. We'll perform logistic regressions to calculate the probability of a correction (which is defined to be a decline of over ten percent from all-time highs) and the probability of a crash (a decline of over twenty percent). We'll use data from Robert Shiller to do the analysis. The data is from years 1881 to 2005.
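The two steps described above, flagging declines and fitting a logistic regression, could look roughly like this in Python. Treat it as a sketch under assumptions: the exact way the post defines a decline "during the next year" is not spelled out, so the `decline_events` rule below is one plausible reading; the regression is fitted with plain gradient ascent instead of a statistics package; and the CAPE values and outcomes in the example are invented.

```python
import math

def decline_events(prices, threshold, horizon):
    """1 if the price drops more than `threshold` below its running all-time high
    at some point during the next `horizon` periods, otherwise 0."""
    events = []
    for t in range(len(prices) - horizon):
        high = max(prices[: t + 1])
        flag = 0
        for price in prices[t + 1 : t + 1 + horizon]:
            high = max(high, price)
            if price < high * (1 - threshold):
                flag = 1
                break
        events.append(flag)
    return events

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logistic(xs, ys, rate=0.1, iterations=5000):
    """Fit P(y = 1 | x) = sigmoid(a + b * x) by gradient ascent on the log-likelihood.

    The predictor is standardized internally; a function mapping a raw x to a
    probability is returned.
    """
    mean = sum(xs) / len(xs)
    scale = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    zs = [(x - mean) / scale for x in xs]
    a = b = 0.0
    for _ in range(iterations):
        errors = [y - sigmoid(a + b * z) for z, y in zip(zs, ys)]
        a += rate * sum(errors) / len(zs)
        b += rate * sum(e * z for e, z in zip(errors, zs)) / len(zs)
    return lambda x: sigmoid(a + b * (x - mean) / scale)

# Invented example: CAPE levels and whether a decline followed (NOT real data).
cape = [8, 10, 12, 15, 18, 20, 22, 25, 28, 30, 33, 35]
declined = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
probability = fit_logistic(cape, declined)
print(round(probability(12), 2), round(probability(32), 2))
```

Using a threshold of 0.10 gives the correction events and 0.20 the crash events; the fitted curve is then read off at the current CAPE level.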

The probability of a correction during the next year is a little bit higher than usual at 25 percent, as you can see at the point where the two lines intersect. Let's look at the probability of a crash next:

The probability of a crash seems to rise exponentially as valuations rise. However, at fifteen percent, the probability is lower than I expected.

The R code used in the analysis is available here.