Monday, March 16, 2020

A look at past bear markets and implications for the future

The S&P 500 is officially in a bear market, and the crash from the high valuation levels has been fast and painful. There is however light at the end of the tunnel. In this post I'll demonstrate how the US stock market has developed during past bear markets and how the market has recovered during the ten years after the peak.

I chose ten years as the horizon because I believe that you should not invest in stocks any money that you are going to need in the next ten years. The chance of a positive return increases substantially with time and is almost ninety percent for a period of ten years. Since 1928, the worst annualized return over a ten-year period has been about negative four percent (sources).

We'll use monthly total return data for the S&P 500 from Shiller, beginning from the year 1871 and running until the very end of last year. The index has been reconstructed to represent the US stock market for dates when the S&P 500 didn't exist yet. The reason why we go so far back in time is to include as many bear markets as possible. Panics and manias have always existed, and human nature has not changed enough in the past 150 years to make the past data less valid. There has however been a substantial change in the spread of information, which causes panic to spread faster and may possibly make bear markets shorter and deeper.

First, let's take a look at the 14 bear markets found in the data in nominal terms, which describe how a portfolio would have developed without taking inflation into account. The horizontal black line at minus 20 percent indicates the drop needed to reach a bear market. A blue color indicates that the return has been positive in the 10 years following the peak, i.e. the ending value is higher than the value at the peak, and a red color indicates the opposite.

Only two of the fourteen bear markets did not recover in ten years from the initial peak. Not surprisingly, these two were the ones that peaked in bubble territory in 1929 and 2000. Notice that the bear markets that peaked in 1919 and 1987 were followed by the exact same bubbles.
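
The classification used above can be sketched in a few lines of Python. This is a minimal illustration and not the R code linked at the end of the post: the function names and the toy series are made up, and it assumes a bear market is any fall of at least 20 percent below a running high, with "recovered" meaning the index is above its peak value a fixed number of steps after the peak.

```python
def find_bear_markets(index, threshold=0.20):
    """Return (peak, trough) positions for each fall of at least `threshold`
    below a running high. An episode ends once the old high is regained."""
    bears = []
    peak = trough = 0
    for i, value in enumerate(index):
        if value >= index[peak]:
            # The old high has been regained: close the episode and keep it
            # only if the fall reached the bear market threshold.
            if index[trough] <= index[peak] * (1 - threshold):
                bears.append((peak, trough))
            peak = trough = i
        elif value < index[trough]:
            trough = i
    # An episode that has not recovered by the end of the data still counts.
    if index[trough] <= index[peak] * (1 - threshold):
        bears.append((peak, trough))
    return bears


def recovered_after(index, peak, horizon):
    """True if the index is above its peak value `horizon` steps after the
    peak, None if the data ends before that point."""
    if peak + horizon >= len(index):
        return None
    return index[peak + horizon] > index[peak]


# Hypothetical toy series, not Shiller data.
toy_index = [100, 110, 85, 70, 90, 115, 120, 100, 125]
toy_bears = find_bear_markets(toy_index)
print(toy_bears)
```

With monthly data the ten-year check would use a horizon of 120 steps. Running the same functions on a nominal and on an inflation-adjusted index would give the two versions of the plot.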

Below is the same plot with real returns, so the returns describe the actual change of purchasing power by taking inflation into account. Notice that since bear markets are defined as being down by twenty percent in nominal terms, the returns might not dip below the black line because of deflation.

In real terms, four of the fourteen bear markets had not recovered ten years after peaking. Judging by history, this still leaves us an over 70 percent chance of the index being higher ten years from now after inflation. Note that the bear market that peaked in 1968 overlaps heavily with the bear market that peaked in 1972, so they could be considered to be the same bear market, which would increase our chances even further.

Let's then plot the bear markets in red on top of the index to get a sense of the lengths of the bear markets, from peak to full recovery.

The average length of a bear market from peak until recovery has been 3.95 years, and the average fall from the peak until the bottom, i.e. the peak-to-trough time, has been 1.45 years. The longest bear market, during the 1930s Great Depression, lasted 15.33 years, and the longest time the stock market fell was 2.75 years.

Lastly, let's take a look at just the drawdowns. The bear market threshold is again indicated with a black horizontal line. The monthly data is only until the end of the year 2019, so the recent drawdown of early 2020 is missing from the graph. At the time of writing, the index is down 27 percent, with only seven of the historical drawdowns being as severe as this one.

The average drop in a bear market using monthly data has been 33.9 percent, with a maximum of 81.8 percent during the 1930s. Notice again that these are total returns. The drawdowns have been worse during periods with high valuations, as measured by Shiller CAPE or P/B. The maximum drawdowns also seem to have increased with time, which may be caused by lower valuations at the beginning of the time frame and possibly also by people being more connected than ever, which makes the spread of panic easier.
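
For readers who want to reproduce a drawdown chart, a drawdown is simply the distance of the index from its running high. The snippet below is a small Python sketch of that calculation with made-up numbers; the function name is hypothetical and the post's own figures come from the linked R code.

```python
def drawdown_series(index):
    """Fractional distance of each value from the highest value seen so far
    (0.0 at a new high, -0.25 when 25 percent below the running high)."""
    running_high = index[0]
    result = []
    for value in index:
        running_high = max(running_high, value)
        result.append(value / running_high - 1.0)
    return result


# Hypothetical toy series, not Shiller data.
toy_index = [100, 120, 90, 60, 130]
toy_drawdowns = drawdown_series(toy_index)
worst = min(toy_drawdowns)
print(toy_drawdowns, worst)
```

Any point where the series is at or below -0.20 lies under the bear market threshold drawn as the black line.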

To conclude, this bear market has been rough and short thus far. However, judging by history, most bear markets recover fully in ten years. Valuations that are still elevated compared to history may however keep the index from recovering as much as in past bear markets.

Be sure to follow me on Twitter for updates about new blog posts like this!

The R code used in the analysis can be found here.

Tuesday, December 31, 2019

Predicting the next decade in the stock market

Making accurate predictions using the vast amount of data produced by the stock markets and the economy itself is difficult. In this post we will examine the performance of five different machine learning models and predict the future ten-year returns for the S&P 500 using state-of-the-art libraries such as caret, xgboostExplainer and patchwork. We will use data from Shiller, Goyal and BLS. The training data covers the years 1948 to 1991, and the test data set runs from 1991 only until 2009, because the target variable is lagged by ten years.

Different investing strategies tend to work at different times, and you should expect the accuracy of the model you are using to move in cycles; sometimes the connection with returns is very strong, and sometimes very weak. Value investing strategies are a great example of a strategy that has not really worked for the past twelve years (source, pdf). Spurious correlations are another cause of trouble, since for example two stocks might move in tandem by just random chance. This highlights the need for some manual feature selection of intuitive features.

We will use eight different predictors: P/E, P/D, P/B, the CAPE ratio, total return CAPE, inflation, unemployment rate and the 10-year US government bond rate. All five of the valuation measures are calculated for the entire S&P 500. Let's start by inspecting the correlation clusters of the different predictors and the future ten-year return (with dividends), which is used as the target.

The different valuation measures are strongly correlated with each other, as expected. All except P/B have a very strong negative correlation with the future 10-year returns. CAPE and total return CAPE, which is a new measure that also considers reinvested dividends, are very strongly correlated with each other. Total return CAPE is also slightly less correlated with the future ten-year return than the normal CAPE.

The machine learning models

First, we will create a naïve model which predicts the future return to be the same as the average return in the training set. After training the five models we will also make one ensemble model of them to see if it can reach a higher accuracy than any of the five models, which is usually the case.

The models we are going to use are quite different from each other. The glmnet model is just like the linear model, except that it shrinks the coefficients according to a penalty to avoid overfitting. It therefore has very low flexibility and also performs automated feature selection (except if the alpha hyperparameter is exactly zero, as in ridge regression). K-nearest-neighbors makes its predictions by comparing the observation to similar observations. MARS, on the other hand, takes into account nonlinearities in the data and also considers interactions between the features. XGBoost is a tree model, which also takes into account both nonlinearities and interactions. It however improves each tree by building it based on the residuals of the previous tree (boosting), which may lead to better accuracy. Both MARS and SVM (support vector machines) are really flexible and may therefore overfit quite easily, especially if the data set is small. The XGBoost model is also quite flexible but does not overfit easily, since it performs regularization and pruning.

Finally, we have the ensemble model which simply gives the mean of the predictions of all the models. Ensemble models are a quite popular strategy in machine learning competitions to reach accuracies beyond the accuracy of any single model.

The models will be built using the caret wrapper, and the optimal hyperparameters are chosen using time slicing, which is a cross-validation technique that is suitable for time series. We will use five time slices to capture as many periods as possible while having enough observations in each of them. We will do the cross-validation on training data that consists of 70 percent of the data, while keeping the remaining 30 percent as a test set. The results are shown below:


The predictions are less accurate after the red line, which separates the training and test sets. The model has not seen the data on the right side of the line, so its accuracy there can be thought of as a proxy for how well the model would perform in the future.
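
The idea behind time slicing is that every validation window lies strictly after the data the model was trained on, just like the test set lies after the training set. The following is a simplified Python sketch of such rolling-origin splits. It is not caret's implementation, and the function name, parameters and window sizes are assumptions chosen for illustration.

```python
def time_slices(n_obs, initial_window, horizon, step=1, fixed_window=True):
    """Build (train, test) index lists where each test window directly
    follows its training window in time.

    fixed_window=True slides a training window of constant length forward;
    fixed_window=False lets the training window grow from the first observation.
    """
    slices = []
    start, end = 0, initial_window      # training data covers [start, end)
    while end + horizon <= n_obs:
        train = list(range(start if fixed_window else 0, end))
        test = list(range(end, end + horizon))
        slices.append((train, test))
        start += step
        end += step
    return slices


# Toy example: 10 observations, train on 4, validate on the next 2.
toy_slices = time_slices(n_obs=10, initial_window=4, horizon=2, step=2)
for train, test in toy_slices:
    print(train, "->", test)
```

Hyperparameters would then be scored on each test window and averaged, so no slice ever evaluates a model on observations that precede its training data.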

We will examine the model accuracies on the test set by using two measures: mean absolute error (MAE) and R-squared (R²). The results are shown in the table below:

Model          MAE        R²
Naive model    5.16 %     -
Ensemble       2.15 %     48.2 %
GLMNET         3.00 %     29.7 %
KNN            3.37 %     10.6 %
MARS           10.70 %    90.2 %
SVM            10.80 %    13.1 %
XGBoost        2.17 %     60.1 %

The two most flexible models, MARS and SVM, behave wildly on the test set and show signs of overfitting. Both of them have mean absolute errors that are about twice as high as that of the naïve model. Even though MARS has a high R-squared, its mean absolute error is high. This is why you cannot trust R-squared alone. Glmnet's predictions are quite plausible until the year 2009, when they break down, most likely because of the rapid growth of the P/E ratio. K-nearest-neighbors has not reacted to the data too much but still achieves a quite low MAE. Out of the single models, XGBoost has performed the best. The ensemble model however has performed slightly better as measured by the MAE. It also seems to be the most stable model, which is expected since it combines the predictions of the other models.
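
The MARS result, a high R-squared together with a high MAE, is easy to reproduce with toy numbers. The Python sketch below assumes that R-squared is computed as the squared correlation between predictions and observed values, which the post does not state explicitly. The function names and the data are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute distance between predictions and observed values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared_corr(actual, predicted):
    """Squared Pearson correlation between predictions and observed values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


# Hypothetical ten-year CAGRs and a model that is always ten points too low.
actual = [0.02, 0.04, 0.06, 0.08]
biased = [a - 0.10 for a in actual]

toy_r2 = r_squared_corr(actual, biased)
toy_mae = mean_absolute_error(actual, biased)
print(toy_r2, toy_mae)
```

The biased predictions move perfectly in step with the actual values, so the squared correlation is essentially one, yet every single prediction is off by ten percentage points. A correlation-based measure rewards getting the direction right and ignores level errors, which MAE captures.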

Let's then look at the feature importances. They are calculated in different ways for the different model types but should still be somewhat comparable. The plotting is done using the library patchwork, which allows plotting to be done by just adding the plots together using a plus sign.

Upon closer inspection of the feature importances, we see that the MARS model uses just the CAPE ratio as a feature, while the rest of the models use the features more evenly. Most of the models perform some sort of feature selection, which can also be seen from the plot.

Future predictions

Lastly, we will predict the next ten years in the stock market and compare the predictions of the different models. We will also look closer at the best performing single model, XGBoost, by inspecting the composition of the prediction. The current values of the features are mostly obtained from the sources listed in the first chapter, but also from Trading Economics and multpl.

Model          10-year CAGR prediction
Ensemble       2.20 %
GLMNET         1.47 %
KNN            4.04 %
MARS           -9.85 %
SVM            6.46 %
XGBoost        8.86 %

The MARS model is the most pessimistic, with a return prediction that is quite strongly negative. The model should however not be trusted too much since it uses only one variable and does not behave well on the test data. The XGBoost model is surprisingly optimistic, with a prediction of almost nine percent per year. The prediction of the ensemble model is quite low but would be three percentage points higher without the MARS model.
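
The ensemble figures can be checked directly from the table, since the ensemble is the plain mean of the five model predictions. A short Python check (the helper name is made up; the numbers are the ones reported above):

```python
# 10-year CAGR predictions from the table, in percent.
predictions = {
    "GLMNET": 1.47,
    "KNN": 4.04,
    "MARS": -9.85,
    "SVM": 6.46,
    "XGBoost": 8.86,
}


def ensemble_mean(preds, exclude=()):
    """Unweighted mean of the model predictions, optionally leaving some out."""
    kept = [value for name, value in preds.items() if name not in exclude]
    return sum(kept) / len(kept)


with_mars = ensemble_mean(predictions)
without_mars = ensemble_mean(predictions, exclude=("MARS",))
print(round(with_mars, 2), round(without_mars, 2))
```

The mean of all five is about 2.20 percent, and leaving MARS out lifts it to about 5.21 percent, which is the three percentage point difference mentioned above.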

Let's then look at the XGBoost model more closely by using the xgboostExplainer library. The resulting plot is a waterfall chart which shows the composition of a single prediction, in this case the predicted CAGR (plus one) for the next ten years. The high CAPE ratio reduces the predicted CAGR by seven percentage points, but the P/B ratio increases it by six percentage points. This is because the model contains interactions between the CAPE and P/B ratios. The effect of the interest rate level is just a bit positive at two percentage points, but the currently high P/E ratio reduces it back to the same level. The rest of the features have a very small effect on the prediction.

The benefit of predicting the returns of a single stock market is mostly limited to the fact that you can adjust your expectations for the future. However, predicting the returns of multiple stock markets and investing in the ones with the highest return predictions is most likely a very profitable strategy. Klement (2012) has shown that the CAPE ratio alone does a quite good job at predicting the returns of different stock markets. Adding more variables that are sensible to the model is likely to make the model more stable and perhaps better at predicting the outcome.

Be sure to follow me on Twitter for updates about new blog posts like this!

The R code used in the analysis can be found here.

Wednesday, July 17, 2019

Combining momentum and value into a simple strategy to achieve higher returns

In this post I'll introduce a simple investing strategy that is well diversified and has been shown to work across different markets. In short, buying cheap and uptrending stocks has historically led to notably higher returns. The strategy is a combination of these two different investment styles, value and momentum. In a previous post I explained how the range of possible outcomes in investing into a single market is excessively high. Therefore, global diversification is the key to assure that you achieve your investment objective. This strategy is diversified across strategies, markets and different stocks. The benefits of this strategy are the low implementation costs, a high diversification level, higher expected returns and lower drawdowns.

We'll use data from Barclays for the CAPEs which represent valuations, and Yahoo Finance using quantmod for the returns that do not include dividends, which we'll use as absolute momentum. Let's take a look at the paths of valuation and momentum for the U.S. stock market for the last seven years:

The two corrections are easy to spot, because momentum was low, and valuations decreased. The U.S. stock market currently has a strong momentum as measured by six-month absolute return, but the valuation level is really high. Therefore the U.S. is not the optimal country to invest in. So, which market is the optimal place to be? Let's look at just the current values of different markets:

There is only one market that is just in the right spot: Russia. It has the highest momentum and second lowest valuation of all the countries in this sample. In emerging markets things happen faster and more intensively, which leads to more opportunities and makes investing in them more interesting. Different markets also tend to be in different cycles, which makes this combination strategy even more attractive. Let's discuss more about these strategies and why they work well together.

Research on the topic

Value and momentum factors are negatively correlated, which means that when one has low returns, the other's returns tend to be higher. Both have been found to lead to excess returns and are two of the most researched so-called anomalies. Researchers have tried to explain both strategies using risk-based and behavioral factors, but no single explanation has been agreed on for either of them. The fact that there are multiple explanations for the superior performance can rather be viewed as a good thing for the strategies.

In their book "Your Complete Guide to Factor-Based Investing", Berkin and Swedroe found that the yearly return of the two anomalies using a long-short strategy was 4.8 percent for value and 9.6 percent for momentum. This corresponds to the return of the factor itself and can directly be compared to the market beta factor, which has had a historical annual return of 8.3 percent during the same period. This means that investing just in the momentum factor, and therefore hedging against the market, would have led to a higher return than just investing in the market. It is important to notice that investing normally using a momentum strategy without shorting gives exposure to both the market beta and momentum factors, which leads to a higher return than investing in just one of these factors.

Andreu et al. examined momentum on the country level and found that the return of the momentum factor has been about 6 percent per annum for a holding period of six months. For a holding period of twelve months, the return was cut in half (source). A short holding period seems to work best for this momentum strategy. They researched investing in a single country and in three countries at a time while shorting the same number of countries. The smaller number of countries led to higher returns, but no risk measures were presented in the study. As a short-term strategy I'd suggest equal weighting some of the countries with high momentum and low valuation. I've also tested the combination of value and momentum in the U.S. stock market, and it seems that momentum does not affect the returns at all over longer periods of time.

Value on the other hand tends to correlate strongly with future returns only on much longer periods, and on shorter periods the correlation is close to zero as I demonstrated in a previous post. However, the short-term CAGR of the value strategy on the country level in the U.S. has still been rather impressive at 14.5 percent for a CAPE ratio of 5 to 10, as shown by Faber (source, figure 3A). I chose to show this specific valuation level, since currently countries such as Turkey and Russia are trading at these valuation levels (source).

The 10-year cyclically adjusted price to earnings ratio that was discussed in the previous chapter, also known as CAPE, has been shown to be among the best variables for explaining the future returns of the stock market. It has a logarithmic relationship with future 10-15 year returns, and an r-squared as high as 0.49 across 17 country-level indices (source, page 11). A lower CAPE has also led to smaller maximum and average drawdowns (source).

Faber has shown that investing in countries with a low CAPE has returned 14 percent annually since 1993, and the risk-adjusted return has also been really good (source). The strategy, and value investing as a whole, has however underperformed for the last ten years or so (source). This is good news if you believe in mean reversion in the stock market.

The two strategies work well together on the stock level, as shown by Keimling (source). According to the study, the quintile with the highest momentum has led to a yearly excess return of 2.7 percent, and the one with the lowest valuation has led to a yearly excess return of 3 percent globally. Choosing stocks with the highest momentum and lowest valuations has more than doubled the excess return to 7.6 percent. O'Shaughnessy has shown that the absolute return for the quintile with the highest momentum was 11.6 percent, and 11.8 percent for value. Combining the two led to a return of 18.5 percent (source).

Lastly, let's take a closer look at some selected countries and their paths:

As expected, the returns of the emerging markets vary a lot compared to the U.S. market. The U.S. has performed extremely well, but the historical earnings haven't kept up with the prices. Israel on the other hand has gotten cheaper while the momentum has been good. Even though U.S. momentum is higher than at any other point in time in this sample, Russia's momentum currently is, and Turkey's momentum has been, way higher. Both Russia's and Turkey's valuations are less than a third of U.S. valuations, which makes these markets very interesting.

In conclusion, combining value and momentum investing into a medium-term strategy is likely to lead to excess returns, as shown by previous research. The strategy can be easily implemented using country-specific exchange traded funds, and the data is easily available. Currently only Russia is in the sweet spot for this strategy, and Turkey might be once it gains some momentum. Investing in just one country is however risky, and I suggest diversifying between the markets with high momentum and low valuations.
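
As a rough illustration of how such a screen could be put together, here is a small Python sketch. It is not the R code behind the charts: the market names, CAPE values and prices are invented, and using the cross-sectional medians as cutoffs is an assumption made only for this example.

```python
from statistics import median


def six_month_momentum(monthly_prices):
    """Price return over the last six months (dividends excluded)."""
    return monthly_prices[-1] / monthly_prices[-7] - 1.0


def value_momentum_screen(markets):
    """markets maps a name to (cape, monthly_prices). Keep the markets whose
    CAPE is at or below the median and whose six-month momentum is at or
    above the median, and weight the survivors equally."""
    momentum = {name: six_month_momentum(prices)
                for name, (_, prices) in markets.items()}
    cape_cut = median(cape for cape, _ in markets.values())
    momentum_cut = median(momentum.values())
    picks = [name for name, (cape, _) in markets.items()
             if cape <= cape_cut and momentum[name] >= momentum_cut]
    return {name: 1.0 / len(picks) for name in picks}


# Entirely hypothetical markets: (CAPE, last seven month-end prices).
toy_markets = {
    "Market A": (6.0, [100, 102, 105, 108, 112, 116, 120]),
    "Market B": (30.0, [100, 101, 104, 107, 110, 113, 115]),
    "Market C": (8.0, [100, 99, 98, 97, 96, 95, 95]),
    "Market D": (18.0, [100, 100, 101, 101, 102, 102, 102]),
}
toy_weights = value_momentum_screen(toy_markets)
print(toy_weights)
```

In this toy sample only Market A is both cheap and uptrending, so it gets the whole weight. With a larger universe the same screen would spread the money equally across several markets, which is the diversification suggested above.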

Be sure to follow me on Twitter for updates about new blog posts!

The R code used in the analysis can be found here.

Saturday, June 8, 2019

The most important chart for long-term investors

Time is the investor's best friend. The longer the investment horizon, the less the investment returns depend on factors such as crashes and current valuation levels. It is known that the chance of losing money in the stock market over a 20-year period has historically been about zero. This post attempts to expand on this fact and take a look at how risky the U.S. stock market has actually been for long-term investors.

As usual, we'll use data from Robert Shiller to answer the questions. The data begins from the year 1871, long before the actual S&P 500 index was created. We'll only consider lump-sum investing, since dollar cost averaging is another story.

Let's first look at the inflation-adjusted returns for a U.S. investor, including reinvested dividends. Keep in mind that the U.S. stock market has been one of the best performing in the world, and future returns are likely to be lower because of high valuations and lower productivity and population growth. The upper and lower bands are the 95 percent prediction intervals, i.e. 95 percent of the time the investment return has been between these bands. The y-axis tells how many times your investment would have been multiplied. Notice that the axis is logarithmic.
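
One simple way to build bands like these is to collect every historical holding period of a given length and take empirical percentiles of the resulting growth multiples. The Python sketch below does that with a made-up index. The function names are hypothetical, and whether the linked R code uses empirical percentiles or some other interval estimate is not stated in the post.

```python
def holding_period_multiples(index, horizon):
    """Growth multiple of a lump sum for every start date that has
    `horizon` later observations available."""
    return [index[t + horizon] / index[t] for t in range(len(index) - horizon)]


def percentile(values, q):
    """Linearly interpolated empirical quantile, with q between 0 and 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical real total return index, not Shiller data.
toy_index = [100, 110, 99, 120, 132, 118]
toy_multiples = holding_period_multiples(toy_index, horizon=2)

lower_band = percentile(toy_multiples, 0.025)
upper_band = percentile(toy_multiples, 0.975)
print(lower_band, upper_band)
```

Repeating the calculation for each horizon from one year upwards gives one lower and one upper value per horizon, which can then be drawn as the two bands on a logarithmic axis.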

This chart demonstrates how uncertain investing is. The range of outcomes is very large, but it doesn't necessarily tell the full truth. France once had a 66-year period where stocks didn't beat inflation, for Italy the longest streak was 73 years and for Austria a painful period of 97 years. This is why global diversification is important. There has however been a 5 percent chance that the investment would have increased 64-fold in the U.S. for the same period. The risk works both ways.

Let's also look at the nominal, non-inflation-adjusted returns to see how inflation eats returns:

The inflation in the U.S. has been quite high, over three percent annually. Inflation of course affects different companies in a different way, but the net effect is that lower inflation does not necessarily lead to higher inflation-adjusted returns.

Be sure to follow me on Twitter for updates about new blog posts!

The R code used in the analysis can be found here.

The original post had a problem with the calculation of dividends. The charts and code have now been updated, and the true returns were higher than in the original post. Sorry for the inconvenience.

Tuesday, January 29, 2019

Correlation analysis of cyclically adjusted valuation measures and subsequent returns

In this post we'll test three different cyclically-adjusted valuation measures: CAPE (earnings), CAPD (dividends) and CAPB (book value). CAPE is calculated like the P/E ratio, but by dividing the current real price by the average inflation-adjusted earnings of the last ten years. CAPD uses dividends instead of earnings, and CAPB uses book value. We'll test the optimal measurement (i.e. forward-looking return) period and formation period for all three valuation measures by calculating correlations with the future returns.
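
The definition translates directly into code. Below is a minimal Python sketch with invented numbers, where the formation period is a parameter so that the same function covers CAPE, CAPD and CAPB depending on which fundamental is passed in. The function name and inputs are hypothetical.

```python
def cyclically_adjusted_ratio(real_price, real_fundamentals, formation_years=10):
    """Current real price divided by the average of the last `formation_years`
    annual inflation-adjusted fundamentals (earnings, dividends or book value).
    `real_fundamentals` is ordered from oldest to newest."""
    if len(real_fundamentals) < formation_years:
        raise ValueError("not enough history for the chosen formation period")
    window = real_fundamentals[-formation_years:]
    return real_price / (sum(window) / len(window))


# Hypothetical real price and ten years of real earnings per share.
toy_price = 3000.0
toy_earnings = [80, 85, 90, 95, 100, 100, 105, 110, 115, 120]

toy_cape = cyclically_adjusted_ratio(toy_price, toy_earnings, formation_years=10)
toy_pe1 = cyclically_adjusted_ratio(toy_price, toy_earnings, formation_years=1)
print(toy_cape, toy_pe1)
```

With these numbers the ten-year ratio is 30 while the one-year ratio is 25, because the most recent earnings are above their ten-year average. This smoothing over the cycle is what the formation period controls.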

Typically CAPE, also known as P/E10, is calculated by using a 10-year formation period. The maximum length we'll use for both the formation period and the measurement period is 30 years, which means that the performance of, for example, P/E1-P/E30 will be tested by looking forward 1-30 years.

We'll use Shiller and Goyal data from the US, which both begin from the year 1871. We'll plot the measurement period (in years) on the x-axis and r-squared on the y axis, and we'll make a distinct line for each of the formation periods.

As you can see from the plot above, a measurement period of about ten years, or maybe a little more, has worked the best for CAPE. As expected, valuation measures don't do a good job explaining short-term returns. However, this also applies to long-term returns, which gives the lines a bell-curved shape. The lines with shorter formation periods are lower than the rest, which means that short-term valuation measures such as normal trailing twelve-month P/E also don't work as well as the long-term valuation measures.
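
Each line in a plot like this comes from repeating one calculation: pair the valuation at every date with the annualized return over the following years, and square the correlation. A compact Python sketch of that loop is shown below, using made-up annual data and hypothetical function names. The actual analysis was done in R with Shiller and Goyal data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def forward_cagr(index, t, years):
    """Annualized return from year t to year t + years."""
    return (index[t + years] / index[t]) ** (1 / years) - 1


def r_squared_by_measurement_period(valuation, index, periods):
    """Squared correlation between the valuation at year t and the
    annualized return over the next m years, for each m in `periods`."""
    result = {}
    for m in periods:
        n = len(index) - m                      # start years with a full window
        returns = [forward_cagr(index, t, m) for t in range(n)]
        result[m] = pearson(valuation[:n], returns) ** 2
    return result


# Hypothetical annual total return index and valuation ratio.
toy_index = [100, 120, 90, 130, 150, 140, 180, 200]
toy_valuation = [20, 25, 15, 24, 28, 22, 30, 32]
toy_r2 = r_squared_by_measurement_period(toy_valuation, toy_index, periods=(1, 2, 3))
print(toy_r2)
```

Running the function once per formation period, with the valuation series recomputed for that formation period, gives one line per formation period.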

For CAPD, the correlation turns positive at long measurement periods, which is rather unwanted. The better performance of the long-term and the worse performance of the short-term valuation measures are more apparent with CAPD.

For CAPB, longer measurement periods of about twenty years seem to work the best. The r-squared is much larger than with CAPE or CAPD. Even the worst formation periods seem to work better in explaining future returns than the CAPE with the best formation period. This is consistent with Keimling's research (pdf, page 16), which suggests that normal P/B is almost as strong in predicting future returns as CAPE. The plot above shows that the cyclically-adjusted P/B is even stronger than CAPE in predicting future returns.

The r-squared of the CAPE is lower than what is often quoted because of the long time period of the data. As you can see from the plot below, the rolling 10-year correlation of CAPE and subsequent returns has been rising over time.

Another way of viewing these correlations is bringing them into Excel and color coding them. Notice that we are now using simple correlations instead of the r-squared. The x-axis tells the formation period of the valuation measure, and the y-axis tells the measurement period, i.e. how far into the future the valuation measure is used to predict.

The 10-year CAPE has surprisingly high explanatory power even for forecasting 1-year periods. The explanatory power starts declining noticeably from P/E8 to the left and P/E14 to the right.

For CAPD, the correlations are weaker, but the shape is about the same.

CAPB has the strongest correlations with future returns, but the shape is way different. Interestingly the strongest explanatory power regarding future returns comes from 22-24-year P/B and for a measurement period of 5-15 years.

This post was partly inspired by the O’Shaughnessy Quarterly Investor Letter Q4 2018.

Be sure to follow me on Twitter for updates about new blog posts!

The R code used in the analysis can be found here.

Sunday, October 28, 2018

How quickly do stock market valuations revert back to their means?

Mean reversion is the assumption that things tend to revert back to their means in the long run. This is especially true for valuations and certain macroeconomic variables, but not so much for stock prices themselves. In this post we'll look at the mean reversion of different valuation measures by forming equal sized baskets from each valuation decile and letting the valuations change as time goes on.
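
The basket approach can be sketched as follows: sort all start dates by their valuation, split them into equally sized groups, and follow the average valuation of each group forward in time. The Python example below uses two baskets and a made-up series to keep it short. The function name is hypothetical, and the details of the linked R code may differ.

```python
def basket_paths(valuations, horizon, n_baskets=10):
    """Group start dates into equally sized baskets by starting valuation and
    return, for each basket, its average valuation 0..horizon steps later.
    Baskets are ordered from the cheapest to the most expensive."""
    starts = range(len(valuations) - horizon)
    if len(starts) < n_baskets:
        raise ValueError("need at least one start date per basket")
    ranked = sorted(starts, key=lambda t: valuations[t])
    size = len(ranked) / n_baskets
    paths = []
    for b in range(n_baskets):
        members = ranked[int(b * size):int((b + 1) * size)]
        path = [sum(valuations[t + h] for t in members) / len(members)
                for h in range(horizon + 1)]
        paths.append(path)
    return paths


# Hypothetical valuation series, not Shiller or Barclays data.
toy_valuations = [10, 30, 12, 28, 14, 26, 16, 24, 18, 22, 20, 20]
toy_paths = basket_paths(toy_valuations, horizon=2, n_baskets=2)
print(toy_paths)
```

Mean reversion shows up when the paths of the cheap and expensive baskets converge toward each other as the horizon grows, and over-reversion when they cross.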

This study (pdf) shows an interesting graph on page 23 about the mean reversion of the 10-year price-to-earnings ratio also known as CAPE. In this post the study will be replicated using also international CAPE, P/E and P/B. I'll replicate the results using a longer time frame of twenty years. Let's start with CAPE using Shiller data of the US stock market from years 1926 to 2008:

With a longer time frame, over-reversion becomes visible, i.e. high valuations tend to eventually lead to low valuations and vice versa. The only exception is the decile with the highest valuation, which is explained by the housing bubble after the tech bubble. The valuations seem to revert back to their means in 11-12 years.

Let's look at the mean reversion of the same metric using Barclays data from years 1982 to 2008 from 26 different countries or continents:

The mean reversion happens again in about 12 years, but the over-reversion seems to disappear. This might be caused by the US having different kinds of bubbles and busts than the rest of the world, or by the shorter time period. The dataset is many times larger and should give a clearer picture of the mean reversion than using only US data.

Next, we'll look at price-to-book:

It seems to take longer for the P/B to revert back to its mean, which is logical since CAPE uses historical 10-year earnings. There is however still some noticeable over reversion.

Let's look at price-to-earnings ratio next:

The P/E ratio seems to revert back to its mean a little bit quicker than the rest, in about 9-10 years. There is still some over reversion.

In summary, different valuation measures tend to revert back to their means in about ten years, and over revert after that.

Hope you enjoyed this short post. Be sure to follow me on Twitter for updates about new blog posts!

The R code used in the analysis can be found here.

Monday, August 6, 2018

Mapping the stock market using self-organizing maps

Self-organizing maps are an unsupervised learning approach for visualizing multi-dimensional data in a two-dimensional plane. They are great for clustering and finding correlations in the data. In this post we apply self-organizing maps to historical US stock market data to find interesting correlations and clusters. We'll use data from Shiller, Goyal and BLS to calculate the historical valuation levels, interest rates, inflation rates, unemployment rates and future ten-year total real returns from years 1948 to 2008.
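
To give an idea of how a self-organizing map learns, the Python sketch below shows a single training step: find the node whose weights are closest to an observation, then pull that node and its neighbors on the grid toward the observation. It is a bare-bones illustration on a rectangular 2x2 grid with made-up numbers, whereas the maps in this post use hexagonal cells and were produced in R. All names and parameter values are assumptions.

```python
import math


def best_matching_unit(grid, sample):
    """Grid position whose weight vector is closest to the sample."""
    return min(grid, key=lambda node: sum((w - s) ** 2
                                          for w, s in zip(grid[node], sample)))


def train_step(grid, sample, learning_rate=0.5, radius=1.0):
    """Move every node toward the sample, scaled by a Gaussian of its grid
    distance to the best matching unit. Returns the best matching unit."""
    bmu = best_matching_unit(grid, sample)
    for node, weights in grid.items():
        grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
        influence = math.exp(-grid_dist2 / (2 * radius ** 2))
        grid[node] = [w + learning_rate * influence * (s - w)
                      for w, s in zip(weights, sample)]
    return bmu


# Hypothetical 2x2 map with two standardized features per node.
toy_grid = {
    (0, 0): [0.0, 0.0],
    (0, 1): [0.0, 1.0],
    (1, 0): [1.0, 0.0],
    (1, 1): [0.9, 0.9],
}
toy_sample = [1.0, 1.0]
toy_bmu = train_step(toy_grid, toy_sample)
print(toy_bmu, toy_grid)
```

Repeating this step over all months, while gradually shrinking the learning rate and radius, places similar months in nearby cells, which is why neighboring cells in the heatmaps have similar values.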

You can see a clear correlation between the different valuation measures, and that low valuations have led to high returns. There's a slight negative correlation between the valuation measures and unemployment, i.e. valuations have been higher when unemployment has been lower. Charlie Bilello has a great article on the subject. There's also a positive correlation between unemployment and rates, which means that rates have typically been higher when unemployment has been higher.

Next, let's look at clusters formed using hierarchical clustering. We'll form four clusters on the same plane as used in the above analysis. Let's look at the results:

The balls inside each hexagon correspond to each month. We are currently in the green cluster, which has typically led to low returns. Why have low unemployment, low rates and low inflation led to low returns? Aren't these things good for the stock market? I see two possible causes: these conditions tend to revert back to their mean (which means worsening macroeconomic conditions), and investors tend to extrapolate past returns into the future (a great tweet on the subject by Michael Batnick). The second cause leads to high valuations, which are present in the green cluster.

Which cluster is the best place to be in? I'd say the gray one, but the data seems to support the blue one as well. The good thing is that there are other countries that are in both of these clusters. Even though I recommend looking at valuations alone rather than macroeconomic indicators, a good place worth checking for all that macro stuff is

The R code used in the analysis is available here.