This is a test of another “Volatility Risk Premium” (VRP) strategy from the always excellent Trading the Odds (1). The strategy is similar to the Brute Force VRP, DDN’s VRP, and TTO’s original VRP strategies that we’ve shared previously, in that it compares implied and historical volatility to predict changes in VIX ETPs like XIV and VXX.
See footnote re: the difference between my results and those produced by TTO (2).
- At the close, calculate the following: the 5-day exponential moving average of [30-day constant maturity price of VIX futures – (2-day historical volatility of SPY * 100)].
- Go long XIV at the close when the result of the above formula is greater than 1, otherwise go long VXX. Hold until a change in position.
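The two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the results in this post: the rules don't spell out how historical volatility is computed, so the sketch assumes the annualized (252-day) sample standard deviation of the last two daily log returns, and it assumes a standard EMA with alpha = 2 / (span + 1), seeded with the first available value. The function names and the `vix_fut_30d` input (a list of 30-day constant maturity VIX futures prices that you would have to supply yourself) are hypothetical.

```python
import math
import statistics


def historical_vol(closes, window=2, trading_days=252):
    """Annualized stdev of the trailing `window` daily log returns.

    ASSUMPTION: sample stdev, 252-day annualization. Returns None
    for days that do not yet have `window` returns behind them.
    """
    rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    out = [None] * len(closes)
    for i in range(window, len(closes)):
        # rets[i - 1] is the return ending at close i
        out[i] = statistics.stdev(rets[i - window:i]) * math.sqrt(trading_days)
    return out


def ema(values, span=5):
    """Exponential moving average, skipping leading None values.

    ASSUMPTION: alpha = 2 / (span + 1), seeded with the first value.
    """
    alpha = 2.0 / (span + 1)
    out, prev = [], None
    for v in values:
        if v is None:
            out.append(None)
            continue
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def vrp_positions(vix_fut_30d, spy_closes, threshold=1.0):
    """Position to hold after each close: 'XIV', 'VXX', or None (no signal yet)."""
    hv = historical_vol(spy_closes)
    # Implied vol (futures price, in vol points) minus historical vol * 100
    spread = [None if h is None else f - h * 100
              for f, h in zip(vix_fut_30d, hv)]
    smoothed = ema(spread)
    return [None if s is None else ("XIV" if s > threshold else "VXX")
            for s in smoothed]
```

Each element of the returned list is the position taken at that day's close and held until the signal changes. A different volatility estimator or EMA seeding would shift the signal on some days, which is part of why specific parameter choices deserve some skepticism.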
Note the differences between this strategy and other “VRP” strategies we’ve tested: (1) this strategy uses the 30-day constant maturity price of VIX futures (as opposed to the VIX) as a measure of implied volatility, and (2) it smooths the signal with a more responsive exponential (as opposed to simple) moving average.
While this strategy would have performed better historically than any of the other VRP variations we’ve tested, I remain more or less equally confident in all of them when it comes to future out-of-sample performance.
Call it the cynic in me, born out of years of deploying strategies here in the real world, but I am much more interested in concepts than specific parameter selection – the concept here being comparing implied and historical volatility.
You’ll find that the fortunes of these concepts tend to rise or fall together. This month is a good example, with this entire class of strategies struggling as a result of the dichotomy discussed here.
Why does this concept work? Because historically, when implied vol has fallen too far below historical vol, it has often meant implied vol is underestimating future realized vol, which over time will put pressure on the VIX and VIX futures to rise, and ETPs like XIV and ZIV to fall.
One final note…
In the results above, I’ve compared the strategy as originally tested to trading XIV-only (and moving to cash instead of VXX). Notice the significant decline in performance, especially in terms of risk-adjusted performance (Sharpe and UPI). Note too how the strategy would have spent a mere 11% of all days long VXX (and most of those days were bunched together in brief periods such as 2007/08).
When such a small percentage of the total sample contributes such a large percentage of performance, it sharply increases the risk of overfitting (which leads to failure to perform out-of-sample).
That’s true not just for this strategy, but for all of these long/short volatility strategies (including ours) that are heavily biased towards the inverse VIX play, but rely on major VIX pops to boost historical returns. Whether these strategies will be able to so deftly capitalize on those major VIX pops in the future is suspect.
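One way to check how concentrated a backtest’s performance is: tally, for each position, the share of days spent in it and the log return earned while holding it. The sketch below is a hypothetical helper, not something used to produce the results in this post. It assumes you already have a list of daily positions and a matching list of daily strategy returns expressed as simple returns (0.01 = 1%).

```python
import math


def contribution_by_position(positions, daily_returns):
    """Return {position: (share_of_days, summed_log_return)}.

    Log returns are used so that the per-position figures add up
    to the log return of the whole backtest.
    """
    days, log_ret = {}, {}
    for pos, r in zip(positions, daily_returns):
        days[pos] = days.get(pos, 0) + 1
        log_ret[pos] = log_ret.get(pos, 0.0) + math.log1p(r)
    total = sum(days.values())
    return {p: (days[p] / total, log_ret[p]) for p in days}
```

If a position that accounts for a small share of days also accounts for a large share of the total log return, the backtest is leaning on a handful of episodes.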
* * *
A big thank you to Trading the Odds for the thoughts and allowing us to add our two cents here.
When the strategies that we cover on our blog (including this one) signal new trades, we include an alert on the daily report sent to subscribers. This is completely unrelated to our own strategy’s signal; it just serves to add a little color to the daily report and allows subscribers to see what other quantitative strategies are saying about the market.
Volatility Made Simple
(1) These types of strategies, comparing implied and historical volatility, have become known as “volatility risk premium” or VRP strategies. In truth, there are multiple VRPs in the VIX complex (VIX spot vs realized volatility, VIX futures vs realized VIX, etc.), so admittedly, the term is probably not the most accurate. In any case, we stick with that convention here.
(2) The results of our tests are worse than those presented by TTO. The biggest reason is that TTO uses the S&P 500 cash index (GSPC), while I use the non-dividend-adjusted S&P 500 ETF SPY, to calculate historical volatility. I can find no empirically sound reason why using one index in place of the other should lead to such starkly different results, so I chalk most of the performance difference up to overfitting. That isn’t intended as a gotcha, as all backtests are inherently overfit to some degree. And it isn’t to say that GSPC or SPY is better or worse than the other. It’s only to say that if I were to use GSPC here, the sole purpose would be to produce a better-looking backtest, so I’m sticking with the convention that I’ve used historically on this blog.