VRP and Other Measures of Implied Volatility from Trading the Odds

This is a follow-up to the posts here and here from the always excellent Trading the Odds (TTO). We’ve covered TTO’s work previously when we looked at their variation of a “VRP” strategy, which compares implied and historical volatility to trade VIX ETPs like XIV and VXX.

In these new posts, TTO looked at other measures of implied volatility beyond just the VIX index. We put these other measures to the test here. Strategy results from 07/2004 onward, trading XIV (inverse VIX) and VXX (long VIX), follow. Read about test assumptions, or get help following this strategy.


There are four equity curves in blue in the graph above, versus buying and holding XIV in grey. I’ve intentionally painted them all the same color (more on why in a moment). But first, the strategy rules as tested:

  • At the close, calculate the following: the 5-day average of [implied volatility – (2-day historical volatility of SPY * 100)].
  • Each of the equity curves above uses a different measure for “implied volatility”: the VIX index, the 30-day constant maturity price of VIX futures, or VXMT (mid-term VIX) (1). I’ve also added the VXV index for good measure.
  • Go long XIV at the close when the result of the above formula is greater than zero (i.e. a premium exists between implied and historical volatility), otherwise go long VXX. Hold until a change in position.
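
The rules above can be sketched in a few lines of pandas. This is a minimal illustration rather than the exact code behind the tests: the function name `vrp_signal` and its parameters are hypothetical, and it assumes that “historical volatility” means the annualized standard deviation of daily log returns (using 252 trading days), which the rules don’t spell out.

```python
import numpy as np
import pandas as pd


def vrp_signal(implied_vol: pd.Series, spy_close: pd.Series,
               hv_days: int = 2, avg_days: int = 5) -> pd.Series:
    """Return 'XIV' or 'VXX' for each close (hypothetical helper)."""
    # Daily log returns of SPY.
    log_ret = np.log(spy_close / spy_close.shift(1))
    # Assumption: historical volatility is the annualized sample standard
    # deviation of the last `hv_days` log returns, times 100 so that it is
    # on the same scale as the implied volatility index.
    hist_vol = log_ret.rolling(hv_days).std() * np.sqrt(252) * 100
    # 5-day average of (implied volatility - historical volatility).
    premium = (implied_vol - hist_vol).rolling(avg_days).mean()
    # Long XIV when the averaged premium is above zero, otherwise long VXX.
    signal = pd.Series(np.where(premium > 0, "XIV", "VXX"),
                       index=premium.index)
    # Leave the signal empty until there is enough history to compute it.
    return signal.where(premium.notna())
```

Passing a different series as `implied_vol` (VIX, 30-day constant maturity futures, VXV or VXMT) gives each of the four variations, with everything else held constant.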

Note that our results differ significantly from TTO’s. See wonk note 2 below for a discussion of why.


I painted all four equity curves blue to drive home the point that, regardless of any perceived difference, these strategies have performed so similarly that any advantage of one over the others is likely the result of random chance. I would be about equally confident in any of these strategies moving forward.

All four variations were in agreement on about 96% of days. That’s because there’s very little information contained in any one of these measures that’s not also contained in the others.
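
For readers who want to run the same comparison on their own signals, here is one way the agreement figure could be measured. It is a sketch under assumptions: `agreement_rate` is a hypothetical helper, and it expects a DataFrame with one column of daily positions (for example “XIV” or “VXX”) per variation.

```python
import pandas as pd


def agreement_rate(signals: pd.DataFrame) -> float:
    """Fraction of days on which every column holds the same position."""
    # Only compare days where all variations have a signal.
    complete = signals.dropna()
    # A day counts as agreement when its row has exactly one distinct value.
    all_agree = complete.nunique(axis=1) == 1
    return float(all_agree.mean())
```

A result near 1.0 means the variations are almost always in the same position, so their equity curves can differ only on the remaining handful of days.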

30-day futures will tend to price higher than the VIX index, VXV higher than futures, and VXMT higher than VXV, simply because they’re measuring implied volatility further out (which adds uncertainty, which in turn tends to increase the required risk premium).

Contrary to my first thought, though, that doesn’t mean that strategies using longer-dated implied volatility spend significantly more time short the VIX (i.e. long XIV). I’m assuming that’s because these types of strategies tend to take a long vol position only when volatility spikes, which is also when the premium between longer- and shorter-dated implied vol is compressing. So when it actually counts, using shorter-dated implied vol (like the VIX) doesn’t produce significantly different results than using longer-dated vol (like VXMT).

In short, one of these four variations will outperform in the future just by happenstance, but I don’t think history offers a useful enough guide as to which variation that will be. I’d be about equally confident in any of these variations in the future.

A big thank you to Trading the Odds for the thoughts and allowing us to add our two cents here.

* * *

When the strategies that we cover on our blog (including this one) signal new trades, we include an alert on the daily report sent to subscribers. This is completely unrelated to our own strategy’s signal; it just serves to add a little color to the daily report and allows subscribers to see what other quantitative strategies are saying about the market.

Click to see Volatility Made Simple’s own elegant solution to the VIX ETP puzzle.

Good Trading,
Volatility Made Simple

Wonk notes:

  1. For all dates prior to 2008, the VIX index was used in place of the VXMT index.
  2. The results of our tests are significantly worse than those presented by TTO. The biggest reason is that TTO uses the S&P 500 cash index (GSPC) to calculate historical volatility, while I use the non-dividend-adjusted S&P 500 ETF, SPY. I can find no empirically sound reason why using one in place of the other should lead to such starkly different results, so I chalk most of the performance difference up to overfitting. That isn’t intended as a gotcha, as all backtests are inherently overfit to some degree. And it isn’t to say that GSPC or SPY is better or worse than the other. It’s only to say that if I were to use GSPC here, the sole purpose would be to produce a better-looking backtest, so I’m sticking with the convention that I’ve used historically on this blog.
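
The substitution described in wonk note 1 could be handled along these lines. This is only a sketch: `backfill_vxmt` is a hypothetical helper, and the exact cutoff date of 2008-01-01 is an assumption based on “prior to 2008”.

```python
import pandas as pd


def backfill_vxmt(vxmt: pd.Series, vix: pd.Series,
                  cutoff: str = "2008-01-01") -> pd.Series:
    """Use VIX before `cutoff` and VXMT from `cutoff` onward."""
    # Align VXMT to the VIX calendar so both series cover the same dates.
    vxmt = vxmt.reindex(vix.index)
    # Keep VXMT on or after the cutoff and substitute VIX before it.
    use_vxmt = vix.index >= pd.Timestamp(cutoff)
    return vxmt.where(use_vxmt, vix)
```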