Thursday, February 25, 2010

Using VIX for volatility correction

Market neutral strategies often rely on a relative mispricing of two instruments. One of the challenges I've been facing is how to keep the scale of this mispricing constant over time, so that in times of high market volatility the spread has to stretch much further before a trade is initiated.
A solution I've come up with is to use the VIX index as a correction measure. It seems to work much better than a moving-window estimate based on the data itself. The correction formula I'm using is C = (100 - VIX)/100. The spread is then multiplied by C.
Upper graph: VIX; lower graph: real and corrected spreads.

Notice how the corrected spread remains stable through the end of 2008.
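For readers who want to try the correction themselves, here is a minimal Python sketch of the formula C = (100 - VIX)/100. The function name `vix_correction` and the sample numbers are made up for illustration, and it assumes the spread and the VIX closes are already aligned on the same dates.

```python
import numpy as np

def vix_correction(spread, vix):
    """Scale a spread by C = (100 - VIX) / 100, element by element."""
    spread = np.asarray(spread, dtype=float)
    vix = np.asarray(vix, dtype=float)
    c = (100.0 - vix) / 100.0   # correction factor; shrinks as VIX rises
    return spread * c

# Illustrative numbers only, not market data.
spread = np.array([1.0, 2.0, -4.0])
vix = np.array([20.0, 40.0, 80.0])
corrected = vix_correction(spread, vix)
```

With a VIX of 80 the spread is scaled down to a fifth of its raw value, which is what makes it stretch further before reaching a fixed entry threshold.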

P.S. The spread shown is a relative mispricing in time and not an actual X/Y ratio.

Wednesday, February 17, 2010

Probability mapping (2)

While building a model based on the probability mapping from the previous post, I wasn't quite satisfied with the initial results. So I took a step back to a one-dimensional state space and plotted the 5-day forecast of XLE/XOM against Bollinger %b. Zero on the x-axis corresponds to a -3 sigma deviation and 100 to +3 sigma. The y-axis shows the ratio between XLE/XOM 5 days into the future and XLE/XOM today.
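The two axes described above can be sketched in Python roughly as follows. This is a sketch under assumptions: the 20-day window is a guess (the post does not state the Bollinger period), and the function names `percent_b` and `forward_ratio` are hypothetical.

```python
import numpy as np

def percent_b(x, window=20, n_sigma=3.0):
    """Bollinger %b rescaled so that 0 = -n_sigma and 100 = +n_sigma.

    The window length is an assumption; the post does not give it.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    for t in range(window - 1, len(x)):
        w = x[t - window + 1 : t + 1]          # trailing window incl. today
        mu, sigma = w.mean(), w.std(ddof=1)
        if sigma > 0:
            lower = mu - n_sigma * sigma       # lower band
            out[t] = 100.0 * (x[t] - lower) / (2.0 * n_sigma * sigma)
    return out

def forward_ratio(x, horizon=5):
    """x[t + horizon] / x[t]; the last `horizon` entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    out[:-horizon] = x[horizon:] / x[:-horizon]
    return out
```

Here `x` would be the daily XLE/XOM ratio; plotting `forward_ratio(x)` against `percent_b(x)` gives a scatter of the kind discussed in this post.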

The figure shows just how difficult it is to forecast the movement of the ratio. Ideally, the data should follow a band from the upper left to the lower right corner. Instead, it is quite hard to see any trend. A linear fit (red line) does still reveal some trend, though. Note that around a pct_b of 50 it is exactly a coin flip between an increase and a decrease in XLE/XOM.
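A straight-line fit like the red line can be obtained with an ordinary least-squares fit. The sketch below uses `numpy.polyfit`; the helper name `fit_trend` is hypothetical, and whether the original fit was done this way is an assumption.

```python
import numpy as np

def fit_trend(pct_b, fwd_ratio):
    """Least-squares line fwd_ratio ~ slope * pct_b + intercept.

    Pairs containing NaN (warm-up period, last few days) are dropped.
    """
    pct_b = np.asarray(pct_b, dtype=float)
    fwd_ratio = np.asarray(fwd_ratio, dtype=float)
    ok = np.isfinite(pct_b) & np.isfinite(fwd_ratio)
    slope, intercept = np.polyfit(pct_b[ok], fwd_ratio[ok], 1)
    return slope, intercept
```

A negative slope is the mean-reverting case: a high %b today goes with a lower ratio five days later. The fitted value at `pct_b = 50` is `slope * 50 + intercept`, and a value near 1 there matches the coin-flip observation.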

Tuesday, February 16, 2010

Probability mapping

Here is a nice idea I got while cross-country skiing this weekend. The 'classic' way of trading pairs is to define some measure of divergence from the mean, such as a z-score. Outside some threshold a buy or sell signal is triggered. This got me thinking about what we are actually doing here. In essence, we are using a linear classifier in a 1-d space. By optimizing the model, the classifier is trained for an optimal threshold value. People familiar with pattern recognition know that the linear classifier is the most basic and limited tool there is.
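To make the 'linear classifier in 1-d' point concrete, here is a minimal sketch of the classic z-score rule. The window of 20 days, the threshold of 2 and the name `zscore_signal` are illustrative assumptions, not values from this post.

```python
import numpy as np

def zscore_signal(ratio, window=20, threshold=2.0):
    """Classic pairs rule: -1 (short the ratio) when z > threshold,
    +1 (long the ratio) when z < -threshold, 0 otherwise.

    The single number `threshold` is the whole decision boundary,
    which is why this amounts to a linear classifier in one dimension.
    """
    ratio = np.asarray(ratio, dtype=float)
    signal = np.zeros(len(ratio), dtype=int)
    for t in range(window - 1, len(ratio)):
        w = ratio[t - window + 1 : t + 1]      # trailing window incl. today
        sigma = w.std(ddof=1)
        if sigma == 0:
            continue
        z = (ratio[t] - w.mean()) / sigma
        if z > threshold:
            signal[t] = -1
        elif z < -threshold:
            signal[t] = 1
    return signal
```

'Optimizing the model' then means searching over `window` and `threshold`, which only moves a single cut point along one axis.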
Having quite a background in Q-learning (my master's thesis), I understand its beautiful ability to map a state space to an expected reward, elegantly and without any models. Doesn't trading in general boil down to state-reward mapping? For sure!
I have tried different implementations of reinforcement learning without much success. But this weekend I managed to combine some ideas from pattern recognition and RL into an implementation that could work.
Some people are probably wondering by now: 'Here we go with the AI bullshit again!' I'd like to call it probability mapping. Also, many of the advanced quantitative traders are probably already using it. It's all about estimating the chances of a bet under a certain set of conditions.
First, I define the conditions, called 'feature 1' and 'feature 2'. Two features means a two-dimensional feature space. Nothing keeps us from making it more (or less) dimensional, but since my monitor can best visualize two dimensions I chose that number. In my case, both features are oscillators based on cumulative returns over the past x days. Feature 1 uses 3-day averaging and feature 2 uses 20-day averaging. Any other measure could be used (RSI, Stochastics, etc.).
In the figure above the ratio between XLE and XOM is plotted along with the two oscillators. It is clear that a high oscillator value correlates with a subsequent drop in the ratio.
Normally you could start applying threshold conditions from here based on oscillator levels, but I want to go a couple of steps further.
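The exact oscillator formula is not given in the text, so the following is only one possible construction and should be read as a guess: take the n-day cumulative log return of the ratio and rescale it to 0..100 against its own range over a trailing lookback. The function name `cum_return_oscillator` and the `lookback` of 100 days are assumptions.

```python
import numpy as np

def cum_return_oscillator(ratio, n_days, lookback=100):
    """Hypothetical 0..100 oscillator built from n-day cumulative returns.

    NOT the post's exact formula (which is not stated): the n-day log
    return is min-max scaled over the last `lookback` observations.
    """
    ratio = np.asarray(ratio, dtype=float)
    log_r = np.log(ratio)
    cum = np.full(len(ratio), np.nan)
    cum[n_days:] = log_r[n_days:] - log_r[:-n_days]   # n-day log return
    osc = np.full(len(ratio), np.nan)
    for t in range(n_days + lookback - 1, len(ratio)):
        w = cum[t - lookback + 1 : t + 1]
        lo, hi = w.min(), w.max()
        if hi > lo:
            osc[t] = 100.0 * (cum[t] - lo) / (hi - lo)
    return osc

# feature_1 = cum_return_oscillator(ratio, n_days=3)
# feature_2 = cum_return_oscillator(ratio, n_days=20)
```

Any bounded oscillator would fit the rest of the approach equally well; the only requirement is that both features live on a comparable scale so that distances in the feature space make sense.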

So now I plot my state map for all values of feature 1 and 2, along with the corresponding change in the XLE/XOM ratio after 5 days. A green dot represents an increase and a red dot a decrease of the XLE/XOM ratio.
From this map an estimate can be made of how likely it is for the ratio to go up or down for each combination of the features. Let's call it a 'Sharpe surface'. I define it similarly to the Sharpe ratio: mean(20 nearest neighbors)/std(20 nearest neighbors).

Scanning the feature space gives me the nice plot above (note that the vertical axis is flipped compared to the previous figure).
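The scan can be sketched as a brute-force nearest-neighbour search over a grid. Two things here are assumptions: that the quantity being averaged is the 5-day forward return of the ratio (future/today - 1), and that 'nearest' means Euclidean distance in the two-feature plane. The name `sharpe_surface` is hypothetical.

```python
import numpy as np

def sharpe_surface(features, outcomes, grid_axis, k=20):
    """mean/std of the outcomes of the k nearest neighbours, per grid point.

    features : (n, 2) array of (feature 1, feature 2) observations
    outcomes : (n,) array, assumed to be 5-day forward returns
    grid_axis: 1-d array of values scanned along both feature axes
    """
    features = np.asarray(features, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    if k > len(outcomes):
        raise ValueError("k cannot exceed the number of observations")
    g1, g2 = np.meshgrid(grid_axis, grid_axis, indexing="ij")
    surface = np.full(g1.shape, np.nan)
    for i in range(g1.shape[0]):
        for j in range(g1.shape[1]):
            # squared Euclidean distance from every observation to this node
            d2 = (features[:, 0] - g1[i, j]) ** 2 + (features[:, 1] - g2[i, j]) ** 2
            nearest = np.argpartition(d2, k - 1)[:k]   # indices of k closest
            vals = outcomes[nearest]
            sd = vals.std(ddof=1)
            if sd > 0:
                surface[i, j] = vals.mean() / sd
    return surface
```

With `indexing="ij"`, `surface[i, j]` belongs to feature 1 = `grid_axis[i]` and feature 2 = `grid_axis[j]`. The brute-force loop is fine for a few thousand observations; for larger data sets a k-d tree would be the usual replacement.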

The interpretation of the Sharpe surface is very simple: expect the XLE/XOM ratio to rise in the red areas and drop in the blue ones. This is very much in line with common sense: low values of either feature correspond to an anticipated increase of the ratio (see feature 2 < 20 or feature 1 < 20).
Again, normally we would use just one of the dimensions and put thresholds somewhere around 20 and 80. But with this probability mapping we can go for cherry picking!

Any remarks about the code are very welcome.