Tuesday, January 1, 2013

Intraday mean reversion

In my previous post I concluded that close-to-close pairs trading is not as profitable today as it was before 2010. A reader pointed out that the mean-reverting nature of spreads may simply have shifted towards shorter timescales. I had the same idea, so I decided to test this hypothesis.

This time only one pair is tested: long $100 of SPY against short $80 of IWM. The backtest is performed on 30-second bar data from 11.2011 to 12.2012.
The rules are simple and similar to the strategy I tested in the last post:
if the bar return of the pair exceeds a z-score of 1 in absolute value, trade the next bar against the move.
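A minimal sketch of this rule in Python, under assumptions that are not stated in the post: the z-score is the bar P&L of the pair divided by the standard deviation of a trailing window (the 100-bar default is made up), and a signal is traded for exactly one bar. All function names are hypothetical.

```python
import statistics


def pair_pnl_per_bar(spy_returns, iwm_returns, spy_dollars=100.0, iwm_dollars=-80.0):
    """Dollar P&L of the pair for each bar: long $100 SPY, short $80 IWM."""
    return [spy_dollars * r_spy + iwm_dollars * r_iwm
            for r_spy, r_iwm in zip(spy_returns, iwm_returns)]


def next_bar_positions(pair_pnl, window=100, threshold=1.0):
    """Position held during each bar: +1 long the pair, -1 short, 0 flat.

    The z-score of bar t is its P&L divided by the standard deviation of the
    previous `window` bars (assumed lookback; the mean is not subtracted from
    the numerator). A signal on bar t sets the position for bar t + 1 only.
    """
    positions = [0] * len(pair_pnl)
    for t in range(window, len(pair_pnl) - 1):
        sigma = statistics.pstdev(pair_pnl[t - window:t])
        if sigma == 0:
            continue  # no variation in the lookback, z-score undefined
        z = pair_pnl[t] / sigma
        if z > threshold:
            positions[t + 1] = -1  # pair jumped up: bet on a move back down
        elif z < -threshold:
            positions[t + 1] = 1   # pair dropped: bet on a move back up
    return positions


def strategy_pnl(pair_pnl, positions):
    """Per-bar strategy P&L before any transaction costs."""
    return [position * pnl for position, pnl in zip(positions, pair_pnl)]
```

Summing the output of `strategy_pnl` cumulatively gives an equity curve that ignores commissions and the bid-ask spread.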
The result looks very pretty:

I would consider this to be enough proof that there is still plenty of mean reversion on the 30-second scale.
If you think that this chart is too good to be true, that is unfortunately indeed the case. No transaction costs or bid-ask spread were taken into account. In fact, I doubt that any profit would be left after subtracting all trading costs.
Still, this kind of chart is the carrot dangling in front of my nose, keeping me going...

Sunday, December 30, 2012

Is pairs trading dead?

Bad news everybody: according to my calculations (which I sincerely hope are incorrect), classical pairs trading is dead. Some people would strongly disagree, but here is what I found:

Let's take a hypothetical strategy that works on a basket of ETFs:
From these ETFs 90 unique pairs can be made. Each pair is constructed as a market-neutral spread.

Strategy rules:
On each day, for each pair, calculate the z-score based on the 25-day standard deviation.
If z-score > threshold, go short and close the next day.
If z-score < -threshold, go long and close the next day.
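The rules above could be sketched in Python roughly as follows. Several details are assumptions, since the post does not spell them out: the spread is hedged with a full-sample beta (which is in-sample), the z-score is the spread's daily return divided by its trailing 25-day standard deviation, and the default threshold of 2 is arbitrary. All function names are hypothetical.

```python
import statistics
from itertools import combinations


def beta(returns_a, returns_b):
    """Least-squares slope of A's returns on B's returns."""
    mean_a = statistics.fmean(returns_a)
    mean_b = statistics.fmean(returns_b)
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(returns_a, returns_b))
    variance_b = sum((b - mean_b) ** 2 for b in returns_b)
    return covariance / variance_b


def spread_returns(returns_a, returns_b):
    """Daily return of long A, short beta * B (beta from the full sample)."""
    hedge = beta(returns_a, returns_b)
    return [a - hedge * b for a, b in zip(returns_a, returns_b)]


def daily_signals(spread, window=25, threshold=2.0):
    """Signal at each close: -1 short the spread, +1 long, 0 no trade."""
    signals = [0] * len(spread)
    for t in range(window, len(spread)):
        sigma = statistics.stdev(spread[t - window:t])
        if sigma == 0:
            continue  # flat lookback window, z-score undefined
        z = spread[t] / sigma
        if z > threshold:
            signals[t] = -1
        elif z < -threshold:
            signals[t] = 1
    return signals


def one_day_pnl(spread, signals):
    """A position opened at the close of day t earns the spread return of day t + 1."""
    return [signals[t] * spread[t + 1] for t in range(len(spread) - 1)]


def backtest_all_pairs(returns_by_ticker, window=25, threshold=2.0):
    """Total P&L per pair, with no capital management and no transaction costs."""
    results = {}
    for a, b in combinations(sorted(returns_by_ticker), 2):
        spread = spread_returns(returns_by_ticker[a], returns_by_ticker[b])
        signals = daily_signals(spread, window, threshold)
        results[(a, b)] = sum(one_day_pnl(spread, signals))
    return results
```

Calling `backtest_all_pairs` once per threshold gives one result set per threshold, which is the kind of comparison described here.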

To keep it all simple, the calculation is done without any capital management (one can have up to 90 pairs in the portfolio on each day). Transaction costs are not taken into account either.

To put it simply, this strategy tracks the one-day mean-reverting nature of market-neutral spreads.
Here are the results simulated for several thresholds:

No matter what threshold is used, the strategy is highly profitable in 2008, pretty good through 2009 and completely worthless from early 2010 onwards.
This is not the first time I have come across this change in mean-reverting behavior in ETFs. No matter what I tried, I had no luck finding a pairs trading strategy that works on ETFs past 2010. My conclusion is that these types of simple stat-arb models just don't cut it any more.

PCA - how it really works

I suppose that my previous post did not provide much insight into how PCA really works. Here is another try at the subject, using a simple pair as an example.
Let's take SPY and IWM, which are highly correlated. If daily returns of IWM are plotted against daily returns of SPY, the relationship is highly linear (see left chart).
Applying PCA to this data gives two principal component vectors, plotted in red (first) and green (second). These two vectors are orthogonal, with the first one pointing in the direction of highest variance. The transformed data is nothing more than the original data projected onto the new coordinate axes formed by these two vectors. The transformed data is shown in the right chart. As you can clearly see, all points are still there, but the dataset is rotated.
The second vector is in this case -0.78 SPY + 0.62 IWM, which produces a market-neutral spread. Of course, nearly the same result would be achieved by using the beta of IWM.
The fun thing about PCA is that it is useful for building spreads with three or more legs. The procedure is exactly the same as above, but the transformation is done in a higher-dimensional space.
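A small NumPy sketch of the two-security case. The returns are synthetic stand-ins generated with a made-up slope and noise level, not actual SPY and IWM data, and `principal_components` is a hypothetical helper built on an eigendecomposition of the covariance matrix.

```python
import numpy as np


def principal_components(returns):
    """Rows are days, columns are securities.

    Returns (variances, components) sorted by decreasing variance;
    components[:, k] holds the weights of the k-th principal component.
    """
    centered = returns - returns.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]


# Synthetic, highly correlated daily returns (assumed parameters).
rng = np.random.default_rng(0)
spy = rng.normal(0.0, 0.01, size=1000)
iwm = 1.25 * spy + rng.normal(0.0, 0.002, size=1000)
returns = np.column_stack([spy, iwm])

variances, components = principal_components(returns)

# Projecting onto the new axes is a pure rotation of the centered data.
centered = returns - returns.mean(axis=0)
transformed = centered @ components

# Weights of the second component: one positive, one negative leg.
spread_weights = components[:, 1]
print("spread weights (SPY, IWM):", spread_weights)
```

The sign of each component is arbitrary, so the spread may come out as long SPY and short IWM or the other way round. For three or more legs only the number of columns in `returns` changes.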

Monday, December 3, 2012

Using PCA for spread trading

Classical pairs trading usually involves building a pair consisting of two legs, which ideally should be market-neutral; in other words, pair returns should have zero correlation with market returns. The process of building a 'good' pair is pretty standard: choose two correlated securities and form a market-neutral pair using stock betas.

Multi-legged spreads are more advanced and very difficult to build using the traditional method.
However, there is a mathematical method called Principal Component Analysis that can easily be used to create stable (=tradeable?) spreads. All the linear algebra is luckily hidden inside the princomp function, but if you'd like to understand how PCA really works, take a look at this tutorial. The transformed data can be described as:
  • 1st component: 'max volatility portfolio', which is usually very highly correlated with the market.
  • 2nd component: 'market-neutral' portfolio, having the maximum remaining variance.
  • 3rd and further components have decreasing degrees of variance.
Note that by design, PCA produces orthogonal components, meaning that the portfolios are uncorrelated with each other over the period used for the fit. So the 2nd and further portfolios are market-neutral to the extent that the first component tracks the market.

Here is an example of applying PCA to some correlated ETFs in the energy sector:
The upper chart shows raw prices, the lower chart shows the cumulative returns of the principal components. To compute the principal components I only used the first 250 days of data. It seems that the principal components, which are linear combinations of the individual security returns, are quite stable out-of-sample, which is a pleasant surprise. The first (blue) component has most of the variance, and it is clearly correlated with the movement of the prices in the upper chart.
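The in-sample fit and out-of-sample projection could look roughly like this in NumPy. The four return series are synthetic (one common factor plus noise, with made-up loadings) and stand in for the energy ETFs; `fit_pca` is a hypothetical helper that uses an SVD instead of the princomp function mentioned above.

```python
import numpy as np


def fit_pca(returns):
    """Principal component weights as columns, ordered by decreasing variance."""
    centered = returns - returns.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt.T


# Synthetic daily returns: a shared 'market' factor plus idiosyncratic noise.
rng = np.random.default_rng(1)
n_days, n_fit = 750, 250
market = rng.normal(0.0, 0.015, size=(n_days, 1))
loadings = np.array([[1.0, 0.9, 1.1, 1.2]])  # assumed factor loadings
returns = market * loadings + rng.normal(0.0, 0.004, size=(n_days, 4))

# Fit on the first 250 days only, then apply the fixed weights to all days.
components = fit_pca(returns[:n_fit])
component_returns = returns @ components

# Cumulative (summed) returns of each component portfolio, ready for plotting.
cumulative = component_returns.cumsum(axis=0)

out_of_sample = component_returns[n_fit:]
print("out-of-sample std per component:", out_of_sample.std(axis=0))
```

Because the weights are frozen after day 250, everything from row `n_fit` onwards in `cumulative` is out-of-sample.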

Let's take a closer look at the last two components: these seem to be quite stable and tradeable even far out-of-sample.

Thursday, September 27, 2012

Gap strategy with intraday data

The gap fading strategy from previous posts looked all right, but my worry is that Yahoo data does not provide accurate quotes. To check the strategy performance, I generated a new OHLC dataset based on the weighted average price (WAP) of 30-second intraday data. So the opening quote is the WAP of the first 30 seconds of trading and the close is the WAP of the last 30 seconds. To make sure that my dataset is correct, I compared it to the Yahoo quotes. As shown in the chart below, the difference between the two quotes is ~5ct, which seems very reasonable.
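A minimal sketch of how such a daily dataset could be assembled in plain Python. The input layout (a time-sorted list of `(datetime, wap)` tuples) is an assumption, as is taking the high and low from the 30-second WAPs; the post only defines the open and the close. Both function names are hypothetical.

```python
from itertools import groupby


def daily_ohlc(bars):
    """Build daily OHLC from 30-second bars.

    `bars` is a list of (datetime, wap) tuples sorted by time.
    Open is the WAP of the first bar of the day, close the WAP of the last.
    """
    result = {}
    for day, day_bars in groupby(bars, key=lambda bar: bar[0].date()):
        waps = [wap for _, wap in day_bars]
        result[day] = {
            "open": waps[0],
            "high": max(waps),   # assumption: high/low taken from the WAPs
            "low": min(waps),
            "close": waps[-1],
        }
    return result


def mean_abs_difference(own_quotes, reference_quotes):
    """Average absolute difference between two aligned quote series."""
    pairs = list(zip(own_quotes, reference_quotes))
    return sum(abs(own - ref) for own, ref in pairs) / len(pairs)
```

`mean_abs_difference` can then be used to compare, for example, the generated opens with another vendor's opens for the same dates.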
Now, testing the gap fade strategy on the OHLC data that I generated myself produces a much less favorable result:
One look at the pnl chart is enough to say that this strategy would be rubbish.
This brings me to a conclusion that I was already aware of: Yahoo opening quotes are not suitable for strategy backtesting.

Thursday, September 20, 2012

Gap strategy revisited

At the beginning of 2011 I backtested a gap fade strategy. There seemed to be an edge to fading gaps, so let's take a look at how this strategy has performed since then. Once again, the strategy rules:
  • Trade only gaps larger than 0.1%
  • Enter on the open (short for an up gap and long for a down gap). The profit target is set at the previous day's close.
  • If the profit target is not reached during the day, exit on the close
This time I corrected the data for dividends.
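The three rules can be sketched for a single day as follows. This is an illustration on daily OHLC values, not the original backtest code: it assumes the profit target fills exactly at the previous close whenever the day's range touches it, and `fade_gap_return` with its optional `cost` parameter is a hypothetical helper.

```python
def fade_gap_return(prev_close, open_price, high, low, close, min_gap=0.001, cost=0.0):
    """Return of fading one opening gap, as a fraction of the entry price.

    min_gap: minimum gap size to trade (0.001 = 0.1 %).
    cost:    round-trip transaction cost as a fraction (assumed parameter).
    """
    gap = open_price / prev_close - 1.0
    if abs(gap) <= min_gap:
        return 0.0  # gap too small, no trade

    if gap > 0:
        # Up gap: short at the open, cover at the previous close if touched.
        exit_price = prev_close if low <= prev_close else close
        return (open_price - exit_price) / open_price - cost

    # Down gap: long at the open, sell at the previous close if touched.
    exit_price = prev_close if high >= prev_close else close
    return (exit_price - open_price) / open_price - cost
```

Applying the function to each day of a dividend-adjusted OHLC series and summing the results gives the strategy's cumulative return.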

The out-of-sample results are pretty good: the strategy did well in 2011-2012.
A more realistic case includes a transaction cost of about 0.03%, which is approximately 3ct for SPY: 1ct is the IB commission and another two are needed for crossing the bid-ask spread.
The Sharpe ratio of these strategies is still not solid enough for me to actually put my money on them.

buyAndHold      0.189366
fadeUpGaps      0.508378
fadeDownGaps    0.595578
fadeAllGaps     0.783124
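For reference, one common way to compute an annualized Sharpe ratio from daily returns is sketched below. Whether the figures above were produced with exactly this convention (252 trading days, zero risk-free rate, sample standard deviation) is an assumption.

```python
import math
import statistics


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (assumed convention)."""
    sigma = statistics.stdev(daily_returns)
    if sigma == 0:
        return float("nan")  # undefined for a constant return series
    return math.sqrt(periods_per_year) * statistics.fmean(daily_returns) / sigma
```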

... and I still keep wondering how there can be an edge when there seems to be no significant correlation between the night gap and the day session change.

Wednesday, September 19, 2012

SPY opening gaps

There are quite a few people in the blogosphere claiming that gap trading is statistically profitable. Just google for 'opening gaps' or something similar to get a bunch of links. Some claims are quite interesting, stating a >70% chance of a gap closing after a 'gap up'. Well, I imagine that it is possible to have a 70% 'closed gap' statistic and still have zero edge in trading the gap. I looked into this topic about a year ago and the first results were promising.
This time I took a look at the matter in a slightly different way: looking for correlation between the overnight change of the SPY (previousClose-to-open) and the daily change (open-to-close) of the following trading session.
Below is a chart of cumulative daily percentage changes of SPY for about 3 years of data. The blue line is the overnight change and the green line the day session change. It is immediately clear that the day session is more volatile than the night session. Apart from that, there is nothing really special in this chart.
More insight comes from plotting the overnight return vs the daily return:
Judging by eye, there is no relation between the nightly change and the daily session. Testing for correlation between the two gives 0.000062 ... yes, zero. I'm not sure why my previous attempts at building a gap strategy produced positive results, but now I'm determined to get to the bottom of this...
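The correlation test can be reproduced along these lines, assuming aligned lists of daily open and close prices. The helper names are hypothetical and the Pearson coefficient is written out by hand to keep the sketch dependency-free.

```python
import math


def overnight_and_day_returns(opens, closes):
    """Split each day into overnight (previous close to open) and day session (open to close)."""
    overnight, day = [], []
    for t in range(1, len(opens)):
        overnight.append(opens[t] / closes[t - 1] - 1.0)
        day.append(closes[t] / opens[t] - 1.0)
    return overnight, day


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)
```

With `overnight, day = overnight_and_day_returns(opens, closes)`, a value of `pearson_correlation(overnight, day)` close to zero means the overnight gap carries no linear information about the following session.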