Friday, October 23, 2009

Why and how do we do back-testing?

              
This posting is for you, Turtle. In your previous comment you asked about the periods of historical data that you have, how they differ, and how relevant each one is.

The answer, in my opinion, is surprisingly simple. Just imagine what would happen if NASA engineers built a rocket and launched it straight away, without any simulations or refinements. Statistically, of course, we can’t rule out the chance that it would succeed on the first launch without any prior testing (I will come to that when I have the time), but we all know that the chance would be very small, maybe less than 0.00001%?

The same concept applies to pharmaceutical companies that design and develop the drugs that eventually become the medicine we take. They usually go through several phases of clinical trials, which last at least a few years, before a drug is approved for the market. The same also goes for companies that design and build cars, and the list goes on…

Now, what makes us so special that we can go live trading immediately, without any simulation or test being done to prove that the strategy we are going to use has an edge (if we actually have one and really stick to it)? There are so many variables that could go wrong when we trade live, but the least we can do is reduce the chances of failure in the things we have full control of, such as our trading strategy and its verification.

Having said that, most beginners think that once they have done the back-testing correctly, the strategy will definitely work. Unfortunately, nature just doesn’t work that way. If I were to show you now the studies I’ve done using Monte Carlo simulations (which I will later) on the possible outcomes of trading strategies once they go live, and if we were to take the outliers (extreme events) fully into consideration, you wouldn’t start trading! It is almost like expecting the drugs that pharmaceutical companies produce to pass their trials with zero side effects before being allowed onto the market. I don’t think any would pass if the trials were done properly!
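To give a feel for what such a simulation can look like, here is a minimal sketch of one common approach: resampling the trades from a back-test with replacement and looking at the spread of maximum drawdowns. This is not the study mentioned above; the trade list, the function names and the choice of drawdown as the measure are all assumptions made for illustration.

```python
import random


def max_drawdown(trade_results):
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for result in trade_results:
        equity += result
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def simulate_drawdowns(trade_results, runs=1000, seed=1):
    """Resample the trades with replacement and collect each run's max drawdown."""
    rng = random.Random(seed)
    n = len(trade_results)
    drawdowns = [max_drawdown(rng.choices(trade_results, k=n)) for _ in range(runs)]
    return sorted(drawdowns)


# Hypothetical per-trade P&L from a back-test (made-up numbers).
trades = [120, -80, 60, -80, 200, -80, -80, 150, 90, -80, 300, -80]

drawdowns = simulate_drawdowns(trades)
print("back-test drawdown:", max_drawdown(trades))
print("median simulated:  ", drawdowns[len(drawdowns) // 2])
print("95th percentile:   ", drawdowns[int(0.95 * len(drawdowns))])
print("worst simulated:   ", drawdowns[-1])
```

The single back-test gives only one ordering of the trades. The upper end of the simulated list is where the outliers sit, and that is the part most beginners never look at.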

We do statistical tests by sampling because either it is impossible to get the whole population to study, or time and other constraints make studying the whole population not worthwhile. The same goes for trading. It is not practical to back-test 50 years of data, as not many products have that long a history! Even if you take all 50 or 100 years of data, it is still a relatively small sample when you consider that our financial markets are actually still very young. What if they survive another 5000 years?

However, the more data we study, the higher the chance that the test result reflects the actual population, but it will never be 100%! Now, Turtle, if you have the historical data for FKLI & FCPO since 1996 and have verified that it is correct, why would you choose only 2006 onwards to back-test? As I have said in my earlier posts, when we test our trading strategy we should actually try very hard to make it fail, not the other way around. Only if we can’t make it fail can we be more confident of its chances of success in the future. Therefore we should use data that has all the different market cycles in it. A full market cycle is usually between 8 and 12 years at the moment. Trust me, the performance of our strategy generally degrades the longer the data we use. All systems break at some point in time.

Now, there is no hard and fast rule on exactly how to perform your statistical test, but the general rule is to separate the data into two equal periods, in this case 1996-2002 and 2003-2009. The more advanced commercial software allows you to step automatically through all the possible combinations of periods and perform the test!
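If you don’t have such software, the period bookkeeping is easy to sketch yourself. The snippet below is only an illustration: the function names are made up, and the rolling scheme (a fixed in-sample length stepped forward by the out-of-sample length) is just one of the many ways of combining periods, not necessarily what any particular package does.

```python
def split_in_half(first_year, last_year):
    """Split an inclusive year range into two equal (or near-equal) halves."""
    n_years = last_year - first_year + 1
    mid = first_year + n_years // 2          # first year of the second half
    return (first_year, mid - 1), (mid, last_year)


def rolling_windows(first_year, last_year, in_years, out_years):
    """Yield (in_sample, out_of_sample) year ranges, stepping forward by out_years."""
    start = first_year
    while start + in_years + out_years - 1 <= last_year:
        in_sample = (start, start + in_years - 1)
        out_sample = (start + in_years, start + in_years + out_years - 1)
        yield in_sample, out_sample
        start += out_years


print(split_in_half(1996, 2009))
for window in rolling_windows(1996, 2009, in_years=7, out_years=1):
    print(window)
```

With 1996 to 2009 the simple split gives back the two periods above, 1996-2002 and 2003-2009, while the rolling version produces seven in-sample/out-of-sample pairs from the same data.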

You can choose either period as your in-sample (personal taste), e.g. 1996-2002, on which you develop, test, optimize and re-test until you get the best compromise result. After that, treat the period 2003-2009 (usually called the out-of-sample) as if you had started trading at the end of 2002 with real money, and see what happens to the strategy that passed the in-sample test. If it passes with good results, then chances are good that it will continue to perform well in the future (notice I didn’t say for sure). If it performs badly, you must not change the parameters or optimize your strategy on the out-of-sample data. It just means that either you have overly curve-fitted your strategy to the in-sample data, or your strategy is not robust enough to withstand changes in the underlying structure of the market.
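The procedure can be sketched in a few lines. Everything here is an assumption for illustration: the prices are a synthetic random walk standing in for real, verified data, the moving-average rule is a toy and not a strategy I am recommending, and the return is a plain sum of bar returns with no costs or slippage.

```python
import random


def make_prices(n, seed=42):
    """Synthetic random-walk prices; a stand-in for real, verified historical data."""
    rng = random.Random(seed)
    prices = [1000.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1 + rng.gauss(0.0002, 0.01)))
    return prices


def strategy_return(prices, lookback):
    """Toy rule: hold long for the next bar when price is above its lookback average."""
    total = 0.0
    for i in range(lookback, len(prices) - 1):
        average = sum(prices[i - lookback:i]) / lookback
        if prices[i] > average:
            total += prices[i + 1] / prices[i] - 1
    return total


prices = make_prices(3500)            # roughly 14 years of daily bars
half = len(prices) // 2
in_sample, out_of_sample = prices[:half], prices[half:]

# Optimize on the in-sample half only.
candidates = [10, 20, 50, 100]
best = max(candidates, key=lambda n: strategy_return(in_sample, n))

# Run the frozen parameter on the out-of-sample half exactly once.
print("chosen lookback:", best)
print("in-sample return:", round(strategy_return(in_sample, best), 4))
print("out-of-sample return:", round(strategy_return(out_of_sample, best), 4))
```

The important part is the order of the last two steps: the lookback is picked using the first half alone, and the second half is touched only once, with that parameter frozen. If you go back and re-pick the lookback after seeing the out-of-sample number, that period has quietly become in-sample too.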

Remember, the only thing that is constant about the market is that it is constantly changing! I hope that answers your question about the differences and relevance of choosing different periods.

Do let me know if you are still confused. It is OK; trust me, it took me a few years to fully understand the concepts and apply them (without any outside guidance or help, of course).
                          
