In my last blog entry, I outlined several key performance metrics for judging the efficacy of a trading system: maximum drawdown, total net profit, profit factor, percent profitable, and average trade net profit. While it is easy to become focused on the outstanding performance of one of these metrics, it is critical to evaluate the metrics as a whole.
As an example, say we design a system that shows an average trade net profit of $200. At first glance this looks like a winning system. But what if we also look at the maximum drawdown and learn that the system would suffer losses bigger than our account size or risk tolerance? As another example, imagine we create a system that is 80% profitable. That sounds good, right? But what if the average trade net profit is a negative number? That can happen if the average loss is substantially larger than the average win. It's important to consider how each of the metrics is connected to the others.
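To make the second example concrete, here is a minimal Python sketch. The trade list is made up purely for illustration, and the `summarize` helper is a hypothetical name, not part of any trading platform. It shows how eight small wins and two large losses produce an 80% profitable system with a negative average trade net profit.

```python
def summarize(trades):
    """Return basic metrics for a list of per-trade net profits (in dollars)."""
    wins = [t for t in trades if t > 0]
    losses = [t for t in trades if t < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)  # expressed as a positive number
    return {
        "percent_profitable": 100.0 * len(wins) / len(trades),
        "average_trade_net_profit": sum(trades) / len(trades),
        # Profit factor = gross profit / gross loss; infinite if there are no losses.
        "profit_factor": gross_profit / gross_loss if gross_loss else float("inf"),
    }


# Illustrative data only: eight $50 winners and two $300 losers.
trades = [50] * 8 + [-300] * 2
metrics = summarize(trades)

print(metrics["percent_profitable"])        # 80.0
print(metrics["average_trade_net_profit"])  # -20.0
```

The win rate looks strong, yet each trade loses $20 on average, which is why no single metric should be read in isolation.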
When we optimize a system, we intend to improve its performance. We need to be careful, however, not to over-optimize. Over-optimizing means fitting a system so closely to historical data (excessive curve fitting) that the resulting trading plan is unreliable in actual trading. An over-optimized system will often produce excellent results on historical data (through back-testing) but will more than likely perform poorly during out-of-sample testing or live trading.
Fortunately, there are things that we can look for to help us determine if we have over-optimized a system:
- Performance metrics that are too good to be true (like profit factors in the double digits or percent profitable readings above 80%)
- Performance metrics that are based on very few trades
- A lack of correlation between results in the in-sample and out-of-sample trading periods
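The first two warning signs can be checked mechanically. The sketch below is one possible way to do it, not a standard test: the function name `overfit_warnings` is hypothetical, and the `min_trades=30` cutoff is an assumption chosen for illustration, since "very few trades" has no fixed definition. The third sign needs out-of-sample results, so it is not covered here.

```python
def overfit_warnings(profit_factor, percent_profitable, trade_count, min_trades=30):
    """Return a list of red flags suggesting a system may be over-optimized.

    The profit factor and percent profitable thresholds follow the list above;
    min_trades is an assumed cutoff, not an established rule.
    """
    warnings = []
    if profit_factor >= 10:
        warnings.append("profit factor in the double digits")
    if percent_profitable > 80:
        warnings.append("percent profitable above 80%")
    if trade_count < min_trades:
        warnings.append(f"only {trade_count} trades in the test")
    return warnings


# A back-test that looks too good: every check fires.
suspicious = overfit_warnings(profit_factor=12.5, percent_profitable=85, trade_count=14)

# A more modest back-test with plenty of trades: no warnings.
plausible = overfit_warnings(profit_factor=1.6, percent_profitable=55, trade_count=400)

print(suspicious)
print(plausible)
```

A clean result here does not prove a system is sound; it only means none of these particular red flags showed up.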
Out-of-Sample vs. In-Sample Data
One way to help evaluate a system is to set aside "out-of-sample" data on which to test the system after all optimizations have been completed. No optimizing should occur on the out-of-sample data. Ideally we will have lots of trade data to test and study when designing a system. If we do our optimizing on about two-thirds of the available data and reserve the last third as out-of-sample, we can then check the optimized system against the reserved data. If the system performs well on that out-of-sample data, that is a good sign that the system may be viable.
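Here is a minimal sketch of that split in Python, assuming the trade results are held in a list ordered from oldest to newest. The helper names and the numbers are hypothetical and exist only to show the mechanics.

```python
def split_in_out(records):
    """Split chronological records: first two-thirds in-sample, last third out-of-sample."""
    cut = len(records) * 2 // 3  # integer arithmetic avoids floating-point rounding
    return records[:cut], records[cut:]


def average_trade(trades):
    """Average net profit per trade."""
    return sum(trades) / len(trades)


# Illustrative per-trade net profits, oldest first (not real results).
trades = [120, -80, 200, -50, 90, 150, -60, 110, 70, -90, 130, -40]

in_sample, out_sample = split_in_out(trades)

# Optimize only on in_sample; touch out_sample once, after optimizing is finished.
in_avg = average_trade(in_sample)
out_avg = average_trade(out_sample)

print(len(in_sample), len(out_sample))  # 8 4
print(in_avg, out_avg)                  # 60.0 17.5
```

The split is chronological rather than random, so the out-of-sample period always comes after the data used for optimizing. In this made-up example the average trade drops from $60 to $17.50 out of sample, the kind of gap that deserves a closer look before trusting the system.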
Next week we will look at forward performance testing, the next step in confirming the efficacy of a system.