Statistical Tests I Run on Each Portfolio

When I first started back testing, I thought all I needed to do was find strategies that did well in the past, and then I’d be all set going forward.

While that idea is largely true, and probably a much smarter basis for trading than what I’ve seen so many other people do, I learned that there are further steps I can take to greatly improve the odds that the strategies will keep performing well.  Ultimately, the more rigorous your criteria are for approving a portfolio of strategies, the better the odds of success.

The good news for you is that you don’t need to be a statistician!  I’ve already done the statistical work and you’re welcome to enjoy the benefit of that work with me if you want.

Here are the criteria I use for all my portfolios:

  • It must have great results over a back test of at least 20 years. The reason:  we have seen a lot of different types of markets over that period.  Straight up, straight down, sideways, and just about everything in between.  I only consider strategies that can hold up through all those environments.

  • I do a “Monte Carlo Test”. We may know that the strategy did well over the last 20 years, but what if the order of wins and losses had been different?  The Monte Carlo test I run reshuffles the order of the per-transaction profits and losses 10,000 times and reports the outcome of each ordering.  With fixed-size trades the total profit is the same in every ordering, but the losing streaks along the way are not.  This allows me to see, across 10,000 variations, how often, if at all, the portfolio could have run into a catastrophic problem, and also how much capital the account would need for optimal risk/reward.  My strategies have to pass this statistical test satisfactorily in order to be approved.

  • I do statistical correlation tests. When you put a lot of strategies together into a portfolio, each one may have performed well over 20 years, but they might be highly correlated with one another.  This can pose a major problem, because if the market doesn’t go your way and suddenly you have an army of transactions all losing at the same time, it can decimate your portfolio.  So I measure the correlation between each pair of strategies in a portfolio.  I’ve weeded out a lot of good strategies with this test, which is painful for me, but it gives the portfolios much better diversity and safety.  It means that typically different transactions fire at different times, and that there are plays that work in a variety of market conditions.

  • I do out-of-sample tests. This means that instead of taking one historic sample and finding the perfect transaction entries and exits just for that period, I build the strategy on a smaller period and then run it, unchanged, on one or more separate periods that played no part in building it.  If the results are still strong in the separate period(s), then we’re cooking.

  • I do standard deviation tests on the live portfolio. Once I have the strategies for a portfolio ready to go, I trade them live, monitor the results, and compare them to the standard deviation bands from the 20-year back test.  As long as the live results stay within 2 standard deviations of the back-tested average, the portfolio passes the test and is performing as expected.

  • I make sure commissions and slippage are accounted for in my back tests. Slippage is when you enter an order to exit at, say, 2500, but your actual executed price is, say, 2500.25.  That difference is slippage.  In the case of the ES, I assumed a per-transaction cost of $12.50, which is the value of one 0.25-point tick (so $25.00 for one round-trip trade).  Based on my research and experience, that satisfactorily accounts for slippage and trading commissions.  In our case, our trades are all swing trades (roughly 50 trades per year), so these costs aren’t going to be a major factor anyway.  But they’re accounted for regardless.

  • I make sure my strategy entries are based on the SPX price, not the ES. The reason is that the ES has expirations, with multiple contracts open at any given time.  Every 3 months the current ES contract expires, and a new one becomes the “front-month” contract.  There is a price difference between the front-month contract and the next one behind it.  Because of this, every time there is a “rollover”, there is a price gap.  This causes problems when back testing, because the data shows sudden leaps in value that are “phantom” leaps caused only by the rollover.  They aren’t true changes in value.  If this sounds confusing, don’t worry!  I’m just making it clear that I’ve accounted for that issue by always tying my entries to the SPX price, which does not have expirations or rollovers.  That way I know there aren’t any phantom gains or losses in the results from the rollovers.  Also, my strategies don’t use percentages for entries or exits for the same reason.

  • I check the Sharpe ratio of my portfolio over time. This ratio indicates how consistent the returns are: it divides the average annual return (in excess of the risk-free rate) by the standard deviation of those annual returns.  The higher the ratio, the smoother the returns.  An example:  what if there were a portfolio that would have made you a millionaire over 20 years, but it had 19 years where it lost a little money and one year where it broke loose for a 10,000% gain?  That’s not the type of portfolio I’m interested in personally, because who knows if there will be another “break loose” year, and also because I want to see gains along the way.  I want one that consistently puts up the numbers.  It’s not a simple thing to accomplish, especially when trying to achieve big returns.  But I’ve taken a lot of pleasure in the challenge of accomplishing it.
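
For readers who like to see the mechanics, several of the checks above can be sketched in a few lines of Python.  This is only an illustrative sketch under stated assumptions, not the actual tooling behind these portfolios: the function names, the made-up sample numbers, the 0.7 correlation cutoff, and the choice of maximum drawdown as the measure of a “catastrophic problem” are all assumptions for the example.  Only the 10,000 reshuffles, the $25.00 round-trip cost, and the 2 standard deviation band come from the list above.

```python
import random
import statistics
from itertools import combinations

ROUND_TRIP_COST = 25.00  # from the text: $12.50 per transaction, two per round trip


def max_drawdown(pnls):
    """Largest peak-to-trough drop of the running profit total."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(pnls, runs=10_000, seed=42):
    """Shuffle the trade order `runs` times; return the drawdowns, sorted."""
    rng = random.Random(seed)
    trades = list(pnls)
    results = []
    for _ in range(runs):
        rng.shuffle(trades)
        results.append(max_drawdown(trades))
    return sorted(results)


def pearson(xs, ys):
    """Pearson correlation of two equal-length return series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlated_pairs(returns_by_strategy, limit=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above `limit`.

    The 0.7 default is an arbitrary illustrative cutoff.
    """
    flagged = []
    for a, b in combinations(sorted(returns_by_strategy), 2):
        r = pearson(returns_by_strategy[a], returns_by_strategy[b])
        if abs(r) > limit:
            flagged.append((a, b, r))
    return flagged


def sharpe_ratio(annual_returns, risk_free=0.0):
    """Mean excess annual return divided by its sample standard deviation."""
    excess = [r - risk_free for r in annual_returns]
    return statistics.fmean(excess) / statistics.stdev(excess)


def within_band(live_value, hist_mean, hist_std, k=2.0):
    """True if the live figure is within k standard deviations of history."""
    return abs(live_value - hist_mean) <= k * hist_std


if __name__ == "__main__":
    # Made-up gross results per round-trip trade, in dollars.
    gross = [400, -150, 250, -300, 600, -100, 350, -200]
    net = [g - ROUND_TRIP_COST for g in gross]  # apply costs before testing

    drawdowns = monte_carlo_drawdowns(net)
    print("95th percentile drawdown:", drawdowns[int(0.95 * len(drawdowns))])

    # Made-up monthly returns for three hypothetical strategies.
    monthly = {
        "A": [0.02, -0.01, 0.03, 0.01],
        "B": [0.04, -0.02, 0.06, 0.02],
        "C": [-0.01, 0.02, 0.00, 0.01],
    }
    print("Highly correlated pairs:", correlated_pairs(monthly))

    # The "19 small losses and one huge year" portfolio from the example above.
    lumpy = [-0.02] * 19 + [100.0]
    print("Sharpe of the lumpy portfolio:", sharpe_ratio(lumpy))
```

One way to read the Monte Carlo output is to pick a high percentile of the reshuffled drawdowns and ask whether the account could comfortably absorb a drop of that size; that is the kind of figure that can inform how much capital to hold behind a portfolio.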

It took a number of years for me to put this all together and to realize the benefit of putting my portfolios through this army of statistical tests.  The end result is a set of really shiny, polished portfolios that have survived all this rigorous statistical scrutiny and that I now happily put my money into.