Every quantitative trading/investment model is different, so it’s impossible to write a step-by-step “how to build a model” post. However, there are some general mistakes that traders should avoid when building quantitative models.
Do not: use indicators that make no logical sense
This is the problem that a lot of hedge funds face. They wheel in a ton of PhDs and develop insanely elaborate models. The models work well for a few years and then completely fail. Why?
Because the models are fitted to mathematical patterns that have no logical basis. When you search through enough data, you will always find patterns/indicators that exist not because of some fundamental reason, but by pure random chance.
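Here’s an example: the average weight of turkeys has risen alongside the U.S. stock market.
Based on this “indicator”, accurately guessing the future weight of turkeys means that you can accurately guess the future of the U.S. stock market! In reality, these two things have no connection at all. Turkey weights have increased mainly because of selective breeding and better feed (growth hormones are not permitted in U.S. poultry), not because of anything related to stocks.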
So when you’re building a quantitative model, always step back and ask yourself if Indicator XYZ makes logical sense. “Why is it that Indicator XYZ can accurately predict the future of the market? What’s the fundamental/technical reason behind this?”
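To see how easily chance produces a “great” indicator, here is a minimal sketch in Python. It generates one random walk as a stand-in for the market and then searches hundreds of unrelated random walks for the one that correlates best with it. Everything here (the function names, the 500 candidates, the 60 data points) is an assumption made for illustration, not part of any real model.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def random_walk(rng, length):
    """A series of cumulative random steps: pure noise, no real information."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def best_spurious_correlation(n_candidates=500, length=60, seed=0):
    """Strongest absolute correlation between a random 'market' and random 'indicators'."""
    rng = random.Random(seed)
    market = random_walk(rng, length)
    return max(
        abs(pearson(random_walk(rng, length), market))
        for _ in range(n_candidates)
    )

# The more candidates you search, the better the best "indicator" looks.
print(best_spurious_correlation(n_candidates=50))
print(best_spurious_correlation(n_candidates=500))
```

None of these candidate series knows anything about the “market”, yet the best one will usually look impressive. That is why the logic check has to come before the statistics.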
Do not: use a limited amount of data
I firmly believe that the more historical data that your model is built upon, the better. In particular, make sure that your model has BOTH bull market data AND bear market data.
If your model is completely built upon bull market data, your model will be annihilated when the next bear market comes along.
This is why I trade the S&P 500. Our Medium-Long Term Model is built on 67 years of historical data, and we have also tested the model on the Dow Jones Industrial Average (going back to 1897). This data includes massive bull markets, massive bear markets/crashes, and everything in between. That is why our model is so robust.
Models built on limited data are prone to spectacular failures. For example, Long Term Capital Management (LTCM) built models based only on data from 1990-1994 (a rally within a bull market). They traded very profitably from 1994-1998, and then lost nearly all of their capital in 1998. That’s what happens when your model is built on limited data. You don’t stress test for enough situations.
*LTCM’s founding partners included two economists who went on to win the Nobel Prize in economics. Nobody is better than them at mathematical modelling.
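One simple way to check whether a data set contains a bear market at all is to measure its largest peak-to-trough decline. The sketch below assumes the common convention that a bear market is a drop of 20% or more from a peak. The function names and the sample prices are made up for illustration.

```python
def max_drawdown(prices):
    """Largest peak-to-trough decline in the series, as a fraction of the peak."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)
        worst = max(worst, (peak - price) / peak)
    return worst

def covers_bear_market(prices, threshold=0.20):
    """True if the data contains a decline of at least `threshold` (assumed: 20%)."""
    return max_drawdown(prices) >= threshold

# Hypothetical price histories, not real market data.
bull_only = [100, 110, 105, 120, 130]
full_cycle = [100, 120, 90, 110]

print(covers_bear_market(bull_only))   # only a shallow pullback
print(covers_bear_market(full_cycle))  # includes a 25% decline
```

If a check like this comes back false, the model has never seen a bear market, no matter how many data points it was built on.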
Do not: use the same model for all markets
There is no holy grail to trading/investing. There is no one-size-fits-all model that can consistently and profitably trade all markets. A model should be tailor-made for each market that you trade.
This is because the drivers behind every market are different. For example, the U.S. stock market is ultimately driven by:
- The U.S. economy
- U.S. corporate earnings
Meanwhile, a market such as gold/silver is impacted by inflation, money flows, and safe haven plays.
Do not: use too many indicators
This is more of a personal preference. A model with, say, 40+ indicators should be cut down. An ideal model has 15-20 indicators.
I firmly believe in “simplicity is the ultimate sophistication”. The problem with having too many indicators is that the indicators will often conflict with each other.
All you need is one indicator to reflect one idea. Let’s say you want to employ “momentum” in your model. Do you really need to use both RSI and MACD? No! These 2 indicators end up giving a similar result because they are both momentum indicators. Pick one and simplify.
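Here is a rough way to spot indicators that are saying the same thing: keep an indicator only if it is not highly correlated with one you have already kept. This is a sketch under assumptions. The 0.8 cutoff, the function names, and the sample readings are all hypothetical, and correlation is only one possible measure of overlap.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def prune_redundant(indicators, cutoff=0.8):
    """Keep indicators in order, skipping any that closely track a kept one."""
    kept = {}
    for name, series in indicators.items():
        if all(abs(pearson(series, other)) < cutoff for other in kept.values()):
            kept[name] = series
    return list(kept)

# Made-up readings: two momentum-style indicators and one unrelated indicator.
indicators = {
    "momentum_a": [1, 2, 3, 4, 5, 6],
    "momentum_b": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "valuation": [3, 1, 4, 1, 5, 2],
}

print(prune_redundant(indicators))
```

Because the two momentum series move almost in lockstep, only the first one survives, while the unrelated indicator is kept.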
Do not: weight all indicators equally
This is a contentious point. Some people argue “if an indicator is good enough to be used in the model, it should be weighted equally to all the other indicators”. I respectfully disagree.
Sometimes 2 indicators will send opposing signals (e.g. one will be very bearish and the other will be very bullish). One signal must ultimately override the other, and it should be the signal from the more important indicator. That is only possible if the more important indicator carries more weight.
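As a minimal sketch of what weighting does, suppose each indicator outputs a signal between -1 (very bearish) and +1 (very bullish), and the model takes a weighted average. The indicator names and the weights below are hypothetical. How to choose real weights is a separate question that this example does not answer.

```python
def combined_signal(signals, weights):
    """Weighted average of indicator signals, each between -1 and +1."""
    total_weight = sum(weights[name] for name in signals)
    weighted_sum = sum(weights[name] * value for name, value in signals.items())
    return weighted_sum / total_weight

# Two indicators in direct conflict (made-up readings).
signals = {"valuation": -1.0, "sentiment": 1.0}

equal_weights = {"valuation": 1, "sentiment": 1}
unequal_weights = {"valuation": 3, "sentiment": 1}

print(combined_signal(signals, equal_weights))    # the signals cancel out
print(combined_signal(signals, unequal_weights))  # the heavier indicator wins
```

With equal weights the two signals cancel and the model says nothing. With unequal weights the more important indicator decides the direction.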
Do not: over-fit the data
Over-fitting the data is how most models fail. You always want to give your indicators some leeway. Here’s an example.
Let’s say historically, the market rallied every time RSI fell to 23 or lower.
- You shouldn’t have your model say: “buy when RSI falls to 23 or lower”. That is over-fitting the data: the next rally might start with RSI at 24, and your model would miss it.
- You should have your model say “buy when RSI falls to 25 or lower” or “buy when RSI falls to 30 or lower”.
Give your indicators some leeway.
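The RSI example can be sketched in a few lines of Python. The RSI lows below are made-up numbers standing in for future market bottoms, and the function names are hypothetical. The point is only to show how an exact-fit threshold misses signals that a looser one catches.

```python
def buy_signal(rsi, threshold):
    """True when RSI is at or below the buy threshold."""
    return rsi <= threshold

def signals_caught(rsi_lows, threshold):
    """Number of bottoms at which the rule would have triggered a buy."""
    return sum(buy_signal(rsi, threshold) for rsi in rsi_lows)

# Hypothetical RSI lows at future bottoms (not real data).
future_lows = [24.1, 26.5, 22.0, 29.0]

for threshold in (23, 25, 30):
    print(threshold, signals_caught(future_lows, threshold))
```

The threshold of 23 fits the past perfectly but only triggers on one of these four bottoms. The looser thresholds catch more of them.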