How I Used Data Science to Make Money in Sports Betting

A few years ago, I stumbled across a bunch of guys talking about sports betting on Discord. They frequently shared screenshots of four-figure wins. That got me wondering whether making money in sports betting is legit. Seeing them make money so easily, I decided that I wanted to learn this game as well.

I bought “The Everything Guide to Sports Betting” by Josh Appelbaum as my first book on sports betting. It was a solid introductory book that gave me a basic understanding of the game. After that, I needed to figure out which sport to bet on. As a kid, I was never into sports, which is why I never truly understood the rules of the vast majority of them. So I decided to go with Mixed Martial Arts (MMA), since it seems to be very popular and fighting is fairly easy to understand.

From Josh’s book, I learned that if I bet based on my intuition like most people, I’ll lose like most of them. Because of that, I knew that I needed to find my own edge: something proven and quantifiable.

I started by looking through historical fight data going back to 2010 (nearly 5,000 fights). Then I dissected the data left and right to see if anything stood out. Eventually, I found some factors that help predict the outcome of a fight. Some of these factors, or variables, are Age, Height, Significant Strike, Accuracy, Take Down, Current Streak, and Number of Days Since Last Fight. The last one is basically how active a fighter has been. If he or she has not fought for over a year or two, chances are they’re more likely to lose their upcoming fight.

From there, I built a decision tree model using these variables to predict the outcome of a fight. To avoid overfitting, I kept the model very simple, with only a handful of variables. I also did walk-forward testing for a 2-month period. In addition, I tested different parameters for my variables to check the robustness of the model.

Here are some statistics for this model assuming that I bet $100 per fight.

Notice that the win rate is very high: a whopping 66.44%! This is because the model bets a lot on favorites, which usually means the fighter ranked higher than their opponent. In the UFC, the higher-ranked fighter wins about 58% of the time. Nevertheless, if you always bet on the favorites, you will lose money over the long run. The reason is that when you bet on favorites, your payout is a lot worse than when you bet on underdogs. For example, if you bet $100 on the favorite, you can expect to win a lot less than $100. Sometimes it can be as little as $20 if it’s a huge favorite. If you bet $100 on the underdog instead, you can generally expect to win a lot more than $100. Sometimes it can be as much as 5 times your bet ($500).
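To make the payout asymmetry concrete, here is a minimal sketch of how profit follows from American (moneyline) odds. The odds of -500 and +500 are assumptions chosen to match the $20 and $500 examples above, and the function names are hypothetical.

```python
def profit_from_american_odds(odds: int, stake: float) -> float:
    """Profit (excluding the returned stake) on a winning bet at American odds."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds < 0:
        # Favorite: you must risk |odds| to win 100.
        return stake * 100 / abs(odds)
    # Underdog: you win `odds` for every 100 risked.
    return stake * odds / 100


def breakeven_probability(odds: int) -> float:
    """Win probability at which a bet at these odds has zero expected profit."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds < 0:
        return abs(odds) / (abs(odds) + 100)
    return 100 / (odds + 100)


heavy_favorite = profit_from_american_odds(-500, 100)  # wins $20 on a $100 bet
big_underdog = profit_from_american_odds(+500, 100)    # wins $500 on a $100 bet
```

The second function shows why blindly backing favorites is hard: at -500, the favorite has to win more than 5 times out of 6 just to break even.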

Anyway, another thing to point out is that the average win for my model is $69.43, while the average loss is $100. So, my wins are smaller than my losses. Thanks to a high win rate, this is not a problem though. The expected value (EV) for my model is $9.13. This means that on average, every time I bet $100, I can expect to win $9.13 or nearly 1/10 of the betting amount.
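A quick way to see why a high win rate makes up for small wins is to compute the breakeven win rate implied by the average win and average loss. This is a minimal sketch using a two-outcome simplification (every bet is either a win or a loss, with no draws or voided bets), so it is not guaranteed to reproduce the exact EV reported above. The function names are hypothetical.

```python
def expected_value(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Expected profit per bet under a simple win-or-lose model."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss


def breakeven_win_rate(avg_win: float, avg_loss: float) -> float:
    """Win rate at which expected_value() is exactly zero."""
    return avg_loss / (avg_win + avg_loss)


# Average win and loss quoted above for $100 bets.
needed = breakeven_win_rate(avg_win=69.43, avg_loss=100.0)  # about 0.59
model_win_rate = 0.6644
has_edge = model_win_rate > needed
```

With wins of $69.43 against losses of $100, a bettor needs to win roughly 59% of the time to break even, so a 66.44% win rate leaves room for a positive expectation.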

On 11/03/2021, I bought $1000 of Bitcoin to deposit to BetOnline. Because of transaction fees and Bitcoin’s price fluctuations, by the time my $1000 arrived at BetOnline, only $962.12 was left.

I used their promo code BOL1000 to get a 50% bonus, an additional $481.06. These bonuses are not free money. The catch is a 10x rollover, meaning that I have to bet around $14,431.80 before I can withdraw any money. Bookmakers love to do this because most people lose their entire deposits before they can clear the rollover.
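The bonus and rollover arithmetic can be sketched as follows. This assumes, as the $14,431.80 figure above implies, that the 10x rollover applies to the deposit plus the bonus; how a given sportsbook counts each wager toward the rollover is not modeled here. The function name is hypothetical.

```python
def bonus_and_rollover(deposit: float, bonus_rate: float, rollover_multiple: float):
    """Return (bonus, total amount that must be wagered before withdrawing)."""
    bonus = round(deposit * bonus_rate, 2)
    # Assumption: the rollover applies to deposit + bonus, not the bonus alone.
    required_wagering = round((deposit + bonus) * rollover_multiple, 2)
    return bonus, required_wagering


bonus, required_wagering = bonus_and_rollover(962.12, 0.50, 10)
# bonus is 481.06 and required_wagering is 14431.8
```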

For this experiment, I set aside $2000. If I lose it all, I’ll stop sports betting. So far, I have only deposited $1000. With the bonus money, I had roughly $1500 on BetOnline. Together with the $1000 I had not yet deposited, my total bankroll was about $2500.

In general, sizing is very important. If I size too small, my account will never grow. However, if I size too large, there is a risk of ruin even with a positive EV strategy. As I have accepted the risk of losing all this money, I decided to bet $250 per fight, which is 1/10 of my bankroll. If I lose 10 fights in a row or have a drawdown that wipes out my bankroll, I will stop sports betting.
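The trade-off between bet size and risk of ruin can be explored with a small simulation. This is only a sketch under loud assumptions: bets are independent, every win pays the same amount, the model’s 66.44% win rate holds in live betting, and the average win scales from $69.43 per $100 to $173.575 per $250. None of these are guaranteed, and the function name is hypothetical.

```python
import random


def ruin_probability(bankroll, stake, win_prob, avg_win, n_bets,
                     n_trials=2000, seed=0):
    """Estimate the chance that the balance drops below one stake within n_bets."""
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    ruined = 0
    for _ in range(n_trials):
        balance = bankroll
        for _ in range(n_bets):
            if balance < stake:
                break  # cannot afford the next flat bet
            if rng.random() < win_prob:
                balance += avg_win
            else:
                balance -= stake
        if balance < stake:
            ruined += 1
    return ruined / n_trials


# Chance of losing 10 bets in a row at the model's win rate (order of 1e-5).
p_ten_straight_losses = (1 - 0.6644) ** 10

# Estimated risk of ruin over 94 flat $250 bets from a $2500 bankroll.
estimate = ruin_probability(2500, 250, 0.6644, 173.575, n_bets=94)
```

Rerunning the function with a larger stake (say $500) or a lower win rate gives a feel for how quickly the risk grows when the size is too big or the edge is smaller than the backtest suggests.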

In November 2021, I placed my first real-money bet. Today is 7/16/2022, and I have completed my rollover after 255 days and 94 bets. Here is my result:

Notice that when I lose, I always lose the fixed amount of $250. However, when I win, most of my wins are much smaller than the risk amount ($250). As mentioned earlier, this is because I bet mostly on favorites. Anyway, below are the statistics of my live betting result not including the bonus money:

The live result is very similar to the projected result of my model. The win rate is high, the expected value is roughly 1/10 of the betting amount, and the average win is less than the average loss. This is really encouraging because it shows that the model predicted the outcomes of the fights accurately enough to generate a net profit of $2,282.63 during this period.

Here is the final balance including the bonus money:

Overall, this experience has turned out way better than I imagined. Still, my sample is small (only 94 bets), so I am skeptical of this result. It seems too good to be true… Perhaps this was all just luck, and I was simply “fooled by randomness,” as Nassim Taleb would put it. Nevertheless, I’ll continue to follow this model for a while, and we’ll see how things go. May luck be by my side, and happy sports betting!

Disclaimer: I am not a financial advisor. All information in this blog is for educational purposes only.

3 Consecutive Down Days on SPY

What is the 3 consecutive down days pattern? Just as the name describes, it’s simply a pattern in which we have 3 down days in a row. Each day, the close is lower than the previous day’s close. In other words, the trend for the last 3 days is down, and the sellers are probably in control for the short term. Below is an example of this pattern.

This pattern occurs somewhat frequently, and in all markets. Nevertheless, different markets have different tendencies. For markets that are more mean reverting, seeing this pattern could mean a buying opportunity. For markets that are more momentum driven, it could be a short-selling opportunity.

Equity indexes are widely regarded as strongly mean-reverting markets, especially the S&P 500, since it is such a broadly diversified index. As a result, after the S&P 500 makes a big move in one direction, it usually reverts back toward the average, at least briefly. In addition, indexes have a strong tendency to go up over the long run, which can definitely help us create a strategy that only trades on the long side.

To find out if there is any edge to the 3 Consecutive Down Days pattern, I ran a quantitative test to see what happened if I bought the close after 3 consecutive down days and then sold X days later. The data is for ticker SPY from 1/1/2000 to 6/21/2022. Here are the results:

Notice that I only tested exits 3, 5, 10, 15, and 20 days after entry, which should give us enough variety while avoiding overfitting. From this table, we can see that the win rate is around 60%, the average return ranges from 0.32% to 0.69%, and the median return ranges from 0.51% to 1.62%. All exits produced positive results. We can also see that the longer we hold, the better the returns seem to be.
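The test described above can be sketched in a few lines of plain Python. This is not the code behind the results table; the function names and the tiny price series are hypothetical, and the sketch assumes every qualifying day is a separate entry, so a 4-day losing streak produces two overlapping trades.

```python
def three_down_day_entries(closes, streak=3):
    """Indexes of days whose close completes `streak` consecutive lower closes."""
    entries = []
    for i in range(streak, len(closes)):
        if all(closes[j] < closes[j - 1] for j in range(i - streak + 1, i + 1)):
            entries.append(i)
    return entries


def hold_returns(closes, hold_days, streak=3):
    """Percent return from buying each signal close and selling hold_days later."""
    returns = []
    for i in three_down_day_entries(closes, streak):
        exit_index = i + hold_days
        if exit_index < len(closes):  # skip signals too close to the end of the data
            returns.append((closes[exit_index] / closes[i] - 1) * 100)
    return returns


# Made-up closing prices, only to show the mechanics.
closes = [100, 99, 98, 97, 98, 99, 96, 95, 94, 93, 95]
signals = three_down_day_entries(closes)
one_day_returns = hold_returns(closes, hold_days=1)
win_rate = 100 * sum(r > 0 for r in one_day_returns) / len(one_day_returns)
```

Running hold_returns with hold_days set to 3, 5, 10, 15, and 20 on real daily closes would give the win rate, average return, and median return for each exit.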

Now, let’s see what an equity curve for trading a strategy like this looks like.

For a simple exit, this equity curve is not bad! Of course, it’s not beating the buy and hold method. However, keep in mind that the drawdown is a lot smaller and the percentage of time in the market is also significantly lower.
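Drawdown and time in the market are the two numbers that make this comparison fair, and both are easy to compute from an equity curve. Here is a minimal sketch; the function names and the sample values are hypothetical.

```python
def max_drawdown_pct(equity):
    """Largest peak-to-trough decline of an equity curve, in percent."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak * 100)
    return worst


def time_in_market_pct(in_position):
    """Share of days with an open position; in_position is a list of booleans."""
    return 100 * sum(in_position) / len(in_position)


# Made-up equity curve and position flags, only to show the mechanics.
drawdown = max_drawdown_pct([100, 120, 90, 110, 130, 117])  # 25% drop from 120 to 90
exposure = time_in_market_pct([True, True, False, False])   # in the market half the time
```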

The data suggests that there is an edge in buying SPY after 3 consecutive down days. With a better exit and perhaps a profit target (hint), this strategy can be improved much further. However, that is beyond the scope of this blog. My goal is only to share ideas that have an edge. You will have to do your own research to improve upon these ideas.

Directional Bias Going into the Day When SPY Gaps Up 0.5% to 2%

In trading, gaps happen almost every day. Gaps indicate that something has changed overnight. The bigger the gap, the more significant it is. Some gaps allow us to make an educated guess about the direction of the day. In today’s blog, let’s look at what kind of information we can find from gaps.

Below is a table breaking down the gaps in SPY from -4% to 4% in 0.5% increments. The data is from January 2000 to February 2022. Notice that we have the number of occurrences, the average and median % change from the Open to Close, the total % change, and the % of times we closed higher.

There are a few key takeaways from this table. First of all, nearly 72% of gaps are between -0.5% and 0.5%. A gap of exactly 0% is not actually a gap, but let’s put it in this category for simplicity. Furthermore, gaps of more than 2% have a very small sample size, so no real conclusions can be drawn from them.

When we gap down between -0.5% and -2%, we tend to continue lower. The data shows that the average % change from the Open to Close for these gaps ranges from -0.08% to -0.15%. Furthermore, we closed lower 50% to 52.7% of the time. Considering that the stock market tends to go up over time, this is good information to keep in mind, but it’s probably not an edge by itself.

The real edge is when we gap up between 0.5% and 2%: we have a strong tendency to continue higher, and we do so often. The data shows that the average % change from Open to Close for these gaps ranges from 0.04% to 0.81%. The median % change from Open to Close ranges from 0.13% to 0.72%. In addition, we closed higher than the Open more than 58% of the time!! That’s pretty significant.
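A breakdown like this can be sketched with plain Python: measure each gap against the previous close, drop it into a 0.5% bucket, and summarize the Open-to-Close move per bucket. The function names and sample bars are hypothetical, and this is not the code behind the table.

```python
import math
import statistics


def gap_pct(prev_close, open_price):
    """Overnight gap as a percent of the previous close."""
    return (open_price / prev_close - 1) * 100


def gap_bucket(gap, width=0.5):
    """Lower edge of the bucket containing the gap (0.8 falls in the 0.5 bucket)."""
    return math.floor(gap / width) * width


def summarize_gaps(bars, width=0.5):
    """bars: list of (open, close) tuples in date order. Returns stats per bucket."""
    buckets = {}
    for (_, prev_close), (open_price, close) in zip(bars, bars[1:]):
        gap = gap_pct(prev_close, open_price)
        open_to_close = (close / open_price - 1) * 100
        buckets.setdefault(gap_bucket(gap, width), []).append(open_to_close)
    return {
        edge: {
            "count": len(changes),
            "avg_open_to_close": sum(changes) / len(changes),
            "median_open_to_close": statistics.median(changes),
            "pct_closed_higher": 100 * sum(c > 0 for c in changes) / len(changes),
        }
        for edge, changes in sorted(buckets.items())
    }


# Made-up (open, close) bars, only to show the mechanics.
bars = [(100.0, 100.0), (100.8, 101.5), (101.6, 101.0), (100.0, 100.4)]
table = summarize_gaps(bars)
```

One caveat of this simple bucketing: gaps that land exactly on a bucket edge can fall on either side because of floating-point rounding, so a real study may want to round the gap first.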

Below is the equity curve for a strategy of buying SPY every time it gaps up 0.5% – 2% and exiting at the end of the day.

Notice that prior to 2010, this strategy didn’t perform very well. However, since 2010, it has performed nicely in this bull market, and the equity curve is very smooth. This probably means that the tendency didn’t really exist prior to 2010.

Here are the stats for buying gaps between 0.5% and 2% on SPY from 2010 forward. You can see that the average % gain, the median % gain, and the win rate all increased significantly over this period.

On a final note, can this strategy be traded by itself? Maybe. However, that is not how I would use it. For me, it’s a good tendency to understand and use for day trading. Knowing that when SPY gaps up 0.5% to 2%, we often continue higher, I can have a clear upward directional bias for the day. I can use this piece of information to structure my entries around the Open price. I can trade breakouts to the upside more aggressively or buy pullbacks after the Open with much more confidence. When SPY gaps up 0.5% to 2%, my focus will be mainly on going long instead of going short.

Average True Range – The Secrets That Professional Day Traders Don’t Want You to Know.

Average true range (ATR) is a volatility indicator originally developed by J. Welles Wilder. The indicator describes the degree of price volatility. The higher the ATR, the more volatile a stock is.

We often hear traders talk about ATR, but how do we use it? Well, today I will share with you how I use ATR for day trading.

Below is a study I did on SPY with a 10-day ATR. You can use a 14-period or 20-period ATR instead; it won’t make a big difference. The data is on SPY from January 2000 to February 2022. In this study, I compared the intraday range against the previous day’s ATR, because going into the day, we only know the previous day’s ATR. Here is the breakdown of the % of days when the intraday range is less than or equal to a given multiple of the daily 10-period ATR.

As you can see from the figure above, 66% of the days have a range that is less than 1 ATR. Nearly 82% of the days have a range that is within 1.2 ATR! At 1.4 ATR, that captures more than 90% of the days!! In other words, daily ATR can be a very helpful tool for us to determine the high/low of the day.
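A study like this can be sketched as follows. Two assumptions to flag: the ATR here is a simple average of the last 10 true ranges (Wilder’s original ATR uses his own smoothing method instead), and the function names and sample bars are hypothetical rather than the code behind the figure.

```python
def true_ranges(bars):
    """bars: list of (high, low, close). Returns the true range of each bar from the second on."""
    ranges = []
    for (_, _, prev_close), (high, low, _) in zip(bars, bars[1:]):
        ranges.append(max(high - low, abs(high - prev_close), abs(low - prev_close)))
    return ranges


def range_to_prior_atr(bars, period=10):
    """Each day's high-low range divided by the ATR known at the previous close."""
    trs = true_ranges(bars)  # trs[k] is the true range of bars[k + 1]
    ratios = []
    for day in range(period + 1, len(bars)):
        # True ranges of the `period` bars ending the day before `day`.
        window = trs[day - period - 1:day - 1]
        atr = sum(window) / period
        high, low, _ = bars[day]
        ratios.append((high - low) / atr)
    return ratios


def pct_within(ratios, multiple):
    """Percent of days whose range was at most `multiple` times the prior ATR."""
    return 100 * sum(r <= multiple for r in ratios) / len(ratios)


# Made-up bars with a constant 2-point range, only to show the mechanics.
bars = [(101.0, 99.0, 100.0)] * 13
ratios = range_to_prior_atr(bars, period=10)
share_within_one_atr = pct_within(ratios, 1.0)
```

Calling pct_within with 1.0, 1.2, and 1.4 on real SPY bars would produce the kind of breakdown discussed above.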

So, how do we use daily ATR for day trading? For example, suppose the S&P 500 has been selling off all morning. It is trading at the low of the day, and its intraday range is currently 1.2 ATR at 12 PM. Knowing that nearly 82% of days have a range of less than 1.2 ATR, and only 18% of days have a range of more than 1.2 ATR, it’s probably a good spot to be thinking about buying. There is still a lot of time left in the day, so more often than not there will be at least some kind of pullback in the S&P 500.

To summarize, ATR can be a very effective indicator to use for day trading. By itself, it is not nearly enough to take a trade. However, if you combine ATR with technical analysis and other edges, it can boost your results tremendously.

When does the S&P 500 usually set the high or low of the day?

For day traders, knowing the time of day when the market sets the high of the day (HOD) or low of the day (LOD) is very important. This information is useful in many ways, from knowing when we should be participating in HOD/LOD breakout trades to knowing when we should be fading them. We can also use it for stop placement. For example, if we know that by 2 PM there is a 68% chance that we have already seen the low of the day, then we can use the low of the day as our stop loss and expect that about 2 out of 3 times, we won’t get stopped out. Of course, using this information in live trading is much more complicated than that. Nevertheless, it’s still very useful information that can improve our results.

Below is the time of day when the S&P 500 makes its high and low of the day. The data is from 2000 to February 2022 and is broken down into 30-minute windows.

As you can see from the graph above, the HOD and LOD are usually set in the first 30 minutes or the last 30 minutes of the trading day. Another key piece of information is that we rarely make a new HOD/LOD during mid-day (from 12 to 2 PM).

Here is the cumulative % of days on which we had already seen the HOD or LOD by different times of the day. For example, at 2 PM, there is a 61.71% chance that we have already seen the HOD and a 67.98% chance that we have already seen the LOD.
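Both the 30-minute breakdown and the cumulative view can be sketched in plain Python. The function names and sample bars are hypothetical, and ties are resolved in favor of the earliest bar that reached the extreme, which is an assumption.

```python
from collections import Counter


def window_label(minutes_since_midnight, width=30):
    """Start time of the window containing a bar, e.g. 575 -> '09:30'."""
    start = (minutes_since_midnight // width) * width
    return f"{start // 60:02d}:{start % 60:02d}"


def hod_lod_windows(day_bars, width=30):
    """day_bars: list of (minutes_since_midnight, high, low) for one session."""
    hod_bar = max(day_bars, key=lambda bar: bar[1])  # first bar with the highest high
    lod_bar = min(day_bars, key=lambda bar: bar[2])  # first bar with the lowest low
    return window_label(hod_bar[0], width), window_label(lod_bar[0], width)


def cumulative_pct(labels):
    """Cumulative % of days whose extreme was set in or before each window."""
    counts = Counter(labels)
    running, result = 0, {}
    for label in sorted(counts):  # zero-padded labels sort chronologically
        running += counts[label]
        result[label] = 100 * running / len(labels)
    return result


# Made-up intraday bars for two sessions, only to show the mechanics.
day1 = [(570, 101.0, 100.0), (600, 102.0, 100.5), (930, 101.5, 99.0)]
day2 = [(575, 105.0, 103.0), (720, 104.0, 102.0)]
hod1, lod1 = hod_lod_windows(day1)
hod2, lod2 = hod_lod_windows(day2)
hod_cumulative = cumulative_pct([hod1, hod2])
```

Note that the labels are window start times, so the cumulative value for the window starting at 13:30 corresponds to "already seen by 2 PM".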

So how should one use the information above to improve their trading? Well, since we mostly make new highs and lows during the first 30 minutes and the last 30 minutes, we should look to take breakouts during those windows. Conversely, we should be fading HOD/LOD breakouts during mid-day, as the data shows that it is very rare for the S&P 500 to make a new high or low during that time.
