Herron Topic 2 - Trading Strategies

FINA 6333 for Spring 2024

Author

Richard Herron

This notebook covers trading strategies based on technical analysis in three parts:

  1. What is technical analysis?
  2. Why might trading strategies based on technical analysis work (or not work)?
  3. Implement a simple moving average (SMA) trading strategy

I based this lecture notebook on chapter 12 of Ivo Welch’s Corporate Finance textbook and chapter 2 of Eryk Lewinson’s Python for Finance Cookbook. The practice notebook will cover several other trading strategies based on technical analysis.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import statsmodels.api as sm
import yfinance as yf
%precision 4
pd.options.display.float_format = '{:.4f}'.format
%config InlineBackend.figure_format = 'retina'

1 What is technical analysis?

Technical analysis is a methodology that analyzes past market data (e.g., past prices and volume) in an attempt to forecast future price movements. If technical analysis can predict future price movements, the market is not weak-form efficient. Ivo Welch provides the three degrees of market efficiency in section 12.2 of chapter 12 of his (free) Corporate Finance textbook:

The Traditional Classification The traditional definition of market efficiency focuses on information. In the traditional classification, market efficiency comes in one of three primary degrees: weak, semi-strong, and strong.

Weak market efficiency says that all information in past prices is reflected in today’s stock prices so that technical analysis (trading based solely on historical price patterns) cannot be used to beat the market. Put differently, the market is the best technical analyst.

Semistrong market efficiency says that all public information is reflected in today’s stock prices, so that neither fundamental trading (based on underlying firm fundamentals, such as cash flows or discount rates) nor technical analysis can be used to beat the market. Put differently, the market is both the best technical and the best fundamental analyst.

Strong market efficiency says that all information, both public and private, is reflected in today’s stock prices, so that nothing — not even private insider information — can be used to beat the market. Put differently, the market is the best analyst and cannot be beat.

In this traditional classification, all finance professors nowadays believe that most U.S. financial markets are not strong-form efficient: Insider trading may be illegal, but it works. However, there are still arguments as to which markets are only semi-strong-form efficient or even only weak-form efficient.

Section 12.2 goes on to provide Welch’s own taxonomy of true, firm, mild, and nonbelievers in market efficiency. Chapter 12 summarizes market efficiency, classical finance, behavioral finance, arbitrage, limits to arbitrage, and their consequences for managers and investors. We will focus on technical analysis in this notebook, but chapter 12 is excellent.

2 Why might trading strategies based on technical analysis work or not?

2.1 …Work?

Technical analysis relies on a few ideas:

  1. Market prices and volume reflect all relevant information, so we can focus on past prices and volume instead of fundamentals and news.
  2. Market prices move in trends and patterns driven by market participants.
  3. These trends and patterns tend to repeat themselves because market participants create them.

2.2 …Or Not?

The logic above is reasonable. However, if past market prices reflect all relevant information, they should also reflect any price trends they predict. Therefore, any patterns should be self-defeating, and market prices should follow a random walk. As well, the signal-to-noise ratio in market prices is low! Still, technical analysis provides an opportunity to learn how to implement and back-test trading strategies in Python.

2.2.1 A Random Walk

In a random walk, the price tomorrow equals the price today plus a tiny drift plus noise. In math terms, a random walk is \[P_{t} = \rho P_{t-1} + m P_{t-1} + \varepsilon_t\] where \(\rho = 1\), \(m\) is a small drift term, and \(E[\varepsilon_t] = 0\). If \(\rho > 1\), prices would quickly increase, and, if \(\rho < 1\), prices would quickly decrease. Let us examine the historical record.

ff = (
    pdr.DataReader(
        name='F-F_Research_Data_Factors_daily',
        data_source='famafrench',
        start='1900'
    )
    [0]
    .assign(Mkt=lambda x: x['Mkt-RF'] + x['RF'])
    .div(100)
)

ff.head()
Mkt-RF SMB HML RF Mkt
Date
1926-07-01 0.0010 -0.0025 -0.0027 0.0001 0.0011
1926-07-02 0.0045 -0.0033 -0.0006 0.0001 0.0046
1926-07-06 0.0017 0.0030 -0.0039 0.0001 0.0018
1926-07-07 0.0009 -0.0058 0.0002 0.0001 0.0010
1926-07-08 0.0021 -0.0038 0.0019 0.0001 0.0022

We can use market returns to impute market prices relative to the last day of June 1926.

price = ff['Mkt'].add(1).cumprod()

price.tail()
Date
2024-01-25   11969.2170
2024-01-26   11969.4564
2024-01-29   12073.8301
2024-01-30   12060.7903
2024-01-31   11853.5860
Name: Mkt, dtype: float64
price.plot()
plt.title('Imputed Market Prices\nAssuming $1 on the Last Day of June 1926')
plt.ylabel('Imputed Market Price ($)')
plt.semilogy()
plt.show()
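
To make the imputation step concrete, here is a minimal sketch with three made-up daily returns (hypothetical numbers, not from the Fama-French data). Compounding one plus each return gives the value of $1 invested just before the first day, and .pct_change() recovers the returns after the first day.

```python
import pandas as pd

# Hypothetical daily returns for illustration only
toy_returns = pd.Series([0.10, -0.05, 0.02])

# Value of $1 invested just before the first return
toy_price = toy_returns.add(1).cumprod()

# Percent changes of the imputed prices recover the returns after the first day
recovered = toy_price.pct_change()

print(toy_price)
print(recovered)
```

The first recovered return is missing because the $1 starting value is not part of the imputed price series.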

We need lagged prices to estimate \(\rho\). We will add 10 lags of \(P\) to help us understand the relation between past and future prices.

price_lags = (
    pd.concat(
        objs=[price.shift(t) for t in range(11)],
        keys=[f'Lag {t}' for t in range(11)],
        names=['Price'],
        axis=1,
    )
)

price_lags.head()
Price Lag 0 Lag 1 Lag 2 Lag 3 Lag 4 Lag 5 Lag 6 Lag 7 Lag 8 Lag 9 Lag 10
Date
1926-07-01 1.0011 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1926-07-02 1.0057 1.0011 NaN NaN NaN NaN NaN NaN NaN NaN NaN
1926-07-06 1.0075 1.0057 1.0011 NaN NaN NaN NaN NaN NaN NaN NaN
1926-07-07 1.0085 1.0075 1.0057 1.0011 NaN NaN NaN NaN NaN NaN NaN
1926-07-08 1.0107 1.0085 1.0075 1.0057 1.0011 NaN NaN NaN NaN NaN NaN
(
    price_lags
    .dropna()
    .corr()
    .loc['Lag 0']
    .plot(kind='bar')
)
plt.title('Correlations with Imputed Market Price')
plt.ylabel('Correlation')
plt.show()

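But these are pairwise correlations, and each lag is itself highly correlated with the first lag. If we estimate conditional correlations by regressing today's price on all 10 lags, we see that most of the price information is in the first lag!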

_ = price_lags.dropna()
y = _['Lag 0']
X = _.drop('Lag 0', axis=1).pipe(sm.add_constant)
model = sm.OLS(endog=y, exog=X)
fit = model.fit(cov_type='HAC', cov_kwds={'maxlags': 10})
fit.summary()
OLS Regression Results
Dep. Variable: Lag 0 R-squared: 1.000
Model: OLS Adj. R-squared: 1.000
Method: Least Squares F-statistic: 1.848e+06
Date: Fri, 15 Mar 2024 Prob (F-statistic): 0.00
Time: 11:16:38 Log-Likelihood: -1.2242e+05
No. Observations: 25660 AIC: 2.449e+05
Df Residuals: 25649 BIC: 2.450e+05
Df Model: 10
Covariance Type: HAC
coef std err z P>|z| [0.025 0.975]
const 0.0590 0.143 0.412 0.680 -0.222 0.340
Lag 1 0.9473 0.035 27.080 0.000 0.879 1.016
Lag 2 0.0778 0.060 1.287 0.198 -0.041 0.196
Lag 3 -0.0342 0.038 -0.897 0.369 -0.109 0.041
Lag 4 -0.0275 0.043 -0.642 0.521 -0.112 0.057
Lag 5 0.0442 0.040 1.108 0.268 -0.034 0.122
Lag 6 -0.0657 0.043 -1.539 0.124 -0.149 0.018
Lag 7 0.1271 0.050 2.541 0.011 0.029 0.225
Lag 8 -0.1299 0.056 -2.315 0.021 -0.240 -0.020
Lag 9 0.1485 0.048 3.095 0.002 0.054 0.242
Lag 10 -0.0871 0.032 -2.705 0.007 -0.150 -0.024
Omnibus: 15996.444 Durbin-Watson: 1.993
Prob(Omnibus): 0.000 Jarque-Bera (JB): 4983693.037
Skew: -1.823 Prob(JB): 0.00
Kurtosis: 71.176 Cond. No. 8.90e+03


Notes:
[1] Standard Errors are heteroscedasticity and autocorrelation robust (HAC) using 10 lags and without small sample correction
[2] The condition number is large, 8.9e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
plt.bar(
    x=price_lags.columns[1:],
    height=fit.params[1:],
    yerr=2*fit.bse[1:]
)
plt.title('Conditional Correlations with Imputed Market Price\nBars Indicate Plus-or-Minus Two Standard Errors')
plt.ylabel('Conditional Correlation')
plt.xlabel('Price')
plt.show()
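
As a rough check on the pattern above, the following sketch simulates a random walk and repeats both calculations with three lags. The drift, volatility, sample size, and seed are assumptions chosen only to resemble daily market data, and plain least squares from numpy stands in for the statsmodels regression with HAC standard errors used above.

```python
import numpy as np

# Assumed parameters for illustration only
rng = np.random.default_rng(0)
n, m, s = 25_000, 0.0004, 0.0108

# Simulated random walk: each price is the prior price times one plus a noisy return
sim_returns = m + s * rng.standard_normal(n)
sim_price = np.cumprod(1 + sim_returns)

# Columns are lag 0, lag 1, lag 2, and lag 3 of the simulated price
n_lags = 3
lags = np.column_stack(
    [sim_price[n_lags - k : n - k] for k in range(n_lags + 1)]
)

# Pairwise correlations of lag 0 with each lag
pairwise = np.corrcoef(lags, rowvar=False)[0, 1:]

# Multiple regression of lag 0 on a constant and all lags
X = np.column_stack([np.ones(len(lags)), lags[:, 1:]])
coefs, *_ = np.linalg.lstsq(X, lags[:, 0], rcond=None)

print('Pairwise correlations:', pairwise)
print('Regression coefficients (const, lag 1, lag 2, lag 3):', coefs)
```

Even though only the first lag matters by construction, every pairwise correlation should be close to one because price levels are so persistent. The regression should put a coefficient near one on the first lag, with noisy estimates on the remaining lags.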

2.2.2 Signal-to-Noise Ratio

Recall, we can express a random walk as \(P_{t} = \rho P_{t-1} + m P_{t-1} + \varepsilon_t\). Since \(\rho = 1\), we can subtract \(P_{t-1}\) from both sides, then divide both sides by \(P_{t-1}\). This transformation expresses a random walk in terms of returns: \(r_{t-1,t} = m + e_t\), where \(e_t = \varepsilon_t / P_{t-1}\), \(E[e_t] = 0\), and \(SD[e_t] = s\), so \(E[r_{t-1, t}] = m\). We can think of the signal-to-noise ratio as \(\frac{m}{s}\). How high is this ratio?

m, s = ff['Mkt'].mean(), ff['Mkt'].std()

Here \(m\) is about 4 basis points per day!

m
0.0004

However, \(s\) is about 108 basis points per day!

s
0.0108

Therefore, the signal-to-noise ratio is less than 4 percent!

m/s
0.0392

Recall that means grow linearly with time and standard deviations grow with the square root of time. So, if we want \(\sqrt{t} \times \frac{m}{s} \geq 2\), we need \(t \geq \left(2 \times \frac{s}{m} \right)^2\) days! Even with market portfolio noise, which is diversified and low, we need at least a decade! During this decade, the true values of \(m\) and \(s\) can change!

(2 * s / m)**2 / 252
10.3088
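
The same arithmetic can be wrapped in a small helper to see how sensitive the required sample is to the inputs. This is only a sketch. The function name years_to_detect and its defaults are hypothetical, and the rounded inputs below give a somewhat different answer than the unrounded calculation above.

```python
def years_to_detect(m, s, t_stat=2.0, days_per_year=252):
    """Years of daily data needed for sqrt(t) * m / s to reach t_stat."""
    days = (t_stat * s / m) ** 2
    return days / days_per_year

# Rounded values of the daily mean and standard deviation from above
print(years_to_detect(m=0.0004, s=0.0108))

# Doubling the noise quadruples the required sample
print(years_to_detect(m=0.0004, s=0.0216))
```

Because the required time grows with the square of \(\frac{s}{m}\), a single stock with more noise than the market portfolio needs far more data to detect the same mean return.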

3 Implement a simple moving average (SMA) trading strategy

The goal of technical analysis is to “buy low, and sell high.” The \(n\)-day SMA reduces noise in market prices, removing market fluctuations and providing estimates of “true” prices. While the market price is above the SMA, the SMA rises. While the market price is below the SMA, the SMA falls. So, if we buy the stock as it crosses the SMA from below and sell the stock as it crosses the SMA from above, we mechanically buy low and sell high! Here, we will implement a long-only 20-day SMA (SMA(20)) strategy with Bitcoin:

  1. Buy when the closing price crosses SMA(20) from below
  2. Sell when the closing price crosses SMA(20) from above
  3. No short-selling

Because we will not short sell the stock, we can simplify this strategy to “long if above SMA(20), otherwise neutral”. First, we will need Bitcoin returns data.

btc = (
    yf.download(tickers='BTC-USD')
    .assign(Return = lambda x: x['Adj Close'].pct_change())
    .rename_axis(columns='Variable')
)

btc.head()
Variable Open High Low Close Adj Close Volume Return
Date
2014-09-17 465.8640 468.1740 452.4220 457.3340 457.3340 21056800 NaN
2014-09-18 456.8600 456.8600 413.1040 424.4400 424.4400 34483200 -0.0719
2014-09-19 424.1030 427.8350 384.5320 394.7960 394.7960 37919700 -0.0698
2014-09-20 394.6730 423.2960 389.8830 408.9040 408.9040 36863600 0.0357
2014-09-21 408.0850 412.4260 393.1810 398.8210 398.8210 26580100 -0.0247

Next we:

  1. Use .rolling(20).mean() to add a SMA20 column containing SMA(20) to our btc data frame
  2. Use np.select() to add a Position column containing:
    1. 1 (long) when the adjusted close is greater than SMA(20)
    2. 0 (neutral) when the adjusted close is less than (or equal to) SMA(20)
    3. We could use np.where() instead of np.select(), but using np.select() provides a more flexible framework for more complex examples
    4. We use .shift() to compare yesterday’s closing prices, avoiding a look-ahead bias
  3. Add a Strategy column containing:
    1. Return if Position == 1
    2. 0 if Position == 0
    3. We could earn the risk-free rate instead of 0 percent, but earning 0 percent simplifies this example
btc = (
    btc
    .assign(
        SMA20 = lambda x: x['Adj Close'].rolling(20).mean(),
        Position = lambda x: np.select(
            condlist=[x['Adj Close'].shift() > x['SMA20'].shift(), x['Adj Close'].shift() <= x['SMA20'].shift()],
            choicelist=[1, 0],
            default=np.nan
        ),
        Strategy = lambda x: x['Position'] * x['Return']
    )
)

btc.head(30)
Variable Open High Low Close Adj Close Volume Return SMA20 Position Strategy
Date
2014-09-17 465.8640 468.1740 452.4220 457.3340 457.3340 21056800 NaN NaN NaN NaN
2014-09-18 456.8600 456.8600 413.1040 424.4400 424.4400 34483200 -0.0719 NaN NaN NaN
2014-09-19 424.1030 427.8350 384.5320 394.7960 394.7960 37919700 -0.0698 NaN NaN NaN
2014-09-20 394.6730 423.2960 389.8830 408.9040 408.9040 36863600 0.0357 NaN NaN NaN
2014-09-21 408.0850 412.4260 393.1810 398.8210 398.8210 26580100 -0.0247 NaN NaN NaN
2014-09-22 399.1000 406.9160 397.1300 402.1520 402.1520 24127600 0.0084 NaN NaN NaN
2014-09-23 402.0920 441.5570 396.1970 435.7910 435.7910 45099500 0.0836 NaN NaN NaN
2014-09-24 435.7510 436.1120 421.1320 423.2050 423.2050 30627700 -0.0289 NaN NaN NaN
2014-09-25 423.1560 423.5200 409.4680 411.5740 411.5740 26814400 -0.0275 NaN NaN NaN
2014-09-26 411.4290 414.9380 400.0090 404.4250 404.4250 21460800 -0.0174 NaN NaN NaN
2014-09-27 403.5560 406.6230 397.3720 399.5200 399.5200 15029300 -0.0121 NaN NaN NaN
2014-09-28 399.4710 401.0170 374.3320 377.1810 377.1810 23613300 -0.0559 NaN NaN NaN
2014-09-29 376.9280 385.2110 372.2400 375.4670 375.4670 32497700 -0.0045 NaN NaN NaN
2014-09-30 376.0880 390.9770 373.4430 386.9440 386.9440 34707300 0.0306 NaN NaN NaN
2014-10-01 387.4270 391.3790 380.7800 383.6150 383.6150 26229400 -0.0086 NaN NaN NaN
2014-10-02 383.9880 385.4970 372.9460 375.0720 375.0720 21777700 -0.0223 NaN NaN NaN
2014-10-03 375.1810 377.6950 357.8590 359.5120 359.5120 30901200 -0.0415 NaN NaN NaN
2014-10-04 359.8920 364.4870 325.8860 328.8660 328.8660 47236500 -0.0852 NaN NaN NaN
2014-10-05 328.9160 341.8010 289.2960 320.5100 320.5100 83308096 -0.0254 NaN NaN NaN
2014-10-06 320.3890 345.1340 302.5600 330.0790 330.0790 79011800 0.0299 389.9104 NaN NaN
2014-10-07 330.5840 339.2470 320.4820 336.1870 336.1870 49199900 0.0185 383.8530 0.0000 0.0000
2014-10-08 336.1160 354.3640 327.1880 352.9400 352.9400 54736300 0.0498 380.2780 0.0000 0.0000
2014-10-09 352.7480 382.7260 347.6870 365.0260 365.0260 83641104 0.0342 378.7895 0.0000 0.0000
2014-10-10 364.6870 375.0670 352.9630 361.5620 361.5620 43665700 -0.0095 376.4225 0.0000 -0.0000
2014-10-11 361.3620 367.1910 355.9510 362.2990 362.2990 13345200 0.0020 374.5964 0.0000 0.0000
2014-10-12 362.6060 379.4330 356.1440 378.5490 378.5490 17552800 0.0449 373.4162 0.0000 0.0000
2014-10-13 377.9210 397.2260 368.8970 390.4140 390.4140 35221400 0.0313 371.1474 1.0000 0.0313
2014-10-14 391.6920 411.6980 391.3240 400.8700 400.8700 38491500 0.0268 370.0306 1.0000 0.0268
2014-10-15 400.9550 402.2270 388.7660 394.7730 394.7730 25267100 -0.0152 369.1906 1.0000 -0.0152
2014-10-16 394.5180 398.8070 373.0700 382.5560 382.5560 26990000 -0.0309 368.0971 1.0000 -0.0309
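
The same steps are easier to trace by hand on a tiny example. The sketch below uses eight made-up closing prices and a 3-day SMA in place of SMA(20). The prices and the column name SMA3 are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for illustration only
toy = pd.DataFrame({'Close': [10.0, 11.0, 12.0, 11.0, 9.0, 8.0, 10.0, 12.0]})

toy = toy.assign(
    Return=lambda x: x['Close'].pct_change(),
    SMA3=lambda x: x['Close'].rolling(3).mean(),
    # Yesterday's close versus yesterday's SMA(3) decides today's position
    Position=lambda x: np.select(
        condlist=[
            x['Close'].shift() > x['SMA3'].shift(),
            x['Close'].shift() <= x['SMA3'].shift(),
        ],
        choicelist=[1, 0],
        default=np.nan,
    ),
    Strategy=lambda x: x['Position'] * x['Return'],
)

print(toy)
```

Position is missing for the first three rows because SMA3 needs three prices and .shift() adds one more day. In the fourth row, the strategy is long and loses about 8.3 percent because the prior close of 12 was above the prior SMA3 of 11, even though the price then fell to 11. Without .shift(), that row would be neutral, which is exactly the look-ahead bias the shift avoids.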

I find it helpful to plot Adj Close, SMA20, and Position for a short window with one or more crossings.

fig, ax = plt.subplots(2, 1, sharex=True)
_ = btc.loc['2023-02'].iloc[-15:]
_[['Adj Close', 'SMA20']].plot(ax=ax[0], ylabel='BTC-USD ($)')
_[['Position']].plot(ax=ax[1], ylabel='Position', legend=False)
plt.suptitle('Bitcoin SMA(20) Strategy')
plt.show()
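
Crossings can also be located without a plot, since a change in Position marks a trade. The sketch below uses a short made-up Position series (hypothetical values, not the Bitcoin data) and .diff() to flag buys and sells.

```python
import pandas as pd

# Hypothetical Position values for illustration only (1 = long, 0 = neutral)
position = pd.Series([0, 0, 1, 1, 0, 1], name='Position')

# +1 marks a move from neutral to long (buy), -1 a move from long to neutral (sell)
change = position.diff()
buys = change[change == 1].index.tolist()
sells = change[change == -1].index.tolist()

print('Buys at:', buys)
print('Sells at:', sells)
```

Because Position is built from the prior day's close and SMA, each flagged change falls on the day after the close crosses the SMA.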

We can compare the long-run performance of buy-and-hold and SMA(20).

_ = btc[['Return', 'Strategy']].dropna()

(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns={'Return': 'Buy-And-Hold', 'Strategy': 'SMA(20)'})
    .plot()
)
plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested at Close on {_.index[0] - pd.offsets.Day(1):%B %d, %Y}')
plt.show()

In the practice notebook, we will dig deeper into this strategy and others.