Herron Topic 2 - Practice for Section 02

FINA 6333 for Spring 2024

Author

Richard Herron

1 Announcements

  1. Project 2 is due Friday, 3/29, at 11:59 PM
  2. We will dedicate class next week to group work and easy access to me (i.e., no pre-class quiz)

2 10-Minute Recap

  1. Technical analysis studies market trends to estimate where prices may go. It looks for patterns in past price moves and trade volumes, and traders use these signals to inform their trades.
  2. There is little evidence that technical analysis works, but it provides a context for us to introduce quantitative investing strategies.
  3. We should never base trades on information we do not have at the time of the trade!

3 Practice

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import seaborn as sns
import statsmodels.api as sm
import yfinance as yf
%precision 4
pd.options.display.float_format = '{:.4f}'.format
%config InlineBackend.figure_format = 'retina'

3.1 Implement the SMA(20) strategy with Bitcoin from the lecture notebook

I suggest writing a function calc_sma() that accepts a data frame df and moving average window n. First, we need the data, from ticker BTC-USD from Yahoo! Finance.

btc = (
    yf.download('BTC-USD')
    .rename_axis(columns='Variable')
)

btc.head(10)
[*********************100%%**********************]  1 of 1 completed
Variable Open High Low Close Adj Close Volume
Date
2014-09-17 465.8640 468.1740 452.4220 457.3340 457.3340 21056800
2014-09-18 456.8600 456.8600 413.1040 424.4400 424.4400 34483200
2014-09-19 424.1030 427.8350 384.5320 394.7960 394.7960 37919700
2014-09-20 394.6730 423.2960 389.8830 408.9040 408.9040 36863600
2014-09-21 408.0850 412.4260 393.1810 398.8210 398.8210 26580100
2014-09-22 399.1000 406.9160 397.1300 402.1520 402.1520 24127600
2014-09-23 402.0920 441.5570 396.1970 435.7910 435.7910 45099500
2014-09-24 435.7510 436.1120 421.1320 423.2050 423.2050 30627700
2014-09-25 423.1560 423.5200 409.4680 411.5740 411.5740 26814400
2014-09-26 411.4290 414.9380 400.0090 404.4250 404.4250 21460800

As a side note, Bitcoin trading volume has increased 100,000 times over the past decade and is volatile!

(
    btc
    [['Volume', 'Close']]
    .prod(axis=1)
    .rolling(20)
    .mean()
    .plot(logy=True)
)

plt.ylabel('20-Day Rolling Mean of Trading Volume ($)')
plt.title('Bitcoin Trading Volume Over Time')
plt.show()


Next, we write a function, since we might want to calculate simple moving averages (SMAs) more than once. The following calc_sma() function accepts:

  1. A data frame df of daily values from yfinance.download()
  2. An integer n that specifies the number of trading days in the SMA window

And returns the original data frame df plus the following columns:

  1. Return for the daily returns
  2. SMA for the n-trading-day moving average
  3. Signal for the weight on the security for each day
  4. Strategy for the return on the SMA strategy
def calc_sma(df, n):
    return (
        df
        .assign(
            Return=lambda x: x['Adj Close'].pct_change(),
            SMA=lambda x: x['Adj Close'].rolling(n).mean(),
            Signal=lambda x: np.select(
                condlist=[x['Adj Close'].shift(1)>x['SMA'].shift(1), x['Adj Close'].shift(1)<=x['SMA'].shift(1)],
                choicelist=[1, 0],
                default=np.nan
            ),
            Strategy=lambda x: x['Signal'] * x['Return']
        )
    )

Finally, we calculate the SMA(20). We can drop the last row to avoid partial trading-day returns. Recall the last row from yfinance.download() may not be at the close.

btc_sma = btc.iloc[:-1].pipe(calc_sma, n=20)

btc_sma.tail()
Variable Open High Low Close Adj Close Volume Return SMA Signal Strategy
Date
2024-03-17 65316.3438 68845.7188 64545.3164 68390.6250 68390.6250 44716864318 0.0471 66530.1934 0.0000 0.0000
2024-03-18 68371.3047 68897.1328 66594.2266 67548.5938 67548.5938 49261579492 -0.0123 67053.3545 1.0000 -0.0123
2024-03-19 67556.1328 68106.9297 61536.1797 61912.7734 61912.7734 74215844794 -0.0834 67023.7537 1.0000 -0.0834
2024-03-20 61930.1562 68115.2578 60807.7852 67913.6719 67913.6719 66792634382 0.0969 67359.5182 0.0000 0.0000
2024-03-21 67911.5859 68199.9922 64580.9180 65491.3906 65491.3906 44480350565 -0.0357 67512.0561 1.0000 -0.0357

For Bitcoin, the SMA(20) strategy works well over the full sample!

names_sma = {'Return': 'Buy-And-Hold', 'Strategy': 'SMA(20)'}

_ = (
    btc_sma
    [['Return', 'Strategy']]
    .dropna()
)

(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns=names_sma)
    .plot(logy=True)
)
plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested in Bitcoin Strategies\nat Close on {_.index[0] - pd.offsets.Day(1):%B %d, %Y}')
plt.show()

Here are the final values of $1 investments, which are difficult to read on the log scale above.

(
    btc_sma
    [['Return', 'Strategy']]
    .rename_axis(columns='Strategy')
    .rename(columns=names_sma)
    .add(1)
    .prod()
)
Strategy
Buy-And-Hold   143.2025
SMA(20)        217.6557
dtype: float64

3.2 How does SMA(20) outperform buy-and-hold with this sample?

Consider the following:

  1. Does SMA(20) avoid the worst performing days? How many of the worst 20 days does SMA(20) avoid? Try the .sort_values() or .nlargest() method.
  2. Does SMA(20) preferentially avoid low-return days? Try to combine the .groupby() method and pd.qcut() function.
  3. Does SMA(20) preferentially avoid high-volatility days? Try to combine the .groupby() method and pd.qcut() function.

The SMA(20) does well here because it avoids 17 of the 20 worst days, without avoiding the best days.

btc_sma.loc[btc_sma['Return'].nsmallest(20).index, ['Signal']].value_counts()
Signal
0.0000    17
1.0000     3
Name: count, dtype: int64
btc_sma.loc[btc_sma['Return'].nlargest(20).index, ['Signal']].value_counts()
Signal
0.0000    10
1.0000    10
Name: count, dtype: int64
(
    btc_sma
    .rename(columns=names_sma)
    .groupby('Signal')
    [['Buy-And-Hold', 'SMA(20)']]
    .describe()
    .rename_axis(columns=['Strategy', 'Statistic'])
    .stack('Strategy')
)
Statistic count mean std min 25% 50% 75% max
Signal Strategy
0.0000 Buy-And-Hold 1542.0000 0.0008 0.0401 -0.3717 -0.0138 0.0015 0.0163 0.2394
SMA(20) 1542.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
1.0000 Buy-And-Hold 1912.0000 0.0034 0.0339 -0.1409 -0.0112 0.0015 0.0175 0.2525
SMA(20) 1912.0000 0.0034 0.0339 -0.1409 -0.0112 0.0015 0.0175 0.2525

We can also use the seaborn package to visualize Signal (i.e., the portfolio weight on Bitcoin) during periods of high and low Bitcoin returns and volatility. The SMA(20) strategy is long Bitcoin about 55% of the time, with Bitcoin returns are high (bin 2) or low (bin 1).

(
    btc_sma
    [['Return', 'Signal']]
    .dropna()
    .assign(Return_Bin=lambda x: 1 + pd.qcut(x['Return'], q=2, labels=False))
    .pipe(
        sns.barplot,
        x='Return_Bin',
        y='Signal'
    )
)

plt.title('Percentage of Time SMA(20) Strategy is Long Bitcoin\nby Bitcoin Daily Return Bins')
plt.show()

(
    btc_sma
    [['Return', 'Signal']]
    .dropna()
    .assign(Volatility_Bin=lambda x: 1 + pd.qcut(x['Return'].rolling(20).std(), q=2, labels=False))
    .pipe(
        sns.barplot,
        x='Volatility_Bin',
        y='Signal'
    )
)

plt.title('Percentage of Time SMA(20) Strategy is Long Bitcoin\nby Bitcoin 20-Day Volatility Bins')
plt.show()

3.3 Implement the SMA(20) strategy with the market factor from French

We need to impute a price before we calculate SMA(20), because French does not provide a price we can use for technical analysis (TA).

ff = (
    pdr.DataReader(
        name='F-F_Research_Data_Factors_daily',
        data_source='famafrench',
        start='1900'
    )
    [0]
    .div(100)
    .assign(
        Mkt=lambda x: x['Mkt-RF'] + x['RF'],
        Price=lambda x: x['Mkt'].add(1).cumprod()
    )
)
C:\Users\r.herron\AppData\Local\Temp\ipykernel_25516\2105953079.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
  pdr.DataReader(
ff_sma = (
    ff
    .rename(columns={'Price': 'Adj Close'})
    .pipe(calc_sma, n=20)
)

ff_sma.tail()
Mkt-RF SMB HML RF Mkt Adj Close Return SMA Signal Strategy
Date
2024-01-25 0.0046 0.0004 0.0056 0.0002 0.0048 11969.2170 0.0048 11716.9236 1.0000 0.0048
2024-01-26 -0.0002 0.0040 -0.0027 0.0002 0.0000 11969.4564 0.0000 11727.1312 1.0000 0.0000
2024-01-29 0.0085 0.0107 -0.0059 0.0002 0.0087 12073.8301 0.0087 11742.4928 1.0000 0.0087
2024-01-30 -0.0013 -0.0126 0.0084 0.0002 -0.0011 12060.7903 -0.0011 11759.6086 1.0000 -0.0011
2024-01-31 -0.0174 -0.0092 -0.0030 0.0002 -0.0172 11853.5860 -0.0172 11770.3368 1.0000 -0.0172
_ = ff_sma[['Return', 'Strategy']].dropna()

(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns=names_sma)
    .plot(logy=True)
)
plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested in Market\nat Close on {_.index[0] - pd.offsets.BDay(1):%B %d, %Y}')
plt.show()

3.4 How often does SMA(20) outperform buy-and-hold with 1-year rolling windows with Bitcoin?

_ = (
    btc_sma
    [['Return', 'Strategy']]
    .pipe(np.log1p)
    .rolling(365)
    .sum()
    .pipe(np.expm1)
)

(
    _
    .dropna()
    .mul(100)
    .rename(columns=names_sma)
    .rename_axis(columns='Strategy')
    .plot()
)

plt.title('One-Year Rolling Returns for Bitcoin')
plt.ylabel('One-Year Rolling Return (%)')

print(
    f'{names_sma['Strategy']} > {names_sma['Return']} for One-Year Rolling Window:',
    (_['Strategy'] > _['Return']).value_counts(),
    sep='\n'
)
SMA(20) > Buy-And-Hold for One-Year Rolling Window:
False    2111
True     1363
Name: count, dtype: int64

3.5 Implement a long-only BB(20, 2) strategy with Bitcoin

More on Bollinger Bands here and here. In short, Bollinger Bands are bands around a trend, typically defined in terms of simple moving averages and volatilities. Here, long-only BB(20, 2) implies we have upper and lower bands at 2 standard deviations above and below SMA(20):

  1. Buy when the closing price crosses LB(20) from below, where LB(20) is SMA(20) minus 2 sigma
  2. Sell when the closing price crosses UB(20) from above, where UB(20) is SMA(20) plus 2 sigma
  3. No short-selling

The long-only BB(20, 2) is more difficult to implement than the long-only SMA(20) because we need to track buys and sells. For example, if the closing price is between LB(20) and BB(20), we need to know if our last trade was a buy or a sell. Further, if the closing price is below LB(20), we can still be long because we sell when the closing price crosses UB(20) from above.

Here is a function to calculate Bolling Bands and implement the strategy.

def calc_bb(df, m, n):
    return (
        btc
        .assign(
            Return=lambda x: x['Adj Close'].pct_change(),
            SMA=lambda x: x['Adj Close'].rolling(m).mean(), # m-day simple moving averge
            SMV=lambda x: x['Adj Close'].rolling(m).std(), # m-day simple moving volatility
            LB=lambda x: x['SMA'] - n*x['SMV'], # lower band is SMA - n*SMV
            UB=lambda x: x['SMA'] + n*x['SMV'], # upper band is SMA + n*SMV
            Signal_with_nan=lambda x: np.select(
                condlist=[
                    # long when price>LB yesterday but price<=LB two days ago
                    (x['Adj Close'].shift(1)>x['LB'].shift(1)) & (x['Adj Close'].shift(2)<=x['LB'].shift(2)),
                    # neutral when price<UB yesterday but price>=UB two days ago
                    (x['Adj Close'].shift(1)<x['UB'].shift(1)) & (x['Adj Close'].shift(2)>=x['UB'].shift(2)),
                ],
                choicelist=[
                    1,
                    0
                ],
                default=np.nan
            ),
            Signal=lambda x: x['Signal_with_nan'].ffill(), # carryforward last trade decisions
            Strategy=lambda x: x['Signal'] * x['Return']
        )
    )

Again, we drop the last day to omit partial returns.

btc_bb = btc.iloc[:-1].pipe(calc_bb, m=20, n=2)

btc_bb.tail()
Variable Open High Low Close Adj Close Volume Return SMA SMV LB UB Signal_with_nan Signal Strategy
Date
2024-03-18 68371.3047 68897.1328 66594.2266 67548.5938 67548.5938 49261579492 -0.0123 67053.3545 3615.2985 59822.7575 74283.9515 NaN 0.0000 -0.0000
2024-03-19 67556.1328 68106.9297 61536.1797 61912.7734 61912.7734 74215844794 -0.0834 67023.7537 3656.6873 59710.3790 74337.1284 NaN 0.0000 -0.0000
2024-03-20 61930.1562 68115.2578 60807.7852 67913.6719 67913.6719 66792634382 0.0969 67359.5182 3392.3919 60574.7343 74144.3020 NaN 0.0000 0.0000
2024-03-21 67911.5859 68199.9922 64580.9180 65491.3906 65491.3906 44480350565 -0.0357 67512.0561 3223.9829 61064.0903 73960.0218 NaN 0.0000 -0.0000
2024-03-22 65492.0781 66613.6719 62865.6641 62865.6641 62865.6641 42267766784 -0.0401 67553.8469 3153.8337 61246.1795 73861.5142 NaN 0.0000 -0.0000
names_bb = {'Return': 'Buy-And-Hold', 'Strategy': 'BB(20, 2)'}

_ = (
    btc_bb
    [['Return', 'Strategy']]
    .dropna()
)

(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns=names_bb)
    .plot(logy=True)
)
plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested in Bitcoin\nat Close on {_.index[0] - pd.offsets.Day(1):%B %d, %Y}')
plt.show()

Here are the final values of $1 investments, which are difficult to read on the log scale above.

(
    btc_bb
    [['Return', 'Strategy']]
    .rename(columns=names_bb)
    .rename_axis(columns='Strategy')
    .add(1)
    .prod()
)
Strategy
Buy-And-Hold   137.4612
BB(20, 2)        1.8011
dtype: float64

It might be helpful to plot a few months of Bitcoin prices and signals to better understand the BB(20, 2) strategy

fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True)
_ = btc_bb.loc['2023-02':'2023-03']

ax[0].plot(_[['Adj Close']], label='Price')
ax[0].plot(_[['SMA']], label='SMA(20)')
ax[0].plot(_[['UB']], label='LB and UB', color='green', linestyle=':')
ax[0].plot(_[['LB']], color='green', linestyle=':')
ax[0].legend()
ax[0].set_ylabel('Price ($)')

ax[1].plot(_[['Signal']], label='Signal')
ax[1].legend()
ax[1].set_ylabel('Signal')

ax[0].yaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
fig.autofmt_xdate()

plt.suptitle('Key Variables in Bitcoin BB(20, 2) Strategy')
plt.show()


3.6 Implement a long-short RSI(14) strategy with Bitcoin

From Fidelity:

The Relative Strength Index (RSI), developed by J. Welles Wilder, is a momentum oscillator that measures the speed and change of price movements. The RSI oscillates between zero and 100. Traditionally the RSI is considered overbought when above 70 and oversold when below 30. Signals can be generated by looking for divergences and failure swings. RSI can also be used to identify the general trend.

Here is the RSI formula: \(RSI(n) = 100 - \frac{100}{1 + RS(n)}\), where \(RS(n) = \frac{SMA(U, n)}{SMA(D, n)}\). For “up days”, \(U = \Delta Adj\ Close\) and \(D = 0\), and, for “down days”, \(U = 0\) and \(D = - \Delta Adj\ Close\). Therefore, \(U\) and \(D\) are always non-negative. We can learn more about RSI here.

We will implement a long-short RSI(14) as follows:

  1. Enter a long position when the RSI crosses 30 from below, and exit the position when the RSI crosses 50 from below
  2. Enter a short position when the RSI crosses 70 from above, and exit the position when the RSI crosses 50 from above
def calc_rsi(df, n, lb, mb, ub):
    return df.assign(
        Return=lambda x: x['Adj Close'].pct_change(),
        Diff=lambda x: x['Adj Close'].diff(),
        U=lambda x: np.select(
            condlist=[x['Diff'] >= 0, x['Diff'] < 0],
            choicelist=[x['Diff'], 0],
            default=np.nan
        ),
        D=lambda x: np.select(
            condlist=[x['Diff'] <= 0, x['Diff'] > 0],
            choicelist=[-1 * x['Diff'], 0],
            default=np.nan
        ),
        SMAU=lambda x: x['U'].rolling(n).mean(),
        SMAD=lambda x: x['D'].rolling(n).mean(),
        RS=lambda x: x['SMAU'] / x['SMAD'],
        RSI=lambda x: 100 - 100 / (1 + x['RS']),
        Signal_with_nan = lambda x: np.select(
            condlist=[
                (x['RSI'].shift(1) >= lb) & (x['RSI'].shift(2) < lb), 
                (x['RSI'].shift(1) >= mb) & (x['RSI'].shift(2) < mb),
                (x['RSI'].shift(1) <= ub) & (x['RSI'].shift(2) > ub), 
                (x['RSI'].shift(1) <= mb) & (x['RSI'].shift(2) > mb),
            ],
            choicelist=[
                1, 
                0,
                -1,
                0
            ],
            default=np.nan
        ),
        Signal=lambda x: x['Signal_with_nan'].ffill(),
        Strategy=lambda x: x['Signal'] * x['Return']
    )

Again, we drop the last day to omit partial returns.

btc_rsi = btc.iloc[:-1].pipe(calc_rsi, n=14, mb=2, lb=30, ub=70)

btc_rsi.tail()
Variable Open High Low Close Adj Close Volume Return Diff U D SMAU SMAD RS RSI Signal_with_nan Signal Strategy
Date
2024-03-17 65316.3438 68845.7188 64545.3164 68390.6250 68390.6250 44716864318 0.0471 3075.5078 3075.5078 0.0000 1297.3906 924.3011 1.4036 58.3965 NaN -1.0000 -0.0471
2024-03-18 68371.3047 68897.1328 66594.2266 67548.5938 67548.5938 49261579492 -0.0123 -842.0312 0.0000 842.0312 928.6018 984.4461 0.9433 48.5404 NaN -1.0000 0.0123
2024-03-19 67556.1328 68106.9297 61536.1797 61912.7734 61912.7734 74215844794 -0.0834 -5635.8203 0.0000 5635.8203 928.6018 1063.4894 0.8732 46.6144 NaN -1.0000 0.0834
2024-03-20 61930.1562 68115.2578 60807.7852 67913.6719 67913.6719 66792634382 0.0969 6000.8984 6000.8984 0.0000 1192.5513 1063.4894 1.1214 52.8604 NaN -1.0000 -0.0969
2024-03-21 67911.5859 68199.9922 64580.9180 65491.3906 65491.3906 44480350565 -0.0357 -2422.2812 0.0000 2422.2812 1134.0742 1236.5095 0.9172 47.8395 NaN -1.0000 0.0357
names_rsi = {'Return': 'Buy-And-Hold', 'Strategy': 'RSI(14)'}

_ = btc_rsi[['Return', 'Strategy']].dropna()

(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns=names_rsi)
    .plot(logy=True)
)

plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested in Bitcoin Strategies\nat Close on {_.index[0] - pd.offsets.Day(1):%B %d, %Y}')
plt.show()

Here are the final values of $1 investments, which are difficult to read on the log scale above.

(
    btc_rsi
    [['Return', 'Strategy']]
    .rename(columns=names_sma)
    .rename_axis(columns='Strategy')
    .add(1)
    .prod()
)
Strategy
Buy-And-Hold   143.2025
SMA(20)          0.0002
dtype: float64

Shorting Bitcoin has been dangerous!

We can compare all three TA strategies!

_ = (
    btc_sma[['Return', 'Strategy']]
    .join(
        btc_bb[['Strategy']].add_suffix('_BB'), 
    )
    .join(
        btc_rsi[['Strategy']].add_suffix('_RSI'), 
    )
    .dropna()
)


(
    _
    .add(1)
    .cumprod()
    .rename_axis(columns='Strategy')
    .rename(columns=
            {
                'Return': 'Buy-And-Hold', 
                'Strategy': 'SMA(20)',
                'Strategy_BB': 'BB(20, 2)',
                'Strategy_RSI': 'RSI(14)',
            }
           )
    .plot(logy=True)
)

plt.ylabel('Value ($)')
plt.title(f'Value of $1 Invested in Bitcoin Strategies\nat Close on {_.index[0] - pd.offsets.Day(1):%B %d, %Y}')
plt.show()