Herron Topic 1 - Practice for Section 03

FINA 6333 for Spring 2024

Author

Richard Herron

1 Announcements

DataCamp
1. Data Manipulation with pandas due by Friday, 2/9, at 11:59 PM
2. Joining Data with pandas due by Friday, 2/16, at 11:59 PM
3. Earn 10,000 XP due by Friday, 3/15, at 11:59 PM
I posted Project 1 to Canvas
1. Slides and notebook due by Friday, 2/23, at 11:59 PM
2. Keep joining teams and let me know if you need help

2 10-Minute Recap

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import yfinance as yf

%precision 4
pd.options.display.float_format = '{:.4f}'.format
%config InlineBackend.figure_format = 'retina'

First, we can easily download data in Python with two packages:

yfinance to download data from Yahoo! Finance
pandas-datareader to download data from Ken French (and FRED and others)

Second, there are “simple returns” and “log returns”

Simple returns are the returns that investors receive that we learned in FINA 6331 and FINA 6333: \(r_t = \frac{p_t + d_t - p_{t-1}}{p_{t-1}}\)
Log returns are the log of one plus simple returns. Why do we use them?
1. Log returns are additive, while simple returns are multiplicative. This additive property makes math really easy with log returns: \(\log(\prod_{t=0}^T (1 + r_t)) = \sum_{t=0}^T \log(1+r_t)\), so \(r_{0,T} = \prod_{t=0}^T (1 + r_t) - 1 = e^{\sum_{t=0}^T \log(1+r_t)} - 1\)
2. Log returns are almost normally distributed

We will almost always use simple returns. The exeception is time-consuming calculations, which we will often do in log returns to save us time.

Third, we can easily calculate portfolio returns in pandas!

returns.mean(axis=1) is the equally-weighted portfolio return for the stocks in returns, rebalanced at the same frequency as the returns in returns!
returns.dot(weights) lets us use weights in array weights instead of equally-weighted, but is still rebalanced at the same frequency as the returns in returns!

3 Practice

3.1 Download all available daily price data for tickers TSLA, F, AAPL, AMZN, and META to data frame `prices`

tickers = 'TSLA F AAPL AMZN META'
prices = yf.download(tickers=tickers)

prices

[*********************100%%**********************]  5 of 5 completed

	Adj Close					Close					...	Open					Volume
	AAPL	AMZN	F	META	TSLA	AAPL	AMZN	F	META	TSLA	...	AAPL	AMZN	F	META	TSLA	AAPL	AMZN	F	META	TSLA
Date
1972-06-01	NaN	NaN	0.2419	NaN	NaN	NaN	NaN	2.1532	NaN	NaN	...	NaN	NaN	0.0000	NaN	NaN	NaN	NaN	1091238	NaN	NaN
1972-06-02	NaN	NaN	0.2414	NaN	NaN	NaN	NaN	2.1492	NaN	NaN	...	NaN	NaN	2.1532	NaN	NaN	NaN	NaN	1174468	NaN	NaN
1972-06-05	NaN	NaN	0.2414	NaN	NaN	NaN	NaN	2.1492	NaN	NaN	...	NaN	NaN	2.1492	NaN	NaN	NaN	NaN	5209582	NaN	NaN
1972-06-06	NaN	NaN	0.2387	NaN	NaN	NaN	NaN	2.1248	NaN	NaN	...	NaN	NaN	2.1492	NaN	NaN	NaN	NaN	1424158	NaN	NaN
1972-06-07	NaN	NaN	0.2373	NaN	NaN	NaN	NaN	2.1127	NaN	NaN	...	NaN	NaN	2.1248	NaN	NaN	NaN	NaN	675088	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2024-02-05	187.6800	170.3100	11.5900	459.4100	181.0600	187.6800	170.3100	11.5900	459.4100	181.0600	...	188.1500	170.2000	12.0100	469.8800	184.2600	69668800.0000	55081300.0000	81026800	40832400.0000	134294400.0000
2024-02-06	189.3000	169.1500	12.0700	454.7200	185.1000	189.3000	169.1500	12.0700	454.7200	185.1000	...	186.8600	169.3900	11.6400	464.0000	177.2100	43490800.0000	42505500.0000	98636800	21655200.0000	122676000.0000
2024-02-07	189.4100	170.5300	12.8000	469.5900	187.5800	189.4100	170.5300	12.8000	469.5900	187.5800	...	190.6400	169.4800	12.7300	458.0000	188.1800	53439000.0000	47174100.0000	137224000	23066000.0000	111535200.0000
2024-02-08	188.3200	169.8400	12.8300	470.0000	189.5600	188.3200	169.8400	12.8300	470.0000	189.5600	...	189.3900	169.6500	12.8700	468.3200	189.0000	40962000.0000	42316500.0000	68653900	18815100.0000	83034000.0000
2024-02-09	188.8500	174.4500	12.6800	468.1100	193.5700	188.8500	174.4500	12.6800	468.1100	193.5700	...	188.6500	170.9000	12.8100	472.9500	190.1800	43728065.0000	56886544.0000	46925766	18159624.0000	84009121.0000

13035 rows × 30 columns

3.2 Calculate all available daily returns and save to data frame `returns`

returns = (
    prices['Adj Close'] # slice adj close
    .iloc[:-1] # drop the last price because it might be intraday (i.e., not a close)
    .pct_change() # calculate simple returns
)

returns

	AAPL	AMZN	F	META	TSLA
Date
1972-06-01	NaN	NaN	NaN	NaN	NaN
1972-06-02	NaN	NaN	-0.0019	NaN	NaN
1972-06-05	NaN	NaN	0.0000	NaN	NaN
1972-06-06	NaN	NaN	-0.0113	NaN	NaN
1972-06-07	NaN	NaN	-0.0057	NaN	NaN
...	...	...	...	...	...
2024-02-02	-0.0054	0.0787	0.0033	0.2032	-0.0050
2024-02-05	0.0098	-0.0087	-0.0453	-0.0328	-0.0365
2024-02-06	0.0086	-0.0068	0.0414	-0.0102	0.0223
2024-02-07	0.0006	0.0082	0.0605	0.0327	0.0134
2024-02-08	-0.0058	-0.0040	0.0023	0.0009	0.0106

13034 rows × 5 columns

3.3 Slices returns for the 2020s and assign to `returns_2020s`

returns_2020s = returns.loc['2020':] # always use an unambiguos date format, like YYYY-MM-DD

returns_2020s

	AAPL	AMZN	F	META	TSLA
Date
2020-01-02	0.0228	0.0272	0.0129	0.0221	0.0285
2020-01-03	-0.0097	-0.0121	-0.0223	-0.0053	0.0296
2020-01-06	0.0080	0.0149	-0.0054	0.0188	0.0193
2020-01-07	-0.0047	0.0021	0.0098	0.0022	0.0388
2020-01-08	0.0161	-0.0078	0.0000	0.0101	0.0492
...	...	...	...	...	...
2024-02-02	-0.0054	0.0787	0.0033	0.2032	-0.0050
2024-02-05	0.0098	-0.0087	-0.0453	-0.0328	-0.0365
2024-02-06	0.0086	-0.0068	0.0414	-0.0102	0.0223
2024-02-07	0.0006	0.0082	0.0605	0.0327	0.0134
2024-02-08	-0.0058	-0.0040	0.0023	0.0009	0.0106

1033 rows × 5 columns

3.4 Download all available data for the Fama and French daily benchmark factors to dictionary `ff_all`

I often use the following code snippet to find the exact name for the the daily benchmark factors file.

pdr.famafrench.get_available_datasets()[:5]

['F-F_Research_Data_Factors',
 'F-F_Research_Data_Factors_weekly',
 'F-F_Research_Data_Factors_daily',
 'F-F_Research_Data_5_Factors_2x3',
 'F-F_Research_Data_5_Factors_2x3_daily']

ff_all = pdr.DataReader(
    name='F-F_Research_Data_Factors_daily',
    data_source='famafrench',
    start='1900'
)

C:\Users\r.herron\AppData\Local\Temp\ipykernel_27592\2526882917.py:1: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
  ff_all = pdr.DataReader(

The DESCR key in the dictionary tells us about the data frames that pandas-datareader returns.

print(ff_all['DESCR'])

F-F Research Data Factors daily
-------------------------------

This file was created by CMPT_ME_BEME_RETS_DAILY using the 202312 CRSP database. The Tbill return is the simple daily rate that, over the number of trading days in the month, compounds to 1-month TBill rate from Ibbotson and Associates Inc. Copyright 2023 Kenneth R. French

  0 : (25649 rows x 4 cols)

3.5 Slice the daily benchmark factors, convert them to decimal returns, and assign to `ff`

ff = ff_all[0].div(100)

ff

	Mkt-RF	SMB	HML	RF
Date
1926-07-01	0.0010	-0.0025	-0.0027	0.0001
1926-07-02	0.0045	-0.0033	-0.0006	0.0001
1926-07-06	0.0017	0.0030	-0.0039	0.0001
1926-07-07	0.0009	-0.0058	0.0002	0.0001
1926-07-08	0.0021	-0.0038	0.0019	0.0001
...	...	...	...	...
2023-12-22	0.0021	0.0064	0.0009	0.0002
2023-12-26	0.0048	0.0069	0.0046	0.0002
2023-12-27	0.0016	0.0014	0.0012	0.0002
2023-12-28	-0.0001	-0.0036	0.0003	0.0002
2023-12-29	-0.0043	-0.0112	-0.0037	0.0002

25649 rows × 4 columns

3.6 Use the `.cumprod()` method to plot cumulative returns for these stocks in the 2020s

We use the .prod() method to calculate total returns, because \(r_{total} = r_{0,T} = \left[ \prod_{t=0}^T (1 + r_t) \right] -1\).

(
    returns_2020s # returns during the 2020s
    .add(1) # add 1 before we compound
    .prod() # compound all returns
    .sub(1) # subtract 1 to recover total returns
)

AAPL   1.6331
AMZN   0.8383
F      0.6090
META   1.2899
TSLA   5.7970
dtype: float64

cumret_cumprod =  (
    returns_2020s
    .add(1)
    .cumprod()
    .sub(1)
)

cumret_cumprod.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.7 Use the `.cumsum()` method with log returns to plot cumulative returns for these stocks in the 2020s

cumret_cumsum =  (
    returns_2020s # start with simple returns
    .pipe(np.log1p) # convert to log returns
    .cumsum() # log returns are additive
    .pipe(np.expm1) # convert back to simple returns
)

np.allclose(cumret_cumsum, cumret_cumprod)

True

np.allclose(1*10**-9, 1.1*10**-9)

True

cumret_cumsum.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.8 Use price data only to plot cumulative returns for these stocks in the 2020s

We can also calculate cumulative returns as the ratio of adjusted closes. That is \(R_{0,T} = \frac{AC_T}{AC_0} - 1\). The trick here is that \(FV_t = PV (1+r)^t\), so \((1+r)^t = \frac{FV_t}{PV}\).

returns_2020s.iloc[0]

AAPL   0.0228
AMZN   0.0272
F      0.0129
META   0.0221
TSLA   0.0285
Name: 2020-01-02 00:00:00, dtype: float64

cumret_prices = prices['Adj Close'].loc['2020':].iloc[:-1] / prices['Adj Close'].loc['2019'].iloc[-1] - 1

np.allclose(cumret_prices, cumret_cumprod)

True

cumret_prices.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.9 Calculate the Sharpe Ratio for TSLA

Calculate the Sharpe Ratio with all available returns and 2020s returns. Recall the Sharpe Ratio is \(\frac{\overline{r_i - r_f}}{\sigma_i}\), where \(\sigma_i\) is the volatility of excess returns.

I suggest you write a function named calc_sharpe() to use for the rest of this notebook.

def calc_sharpe(ri, rf=ff['RF'], ppy=252):
    ri_rf = ri.sub(rf).dropna()
    return np.sqrt(ppy) * ri_rf.mean() / ri_rf.std()

calc_sharpe(ri=returns['TSLA'])

0.9261

calc_sharpe(ri=returns_2020s['TSLA'])

1.1212

Or, we can use the .pipe() method because ri is the first argument to calc_sharpe()!

returns['TSLA'].pipe(calc_sharpe)

0.9261

returns_2020s['TSLA'].pipe(calc_sharpe)

1.1212

3.10 Calculate the market beta for TSLA

Calculate the market beta with all available returns and 2020s returns. Recall we estimate market beta with the ordinary least squares (OLS) regression \(R_i-R_f = \alpha + \beta (R_m-R_f) + \epsilon\). We can estimate market beta with the covariance formula (i.e., \(\beta_i = \frac{Cov(R_i - R_f, R_m - R_f)}{Var(R_m-R_f)}\)) above for a univariate regression if we do not need goodness of fit statistics.

I suggest you write a function named calc_beta() to use for the rest of this notebook.

def calc_beta(ri, rf=ff['RF'], rm_rf=ff['Mkt-RF']):
    ri_rf = ri.sub(rf).dropna()
    return ri_rf.cov(rm_rf) / rm_rf.loc[ri_rf.index].var()

calc_beta(ri=returns['TSLA'])

1.4417

calc_beta(ri=returns_2020s['TSLA'])

1.5780

Or, we can use the .pipe() method because ri is the first argument to calc_sharpe()!

returns['TSLA'].pipe(calc_beta)

1.4417

returns_2020s['TSLA'].pipe(calc_beta)

1.5780

3.11 Guess the Sharpe Ratios for these stocks in the 2020s

3.12 Guess the market betas for these stocks in the 2020s

3.13 Calculate the Sharpe Ratios for these stocks in the 2020s

How good were your guesses?

for i in returns_2020s:
    sharpe_i = returns_2020s[i].pipe(calc_sharpe)
    print(f'Sharpe Ratio for {i}:\t {sharpe_i:0.2f}')

Sharpe Ratio for AAPL:   0.86
Sharpe Ratio for AMZN:   0.48
Sharpe Ratio for F:  0.42
Sharpe Ratio for META:   0.49
Sharpe Ratio for TSLA:   1.12

We can also use pandas notation to vectorize this calculation. First calculate excess returns as \(r_i - r_f\).

returns_2020s_excess = returns_2020s.sub(ff['RF'], axis=0).dropna()

Then use pandas notation to calculate means, standard deviations, and annualize.

(
    returns_2020s_excess
    .mean()
    .div(returns_2020s_excess.std())
    .mul(np.sqrt(252))
)

AAPL   0.8576
AMZN   0.4750
F      0.4240
META   0.4949
TSLA   1.1212
dtype: float64

Note: In a few weeks we will learn the .apply() method, which avoids the loop syntax.

returns_2020s.apply(calc_sharpe)

AAPL   0.8576
AMZN   0.4750
F      0.4240
META   0.4949
TSLA   1.1212
dtype: float64

3.14 Calculate the market betas for these stocks in the 2020s

How good were your guesses?

for i in returns_2020s:
    beta_i = returns_2020s[i].pipe(calc_beta)
    print(f'Beta for {i}:\t {beta_i:0.2f}')

Beta for AAPL:   1.15
Beta for AMZN:   1.04
Beta for F:  1.22
Beta for META:   1.27
Beta for TSLA:   1.58

Or we can follow out approach above to vectorize this calculation. First, we need to add a market excess return column to returns_2020s_excess.

returns_2020s_excess['Mkt-RF'] = ff['Mkt-RF']
returns_2020s_excess.head()

	AAPL	AMZN	F	META	TSLA	Mkt-RF
Date
2020-01-02	0.0228	0.0271	0.0128	0.0220	0.0285	0.0086
2020-01-03	-0.0098	-0.0122	-0.0224	-0.0054	0.0296	-0.0067
2020-01-06	0.0079	0.0148	-0.0055	0.0188	0.0192	0.0036
2020-01-07	-0.0048	0.0020	0.0098	0.0021	0.0387	-0.0019
2020-01-08	0.0160	-0.0079	-0.0001	0.0101	0.0491	0.0047

vcv = returns_2020s_excess.cov()
vcv

	AAPL	AMZN	F	META	TSLA	Mkt-RF
AAPL	0.0004	0.0003	0.0002	0.0004	0.0005	0.0003
AMZN	0.0003	0.0006	0.0002	0.0004	0.0005	0.0002
F	0.0002	0.0002	0.0009	0.0003	0.0005	0.0003
META	0.0004	0.0004	0.0003	0.0009	0.0005	0.0003
TSLA	0.0005	0.0005	0.0005	0.0005	0.0018	0.0003
Mkt-RF	0.0003	0.0002	0.0003	0.0003	0.0003	0.0002

vcv['Mkt-RF'].div(vcv.loc['Mkt-RF', 'Mkt-RF']).plot(kind='bar')
plt.xlabel('Ticker')
plt.ylabel('CAPM Beta')
plt.title('CAPM Betas')
plt.show()

Note: In a few weeks we will learn the .apply() method, which avoids the loop syntax.

returns_2020s.apply(calc_beta)

AAPL   1.1541
AMZN   1.0429
F      1.2231
META   1.2710
TSLA   1.5780
dtype: float64

3.15 Calculate the Sharpe Ratio for an equally weighted portfolio of these stocks in the 2020s

What do you notice?

returns_2020s.mean(axis=1).pipe(calc_sharpe)

0.9573

Because diversification reduces portfolio standard deviation less than the sum of its parts, the Sharpe Ratio of the equally weighted portfolio is less than the equally weighted mean of the single-stock Sharpe Ratios.

returns_2020s.apply(calc_sharpe).mean()

0.6745

3.16 Calculate the market beta for an equally weighted portfolio of these stocks in the 2020s

What do you notice?

Beta measures nondiversifiable risk, so \(\beta_P = \sum w_i \beta_i\)!

returns_2020s.mean(axis=1).pipe(calc_beta)

1.2538

returns_2020s.apply(calc_beta).mean()

1.2538

3.17 Calculate the market betas for these stocks every calendar year for every possible year

Save these market betas to data frame betas. Our current Python knowledge limits us to a for-loop, but we will learn easier and faster approaches soon!

betas = pd.DataFrame(
    index=range(1972, 2024),
    columns=returns.columns
)

betas.columns.name = 'Ticker'
betas.index.name = 'Year'

betas.tail()

Ticker	AAPL	AMZN	F	META	TSLA
Year
2019	NaN	NaN	NaN	NaN	NaN
2020	NaN	NaN	NaN	NaN	NaN
2021	NaN	NaN	NaN	NaN	NaN
2022	NaN	NaN	NaN	NaN	NaN
2023	NaN	NaN	NaN	NaN	NaN

for i in betas.index: 
    for c in betas.columns:
        betas.at[i, c] = returns.loc[str(i), c].pipe(calc_beta)

betas.tail()

Ticker	AAPL	AMZN	F	META	TSLA
Year
2019	1.4751	1.2752	1.0733	1.2094	1.3262
2020	1.1174	0.6866	1.1052	0.9913	1.3041
2021	1.1957	0.9822	1.2396	1.2014	1.9891
2022	1.2386	1.5922	1.3824	1.6843	1.7414
2023	1.0369	1.4649	1.3947	1.6630	2.2218

3.18 Plot the time series of market betas

betas.plot()
plt.ylabel('CAPM Beta')
plt.title('CAPM Betas')
plt.show()