Herron Topic 1 - Practice for Section 05

FINA 6333 for Spring 2024

Author

Richard Herron

1 Announcements

DataCamp
1. Data Manipulation with pandas due by Friday, 2/9, at 11:59 PM
2. Joining Data with pandas due by Friday, 2/16, at 11:59 PM
3. Earn 10,000 XP due by Friday, 3/15, at 11:59 PM
I posted Project 1 to Canvas
1. Slides and notebook due by Friday, 2/23, at 11:59 PM
2. Keep joining teams and let me know if you need help

2 10-Minute Recap

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import yfinance as yf

%precision 4
pd.options.display.float_format = '{:.4f}'.format
%config InlineBackend.figure_format = 'retina'

First, we will use two packages to download data from the Web:

yfinance for Yahoo! Finance
pandas-datareader for Ken French (and FRED and many others)

Second, there are “simple returns” and “log returns”

Simple returns are the returns that investors receive that we learned in FINA 6331 and FINA 6333: \(r_t = \frac{p_t + d_t - p_{t-1}}{p_{t-1}}\)
Log returns are the log of one plus simple returns. Why do we use them?
1. Log returns are additive, while simple returns are multiplicative. This additive property makes math really easy with log returns: \(\log(\prod_{t=0}^T (1 + r_t)) = \sum_{t=0}^T \log(1+r_t)\), so \(r_{0,T} = \prod_{t=0}^T (1 + r_t) - 1 = e^{\sum_{t=0}^T \log(1+r_t)} - 1\)
2. Log returns are almost normally distributed

We will almost always use simple returns. The exeception is time-consuming calculations, which we will often do in log returns to save us time.

Third, we can calculate portfolio returns a few ways!

returns.mean(axis=1) is equally-weighted portfolio returns, rebalanced at the same frequency as the returns (i.e., rebalanced every return period)
returns.dot(weights) lets us use any weights in the weights array, rebalanced at the same frequency as the returns

3 Practice

3.1 Download all available daily price data for tickers TSLA, F, AAPL, AMZN, and META to data frame `prices`

tickers = 'TSLA F AAPL AMZN META'
prices = yf.download(tickers=tickers)

[*********************100%%**********************]  5 of 5 completed

prices.head()

	Adj Close					Close					...	Open					Volume
	AAPL	AMZN	F	META	TSLA	AAPL	AMZN	F	META	TSLA	...	AAPL	AMZN	F	META	TSLA	AAPL	AMZN	F	META	TSLA
Date
1972-06-01	NaN	NaN	0.2419	NaN	NaN	NaN	NaN	2.1532	NaN	NaN	...	NaN	NaN	0.0000	NaN	NaN	NaN	NaN	1091238	NaN	NaN
1972-06-02	NaN	NaN	0.2414	NaN	NaN	NaN	NaN	2.1492	NaN	NaN	...	NaN	NaN	2.1532	NaN	NaN	NaN	NaN	1174468	NaN	NaN
1972-06-05	NaN	NaN	0.2414	NaN	NaN	NaN	NaN	2.1492	NaN	NaN	...	NaN	NaN	2.1492	NaN	NaN	NaN	NaN	5209582	NaN	NaN
1972-06-06	NaN	NaN	0.2387	NaN	NaN	NaN	NaN	2.1248	NaN	NaN	...	NaN	NaN	2.1492	NaN	NaN	NaN	NaN	1424158	NaN	NaN
1972-06-07	NaN	NaN	0.2373	NaN	NaN	NaN	NaN	2.1127	NaN	NaN	...	NaN	NaN	2.1248	NaN	NaN	NaN	NaN	675088	NaN	NaN

5 rows × 30 columns

3.2 Calculate all available daily returns and save to data frame `returns`

returns = (
    prices['Adj Close'] # slice adj close
    .iloc[:-1] # drop the last price because it might be intraday (i.e., not a close)
    .pct_change() # calculate simple returns
)

returns

	AAPL	AMZN	F	META	TSLA
Date
1972-06-01	NaN	NaN	NaN	NaN	NaN
1972-06-02	NaN	NaN	-0.0019	NaN	NaN
1972-06-05	NaN	NaN	0.0000	NaN	NaN
1972-06-06	NaN	NaN	-0.0113	NaN	NaN
1972-06-07	NaN	NaN	-0.0057	NaN	NaN
...	...	...	...	...	...
2024-02-02	-0.0054	0.0787	0.0033	0.2032	-0.0050
2024-02-05	0.0098	-0.0087	-0.0453	-0.0328	-0.0365
2024-02-06	0.0086	-0.0068	0.0414	-0.0102	0.0223
2024-02-07	0.0006	0.0082	0.0605	0.0327	0.0134
2024-02-08	-0.0058	-0.0040	0.0023	0.0009	0.0106

13034 rows × 5 columns

3.3 Slices returns for the 2020s and assign to `returns_2020s`

returns_2020s = returns.loc['2020':] # always use an unambiguos date format, like YYYY-MM-DD

returns_2020s

	AAPL	AMZN	F	META	TSLA
Date
2020-01-02	0.0228	0.0272	0.0129	0.0221	0.0285
2020-01-03	-0.0097	-0.0121	-0.0223	-0.0053	0.0296
2020-01-06	0.0080	0.0149	-0.0054	0.0188	0.0193
2020-01-07	-0.0047	0.0021	0.0098	0.0022	0.0388
2020-01-08	0.0161	-0.0078	0.0000	0.0101	0.0492
...	...	...	...	...	...
2024-02-02	-0.0054	0.0787	0.0033	0.2032	-0.0050
2024-02-05	0.0098	-0.0087	-0.0453	-0.0328	-0.0365
2024-02-06	0.0086	-0.0068	0.0414	-0.0102	0.0223
2024-02-07	0.0006	0.0082	0.0605	0.0327	0.0134
2024-02-08	-0.0058	-0.0040	0.0023	0.0009	0.0106

1033 rows × 5 columns

3.4 Download all available data for the Fama and French daily benchmark factors to dictionary `ff_all`

I often use the following code snippet to find the exact name for the the daily benchmark factors file.

pdr.famafrench.get_available_datasets()[:5]

['F-F_Research_Data_Factors',
 'F-F_Research_Data_Factors_weekly',
 'F-F_Research_Data_Factors_daily',
 'F-F_Research_Data_5_Factors_2x3',
 'F-F_Research_Data_5_Factors_2x3_daily']

ff_all = pdr.DataReader(
    name='F-F_Research_Data_Factors_daily',
    data_source='famafrench',
    start='1900' # most data start in 1926-07-01, but 1900 is easier to remember and type
)

C:\Users\r.herron\AppData\Local\Temp\ipykernel_27720\4231759426.py:1: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
  ff_all = pdr.DataReader(

The DESCR key in the dictionary tells us about the data frames that pandas-datareader returns.

print(ff_all['DESCR'])

F-F Research Data Factors daily
-------------------------------

This file was created by CMPT_ME_BEME_RETS_DAILY using the 202312 CRSP database. The Tbill return is the simple daily rate that, over the number of trading days in the month, compounds to 1-month TBill rate from Ibbotson and Associates Inc. Copyright 2023 Kenneth R. French

  0 : (25649 rows x 4 cols)

3.5 Slice the daily benchmark factors, convert them to decimal returns, and assign to `ff`

ff = ff_all[0].div(100)

ff

	Mkt-RF	SMB	HML	RF
Date
1926-07-01	0.0010	-0.0025	-0.0027	0.0001
1926-07-02	0.0045	-0.0033	-0.0006	0.0001
1926-07-06	0.0017	0.0030	-0.0039	0.0001
1926-07-07	0.0009	-0.0058	0.0002	0.0001
1926-07-08	0.0021	-0.0038	0.0019	0.0001
...	...	...	...	...
2023-12-22	0.0021	0.0064	0.0009	0.0002
2023-12-26	0.0048	0.0069	0.0046	0.0002
2023-12-27	0.0016	0.0014	0.0012	0.0002
2023-12-28	-0.0001	-0.0036	0.0003	0.0002
2023-12-29	-0.0043	-0.0112	-0.0037	0.0002

25649 rows × 4 columns

3.6 Use the `.cumprod()` method to plot cumulative returns for these stocks in the 2020s

We use the .prod() method to calculate total returns, because \(r_{total} = r_{0,T} = \left[ \prod_{t=0}^T (1 + r_t) \right] -1\).

(
    returns_2020s # returns during the 2020s
    .add(1) # add 1 before we compound
    .prod() # compound all returns
    .sub(1) # subtract 1 to recover total returns
)

AAPL   1.6331
AMZN   0.8383
F      0.6090
META   1.2899
TSLA   5.7970
dtype: float64

We use the .cumprod() to calculate cumulative returns, which are the total returns for every date between \(0\) and \(T\) (i.e., \(r_{0,t} \forall t \in {0, 1, \ldots T}\))

cumret_cumprod = (
    returns_2020s # returns during the 2020s
    .add(1) # add 1 before we compound
    .cumprod() # compound returns up to time t
    .sub(1) # subtract 1 to recover cumulative return at time t
)

cumret_cumprod

	AAPL	AMZN	F	META	TSLA
Date
2020-01-02	0.0228	0.0272	0.0129	0.0221	0.0285
2020-01-03	0.0129	0.0147	-0.0097	0.0167	0.0590
2020-01-06	0.0209	0.0298	-0.0151	0.0358	0.0794
2020-01-07	0.0161	0.0319	-0.0054	0.0381	0.1213
2020-01-08	0.0325	0.0239	-0.0054	0.0486	0.1764
...	...	...	...	...	...
2024-02-02	1.5985	0.8596	0.5224	1.3142	5.7379
2024-02-05	1.6241	0.8433	0.4535	1.2383	5.4922
2024-02-06	1.6468	0.8308	0.5136	1.2154	5.6371
2024-02-07	1.6483	0.8457	0.6052	1.2879	5.7260
2024-02-08	1.6331	0.8383	0.6090	1.2899	5.7970

1033 rows × 5 columns

cumret_cumprod.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.7 Use the `.cumsum()` method with log returns to plot cumulative returns for these stocks in the 2020s

cumret_cumsum = (
    returns_2020s # returns during the 2020s
    .add(1) # add 1 before we compound
    .pipe(np.log)
    .cumsum() # log returns are additive!
    .pipe(np.exp)
    .sub(1) # subtract 1 to recover cumulative return at time t
)

np.allclose(cumret_cumprod, cumret_cumsum)

True

cumret_cumsum.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.8 Use price data only to plot cumulative returns for these stocks in the 2020s

We can also calculate cumulative returns as the ratio of adjusted closes. That is \(R_{0,T} = \frac{AC_T}{AC_0} - 1\). The trick here is that \(FV_t = PV (1+r)^t\), so \((1+r)^t = \frac{FV_t}{PV}\).

returns_2020s.iloc[0]

AAPL   0.0228
AMZN   0.0272
F      0.0129
META   0.0221
TSLA   0.0285
Name: 2020-01-02 00:00:00, dtype: float64

prices['Adj Close'].loc['2020'].iloc[0]

AAPL    73.1526
AMZN    94.9005
F        8.0770
META   209.7800
TSLA    28.6840
Name: 2020-01-02 00:00:00, dtype: float64

prices['Adj Close'].loc['2019'].iloc[-1]

AAPL    71.5208
AMZN    92.3920
F        7.9741
META   205.2500
TSLA    27.8887
Name: 2019-12-31 00:00:00, dtype: float64

prices['Adj Close'].loc['2020'].iloc[0] / prices['Adj Close'].loc['2019'].iloc[-1] - 1

AAPL   0.0228
AMZN   0.0272
F      0.0129
META   0.0221
TSLA   0.0285
dtype: float64

Note: We drop the last row in prices['Adj Close'] with .iloc[:-1]. We drop this last row here here because we did the same above when we calculated returns to exclude possible intraday returns.

cumret_prices = prices['Adj Close'].loc['2020':].iloc[:-1] / prices['Adj Close'].loc['2019'].iloc[-1] - 1

np.allclose(cumret_prices, cumret_cumprod)

True

cumret_prices.mul(100).plot()

# https://stackoverflow.com/questions/25973581/how-to-format-axis-number-format-to-thousands-with-a-comma
from matplotlib import ticker
plt.gca().get_yaxis().set_major_formatter(ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

plt.ylabel('Cumulative Return (%)')
plt.title(f'Cumulative Returns\n from {returns_2020s.index.min():%b %Y} to {returns_2020s.index.max():%b %Y}')
plt.show()

3.9 Calculate the Sharpe Ratio for TSLA

Calculate the Sharpe Ratio with all available returns and 2020s returns. Recall the Sharpe Ratio is \(\frac{\overline{r_i - r_f}}{\sigma_i}\), where \(\sigma_i\) is the volatility of excess returns.

I suggest you write a function named calc_sharpe() to use for the rest of this notebook.

def calc_sharpe(ri, rf=ff['RF'], ppy=252):
    ri_rf = ri.sub(rf).dropna()
    return np.sqrt(ppy) * ri_rf.mean() / ri_rf.std()

calc_sharpe(ri=returns['TSLA'])

0.9261

calc_sharpe(ri=returns_2020s['TSLA'])

1.1212

We can use the .pipe() method here, too, since ri is the first argument to calc_sharpe()!

returns['TSLA'].pipe(calc_sharpe)

0.9261

returns_2020s['TSLA'].pipe(calc_sharpe)

1.1212

3.10 Calculate the market beta for TSLA

Calculate the market beta with all available returns and 2020s returns. Recall we estimate market beta with the ordinary least squares (OLS) regression \(R_i-R_f = \alpha + \beta (R_m-R_f) + \epsilon\). We can estimate market beta with the covariance formula (i.e., \(\beta_i = \frac{Cov(R_i - R_f, R_m - R_f)}{Var(R_m-R_f)}\)) above for a univariate regression if we do not need goodness of fit statistics.

I suggest you write a function named calc_beta() to use for the rest of this notebook.

def calc_beta(ri, rf=ff['RF'], rm_rf=ff['Mkt-RF']):
    ri_rf = ri.sub(rf).dropna()
    return ri_rf.cov(rm_rf) / rm_rf.loc[ri_rf.index].var()

calc_beta(ri=returns['TSLA'])

1.4417

calc_beta(ri=returns_2020s['TSLA'])

1.5780

We can use the .pipe() method here, too, since ri is the first argument to calc_beta()!

returns['TSLA'].pipe(calc_beta)

1.4417

returns_2020s['TSLA'].pipe(calc_beta)

1.5780

3.11 Guess the Sharpe Ratios for these stocks in the 2020s

3.12 Guess the market betas for these stocks in the 2020s

3.13 Calculate the Sharpe Ratios for these stocks in the 2020s

How good were your guesses?

for i in returns_2020s:
    sharpe_i = returns_2020s[i].pipe(calc_sharpe)
    print(f'Sharpe Ratio for {i}:\t {sharpe_i:0.2f}')

Sharpe Ratio for AAPL:   0.86
Sharpe Ratio for AMZN:   0.48
Sharpe Ratio for F:  0.42
Sharpe Ratio for META:   0.49
Sharpe Ratio for TSLA:   1.12

We can also use pandas notation to vectorize this calculation. First calculate excess returns as \(r_i - r_f\).

returns_2020s_excess = returns_2020s.sub(ff['RF'], axis=0).dropna()

Then use pandas notation to calculate means, standard deviations, and annualize.

(
    returns_2020s_excess
    .mean()
    .div(returns_2020s_excess.std())
    .mul(np.sqrt(252))
)

AAPL   0.8576
AMZN   0.4750
F      0.4240
META   0.4949
TSLA   1.1212
dtype: float64

Note: In a few weeks we will learn the .apply() method, which avoids the loop syntax.

returns_2020s.apply(calc_sharpe)

AAPL   0.8576
AMZN   0.4750
F      0.4240
META   0.4949
TSLA   1.1212
dtype: float64

3.14 Calculate the market betas for these stocks in the 2020s

How good were your guesses?

for i in returns_2020s:
    beta_i = returns_2020s[i].pipe(calc_beta)
    print(f'Beta for {i}:\t {beta_i:0.2f}')

Beta for AAPL:   1.15
Beta for AMZN:   1.04
Beta for F:  1.22
Beta for META:   1.27
Beta for TSLA:   1.58

Or we can follow out approach above to vectorize this calculation. First, we need to add a market excess return column to returns_2020s_excess.

returns_2020s_excess['Mkt-RF'] = ff['Mkt-RF']
returns_2020s_excess.head()

	AAPL	AMZN	F	META	TSLA	Mkt-RF
Date
2020-01-02	0.0228	0.0271	0.0128	0.0220	0.0285	0.0086
2020-01-03	-0.0098	-0.0122	-0.0224	-0.0054	0.0296	-0.0067
2020-01-06	0.0079	0.0148	-0.0055	0.0188	0.0192	0.0036
2020-01-07	-0.0048	0.0020	0.0098	0.0021	0.0387	-0.0019
2020-01-08	0.0160	-0.0079	-0.0001	0.0101	0.0491	0.0047

vcv = returns_2020s_excess.cov()
vcv

	AAPL	AMZN	F	META	TSLA	Mkt-RF
AAPL	0.0004	0.0003	0.0002	0.0004	0.0005	0.0003
AMZN	0.0003	0.0006	0.0002	0.0004	0.0005	0.0002
F	0.0002	0.0002	0.0009	0.0003	0.0005	0.0003
META	0.0004	0.0004	0.0003	0.0009	0.0005	0.0003
TSLA	0.0005	0.0005	0.0005	0.0005	0.0018	0.0003
Mkt-RF	0.0003	0.0002	0.0003	0.0003	0.0003	0.0002

vcv['Mkt-RF'].div(vcv.loc['Mkt-RF', 'Mkt-RF']).plot(kind='bar')
plt.xlabel('Ticker')
plt.ylabel('CAPM Beta')
plt.title('CAPM Betas')
plt.show()

Note: In a few weeks we will learn the .apply() method, which avoids the loop syntax.

returns_2020s.apply(calc_beta)

AAPL   1.1541
AMZN   1.0429
F      1.2231
META   1.2710
TSLA   1.5780
dtype: float64

3.15 Calculate the Sharpe Ratio for an equally weighted portfolio of these stocks in the 2020s

What do you notice?

returns_2020s.mean(axis=1).pipe(calc_sharpe)

0.9573

Because diversification reduces portfolio standard deviation less than the sum of its parts, the Sharpe Ratio of the equally weighted portfolio is less than the equally weighted mean of the single-stock Sharpe Ratios.

returns_2020s.apply(calc_sharpe).mean()

0.6745

3.16 Calculate the market beta for an equally weighted portfolio of these stocks in the 2020s

What do you notice?

Beta measures nondiversifiable risk, so \(\beta_P = \sum w_i \beta_i\)!

returns_2020s.mean(axis=1).pipe(calc_beta)

1.2538

returns_2020s.apply(calc_beta).mean()

1.2538

3.17 Calculate the market betas for these stocks every calendar year for every possible year

Save these market betas to data frame betas. Our current Python knowledge limits us to a for-loop, but we will learn easier and faster approaches soon!

betas = pd.DataFrame(
    index=range(1972, 2024),
    columns=returns.columns
)

betas.columns.name = 'Ticker'
betas.index.name = 'Year'

betas.tail()

Ticker	AAPL	AMZN	F	META	TSLA
Year
2019	NaN	NaN	NaN	NaN	NaN
2020	NaN	NaN	NaN	NaN	NaN
2021	NaN	NaN	NaN	NaN	NaN
2022	NaN	NaN	NaN	NaN	NaN
2023	NaN	NaN	NaN	NaN	NaN

for i in betas.index: 
    for c in betas.columns:
        betas.at[i, c] = returns.loc[str(i), c].pipe(calc_beta)

betas.tail()

Ticker	AAPL	AMZN	F	META	TSLA
Year
2019	1.4751	1.2752	1.0733	1.2094	1.3262
2020	1.1174	0.6866	1.1052	0.9913	1.3041
2021	1.1957	0.9822	1.2396	1.2014	1.9891
2022	1.2386	1.5922	1.3824	1.6843	1.7414
2023	1.0369	1.4649	1.3947	1.6630	2.2218

3.18 Plot the time series of market betas

betas.plot()
plt.ylabel('CAPM Beta')
plt.title('CAPM Betas')
plt.show()