McKinney Chapter 4 - Practice for Section 02

FINA 6333 for Spring 2024

Author

Richard Herron

1 Announcements

Our second DataCamp course, Intermediate Python, is due Friday, 1/26, at 11:59 PM
I will record our week 4 lecture video on McKinney chapter 5 this Thursday evening, and the week 4 pre-class quiz is due before class next Tuesday, 1/30
Team projects
1. Continue to join teams on Canvas > People > Team Projects
2. I removed the join-a-team assignment, but I will give the first project assignment in early February, so join a team by then

2 10-minute Recap

2.1 NumPy Arrays

NumPy arrays are multidimensional data structures that can store numerical data efficiently and perform fast mathematical operations on them. The %precision magic displays floats (including in arrays) to 4 digits.

import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

import numpy as np
%precision 4

'%.4f'

np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

np.ones((2, 2)) # 2 rows and 2 columns

array([[1., 1.],
       [1., 1.]])

np.ones((2, 2, 2, 2)) # 2 rows, 2 columns, 2 stacks, 2 times

array([[[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]],


       [[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]]])

The np.random.randn() function creates standard normal random variables (i.e., mean of 0 and standard deviation of 1). We generally use the np.random.seed() function to make our random numbers repeatable.

np.random.seed(42) # from the *Hitch-Hiker's Guide to the Galaxy*
np.random.randn(2, 2)

array([[ 0.4967, -0.1383],
       [ 0.6477,  1.523 ]])
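
Here is a minimal sketch of why we set a seed: resetting the seed to the same value replays the same draws. The names draw_1 and draw_2 are only for illustration.

```python
import numpy as np

np.random.seed(42)
draw_1 = np.random.randn(2, 2)

np.random.seed(42)  # reset the seed to replay the same sequence
draw_2 = np.random.randn(2, 2)

print((draw_1 == draw_2).all())  # True
```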

2.2 Vectorized Functions

Vectorized computation is the process of applying an operation to an entire array or a subset of an array without using explicit loops. NumPy supports vectorized computation using universal functions (ufuncs), which are functions that operate on arrays element-wise.

4**0.5

2.0000

[i**0.5 for i in range(10)]

[0.0000,
 1.0000,
 1.4142,
 1.7321,
 2.0000,
 2.2361,
 2.4495,
 2.6458,
 2.8284,
 3.0000]

np.sqrt(np.arange(10))

array([0.    , 1.    , 1.4142, 1.7321, 2.    , 2.2361, 2.4495, 2.6458,
       2.8284, 3.    ])

2.3 Indexing and Slicing

Indexing and slicing are techniques to access or modify specific elements or subsets of an array. NumPy also supports advanced indexing methods, such as fancy indexing and boolean indexing, which allow more flexible and complex selection of array elements.

np.random.seed(42)
my_array = np.random.randn(3, 3)

my_array

array([[ 0.4967, -0.1383,  0.6477],
       [ 1.523 , -0.2342, -0.2341],
       [ 1.5792,  0.7674, -0.4695]])

How can we get the first row in my_array?

my_array[0]

array([ 0.4967, -0.1383,  0.6477])

How can we get the 0.4967? Chain the [0] indexes.

my_array[0][0]

0.4967

This indexing is common enough that we can replace [0][0] with [0, 0], which is \(i, j\) notation.

my_array[0, 0]

0.4967

What if we want the first two columns of the first two rows?

my_array[:2, :2]

array([[ 0.4967, -0.1383],
       [ 1.523 , -0.2342]])
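
The examples above use basic indexing and slicing. Here is a short sketch of the Boolean and fancy indexing mentioned at the top of this section, using the same seeded array. The names positives and rows_2_and_0 are only for illustration.

```python
import numpy as np

np.random.seed(42)
my_array = np.random.randn(3, 3)

# Boolean indexing: keep only the positive values (returns a 1-D array)
positives = my_array[my_array > 0]

# Fancy indexing: pick rows 2 and 0, in that order, with a list of integers
rows_2_and_0 = my_array[[2, 0]]

print(positives)
print(rows_2_and_0)
```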

3 Practice

3.1 Create a 1-dimensional array named `a1` that counts from 0 to 24 by 1.

a1 = np.arange(25)

a1

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

Here, np.arange() replaces np.array(), list(), and range()!

np.array(list(range(25)))

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

3.2 Create a 1-dimensional array named `a2` that counts from 0 to 24 by 3.

a2 = np.arange(0, 25, 3) # start, stop, step size

a2

array([ 0,  3,  6,  9, 12, 15, 18, 21, 24])

3.3 Create a 1-dimensional array named `a3` that counts from 0 to 100 by multiples of 3 or 5.

a3 = np.array([i for i in range(101) if (i%3==0) or (i%5==0)]) # here we could replace "or" with "|"

a3

array([  0,   3,   5,   6,   9,  10,  12,  15,  18,  20,  21,  24,  25,
        27,  30,  33,  35,  36,  39,  40,  42,  45,  48,  50,  51,  54,
        55,  57,  60,  63,  65,  66,  69,  70,  72,  75,  78,  80,  81,
        84,  85,  87,  90,  93,  95,  96,  99, 100])

We can also use Booleans to slice the output of np.arange(101).

a3_alt = np.arange(101)
a3_alt = a3_alt[ (a3_alt%3==0) | (a3_alt%5==0) ]

a3_alt

array([  0,   3,   5,   6,   9,  10,  12,  15,  18,  20,  21,  24,  25,
        27,  30,  33,  35,  36,  39,  40,  42,  45,  48,  50,  51,  54,
        55,  57,  60,  63,  65,  66,  69,  70,  72,  75,  78,  80,  81,
        84,  85,  87,  90,  93,  95,  96,  99, 100])

(a3 == a3_alt).all()

True

The np.allclose() function helps us test equality with some non-zero tolerance. More later in the class!

np.allclose(a3, a3_alt)

True

(a3 == 1.000001*a3_alt).all()

False

np.allclose(a3, 1.000001*a3_alt)

True
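
As a rough sketch of what the tolerance means: np.allclose(a, b) checks whether |a - b| <= atol + rtol * |b| for every element, with defaults of rtol=1e-5 and atol=1e-8. The arrays a and b below are made up for illustration.

```python
import numpy as np

a = np.array([100.0, 200.0])
b = 1.000001 * a  # relative difference of about 1e-6

# Default tolerances (rtol=1e-5, atol=1e-8) accept this difference
close_default = np.allclose(a, b)

# A tighter relative tolerance rejects it
close_tight = np.allclose(a, b, rtol=1e-8)

# The same test as the default, written out by hand
close_manual = (np.abs(a - b) <= 1e-8 + 1e-5 * np.abs(b)).all()

print(close_default, close_tight, close_manual)  # True False True
```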

3.4 Create a 1-dimensional array `a3` that contains the squares of the even integers through 100,000.

How much faster is the NumPy version than the list comprehension version?

np.arange(0, 100_001, 2)**2

array([         0,          4,         16, ..., 1409265424, 1409665412,
       1410065408])

On some computers, the output above is wrong! Depending on the operating system and NumPy version, NumPy may default to 32-bit integers, which cannot store values above 2,147,483,647, so the largest squares overflow and wrap around. Always check your output! To avoid this problem, we can force np.arange() to use 64-bit integers with the dtype= argument.

np.arange(0, 100_001, 2, dtype=np.int64)**2

array([          0,           4,          16, ...,  9999200016,
        9999600004, 10000000000], dtype=int64)
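
To see where the wrong numbers come from, here is a small sketch that forces 32-bit integers on any computer. The names evens_32 and squares_32 are only for illustration.

```python
import numpy as np

evens_32 = np.arange(0, 100_001, 2, dtype=np.int32)
squares_32 = evens_32**2  # stays int32, so large squares wrap around silently

print(np.iinfo(np.int32).max)  # 2147483647, the largest 32-bit integer
print(squares_32[-1])          # 1410065408 instead of 10000000000
print(100_000**2 % 2**32)      # 1410065408, the same wrapped value
```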

We can use the %timeit magic to time which code is faster! The %timeit magic runs the code on the same line many times and reports the mean computation time. The %%timeit magic with two percent signs runs the code in the same cell many times and reports the mean computation time.

%timeit np.arange(0, 100_001, 2, dtype=np.int64)**2

32.5 µs ± 3.41 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit np.array([i**2 for i in range(0, 100_001)])

13.1 ms ± 700 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

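The NumPy version is about 400 times faster (13.1 ms versus 32.5 µs)! Note that the list comprehension above squares every integer, not only the even ones, so it does about twice the work.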
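
Outside of a notebook, the %timeit magic is not available. Here is a minimal sketch of a similar comparison with the standard library's timeit module. The function names and n_runs are assumptions for illustration, both functions square only the even integers, and the timings will vary by computer.

```python
import timeit

import numpy as np


def squares_numpy():
    return np.arange(0, 100_001, 2, dtype=np.int64)**2


def squares_list():
    return np.array([i**2 for i in range(0, 100_001, 2)])


n_runs = 20  # number of repetitions per version
time_numpy = timeit.timeit(squares_numpy, number=n_runs) / n_runs
time_list = timeit.timeit(squares_list, number=n_runs) / n_runs

print(f'NumPy: {time_numpy:.6f} seconds per run')
print(f'List comprehension: {time_list:.6f} seconds per run')
```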

3.5 Write a function that mimics Excel’s `pv` function.

Here is how we call Excel’s pv function: =PV(rate, nper, pmt, [fv], [type]). We can use the annuity and lump sum present value formulas.

Present value of an annuity payment pmt: \(PV_{pmt} = \frac{pmt}{rate} \times \left(1 - \frac{1}{(1+rate)^{nper}} \right)\)

Present value of a lump sum fv: \(PV_{fv} = \frac{fv}{(1+rate)^{nper}}\)
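
As a quick check on the annuity formula, the sketch below discounts each payment one at a time and compares the sum with the formula. The inputs are made up for illustration.

```python
import numpy as np

rate, nper, pmt = 0.05, 14, 50

# Discount each of the nper payments, which arrive at t = 1, 2, ..., nper
t = np.arange(1, nper + 1)
pv_sum = (pmt / (1 + rate)**t).sum()

# Annuity formula
pv_formula = (pmt / rate) * (1 - 1 / (1 + rate)**nper)

print(np.isclose(pv_sum, pv_formula))  # True
```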

def calc_pv(rate, nper, pmt=None, fv=None, type=None):
    if pmt is None:
        pmt = 0
    if fv is None:
        fv = 0
    if type is None:
        type = 'END'
    
    pv_pmt = (pmt / rate) * (1 - 1 / (1 + rate)**nper)
    pv_fv = fv / (1 + rate)**nper

    if type == 'BGN':
        pv_pmt *= (1 + rate) # payments arrive one period earlier, but the lump sum fv does not move

    pv = pv_pmt + pv_fv
    return -1 * pv

calc_pv(rate=0.05, nper=14, pmt=50, fv=1_000, type='END')

-1000.0000

calc_pv(rate=0.05, nper=14, fv=1_000, type='END')

-505.0680

a = 5
a *= 5 # same as a = a * 5
a *= 5
a

3.6 Write a function that mimics Excel’s `fv` function.
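
By analogy with the present value formulas in section 3.5, we can compound each piece forward nper periods. These are the standard end-of-period formulas, written in the same notation.

Future value of an annuity payment pmt: \(FV_{pmt} = \frac{pmt}{rate} \times \left((1+rate)^{nper} - 1\right)\)

Future value of a lump sum pv: \(FV_{pv} = pv \times (1+rate)^{nper}\)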

def calc_fv(rate, nper, pmt=None, pv=None, type=None):
    if pmt is None:
        pmt = 0
    if pv is None:
        pv = 0
    if type is None:
        type = 'END'
    
    fv_pmt = (pmt / rate) * ((1 + rate)**nper - 1)
    fv_pv = pv * (1 + rate)**nper

    if type == 'BGN':
        fv_pmt *= (1 + rate) # payments compound one extra period, but the lump sum pv does not

    fv = fv_pmt + fv_pv
    return -1 * fv

calc_fv(rate=0.072, nper=10, pv=-1000)

2004.2314

calc_fv(rate=0.072, nper=10, pmt=72, pv=-1000)

1000.0000

3.7 Replace the negative values in `data` with -1 and positive values with +1.

np.random.seed(42)
data = np.random.randn(3, 3)
data

array([[ 0.4967, -0.1383,  0.6477],
       [ 1.523 , -0.2342, -0.2341],
       [ 1.5792,  0.7674, -0.4695]])

data[data < 0] = -1
data[data > 0] = +1
data

array([[ 1., -1.,  1.],
       [ 1., -1., -1.],
       [ 1.,  1., -1.]])

Here is a second option! NumPy’s np.where() function has the same logic as Excel’s if() function! I will recreate data so we start from the same point.

np.random.seed(42)
data = np.random.randn(3, 3)
data_where = np.where(data < 0, -1, np.where(data > 0, +1, 0))
data_where

array([[ 1, -1,  1],
       [ 1, -1, -1],
       [ 1,  1, -1]])

Here is a third option! NumPy’s np.select() function lets us test many conditions! I will recreate data so we start from the same point.

np.random.seed(42)
data = np.random.randn(3, 3)
data_select = np.select(
    condlist=[data<0, data>0],
    choicelist=[-1, +1],
    default=0
)
data_select

array([[ 1, -1,  1],
       [ 1, -1, -1],
       [ 1,  1, -1]])

(data_where == data_select).all()

True
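
For this particular problem, NumPy's np.sign() function is a fourth option. It returns -1, 0, or +1 element-wise, although as floats here instead of integers. A minimal sketch, with data_sign as an illustrative name:

```python
import numpy as np

np.random.seed(42)
data = np.random.randn(3, 3)

data_sign = np.sign(data)  # float array of -1., 0., or +1.
data_where = np.where(data < 0, -1, np.where(data > 0, +1, 0))

print((data_sign == data_where).all())  # True
```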

Here is a more complex application of np.select(). Say we want to truncate values to be between -0.5 and +0.5.

np.random.seed(42)
data = np.random.randn(3, 3)
data

array([[ 0.4967, -0.1383,  0.6477],
       [ 1.523 , -0.2342, -0.2341],
       [ 1.5792,  0.7674, -0.4695]])

np.select(
    condlist=[data<-0.5, data>+0.5],
    choicelist=[-0.5, +0.5],
    default=data
)

array([[ 0.4967, -0.1383,  0.5   ],
       [ 0.5   , -0.2342, -0.2341],
       [ 0.5   ,  0.5   , -0.4695]])
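
NumPy's np.clip() function does the same truncation in one call. The sketch below compares it with the np.select() approach on the same seeded data. The names data_clip and data_select are only for illustration.

```python
import numpy as np

np.random.seed(42)
data = np.random.randn(3, 3)

# Limit values to the interval [-0.5, +0.5]
data_clip = np.clip(data, -0.5, +0.5)

data_select = np.select(
    condlist=[data < -0.5, data > +0.5],
    choicelist=[-0.5, +0.5],
    default=data
)

print((data_clip == data_select).all())  # True
```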

3.8 Write a function `npmts()` that calculates the number of payments that generate \(x\%\) of the present value of a perpetuity.

Your npmts() should accept arguments c1, r, and g that represent \(C_1\), \(r\), and \(g\). The present value of a growing perpetuity is \(PV = \frac{C_1}{r - g}\), and the present value of a growing annuity is \(PV = \frac{C_1}{r - g}\left[ 1 - \left( \frac{1 + g}{1 + r} \right)^t \right]\).

We can use the growing annuity and perpetuity formulas to show: \(x = \left[ 1 - \left( \frac{1 + g}{1 + r} \right)^t \right]\).

Then: \(1 - x = \left( \frac{1 + g}{1 + r} \right)^t\).

Finally: \(t = \frac{\log(1-x)}{\log\left(\frac{1 + g}{1 + r}\right)}\)

We do not need to accept an argument c1 because \(C_1\) cancels out!

def npmts(x, r, g):
    return np.log(1-x) / np.log((1 + g) / (1 + r))

npmts(0.5, 0.1, 0.05)

14.9000
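
We can sanity check the 14.9 with whole numbers of payments. The sketch below uses a hypothetical helper pv_share() that returns the share of the perpetuity's present value that comes from the first t payments, which is the \(x\) in the formula above.

```python
import numpy as np


def pv_share(t, r, g):
    # Share of the growing perpetuity's value from the first t payments
    return 1 - ((1 + g) / (1 + r))**t


r, g = 0.1, 0.05
share_14 = pv_share(14, r, g)
share_15 = pv_share(15, r, g)

# 14 payments fall just short of half, and 15 payments are just over half
print(share_14 < 0.5 < share_15)  # True
```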

3.9 Write a function that calculates the internal rate of return given a NumPy array of cash flows.

Here are some data where the \(IRR\) is obvious!

c = np.array([-100, +110])
r = 0.1

First, write a function that calculates net present value (NPV) given cash flows in a NumPy array c and a discount rate in a scalar r. The calc_npv() function below uses NumPy arrays to calculate NPV as: \[NPV = \sum_{t=0}^T \frac{c_t}{(1+r)^t}\]

def calc_npv(r, c):
    t = np.arange(len(c))
    return (c / (1 + r)**t).sum()

calc_npv(r=r, c=c)

-0.0000

We can use a while loop to guess IRR values until we find an NPV close to zero. We can use the Newton-Raphson method to make smarter guesses. If we have function \(f(x)\) and guess \(x_t\), our next guess should be \(x_{t+1} = x_t - \frac{f(x_t)}{f'(x_t)}\). Here our \(f(x)\) is \(NPV(r)\), and we can approximate \(f'(x_t)\) as \(\frac{NPV(r+0.000001) - NPV(r)}{0.000001}\). We will make guesses until \(|NPV| < 0.000001\).

def calc_irr(c, guess=0, tol=1e-6, step=1e-6):
    irr = guess
    npv = calc_npv(r=irr, c=c) # NPV at the first guess
    while np.abs(npv) > tol:
        deriv = (calc_npv(r=irr+step, c=c) - npv) / step # numerical derivative of NPV
        irr = irr - npv / deriv # Newton-Raphson update
        npv = calc_npv(r=irr, c=c) # NPV at the new guess, so the loop stops as soon as it is close to zero
        # print(f'IRR is {irr}, and NPV is {npv}')
    return irr

calc_irr(c)

0.1000
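
As a cross-check that does not use a loop, we can treat NPV as a polynomial in \(x = \frac{1}{1+r}\) and find its roots with np.roots(). This is a sketch of an alternative technique, not the Newton-Raphson method above, and the name irr_roots is only for illustration.

```python
import numpy as np

c = np.array([-100, +110])

# NPV = sum of c_t * x**t with x = 1 / (1 + r), which is a polynomial in x.
# np.roots() expects the highest-power coefficient first, so we reverse c.
x = np.roots(c[::-1])

# Keep only real, positive roots, which map to rates above -100%
x = x[np.isreal(x)].real
x = x[x > 0]

irr_roots = 1 / x - 1
print(irr_roots)
```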

3.10 Write a function `returns()` that accepts NumPy arrays of prices and dividends and returns a NumPy array of returns.

prices = np.array([100, 150, 100, 50, 100, 150, 100, 150])
dividends = np.array([1, 1, 1, 1, 2, 2, 2, 2])

We want to slice our arrays to “lag” or “shift” them! For example, we slice the prices array to calculate capital gains as follows.

prices[1:] - prices[:-1]

array([ 50, -50, -50,  50,  50, -50,  50])
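
NumPy's np.diff() function calculates the same first differences, so it is a quick way to check our slicing. A minimal sketch with the same prices:

```python
import numpy as np

prices = np.array([100, 150, 100, 50, 100, 150, 100, 150])

diff_slices = prices[1:] - prices[:-1]  # "lag" by slicing
diff_np = np.diff(prices)               # same calculation with np.diff()

print((diff_slices == diff_np).all())  # True
```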

def returns(p, d):
    return (p[1:] - p[:-1] + d[1:]) / p[:-1]

returns(p=prices, d=dividends)

array([ 0.51  , -0.3267, -0.49  ,  1.04  ,  0.52  , -0.32  ,  0.52  ])

3.11 Rewrite the function `returns()` so it returns NumPy arrays of returns, capital gains yields, and dividend yields.

def returns_2(p, d):
    r = (p[1:] - p[:-1] + d[1:]) / p[:-1]
    dp = d[1:] / p[:-1]
    cgp = r - dp
    return {'r':r, 'dp':dp, 'cgp':cgp}

returns_2(p=prices, d=dividends)

{'r': array([ 0.51  , -0.3267, -0.49  ,  1.04  ,  0.52  , -0.32  ,  0.52  ]),
 'dp': array([0.01  , 0.0067, 0.01  , 0.04  , 0.02  , 0.0133, 0.02  ]),
 'cgp': array([ 0.5   , -0.3333, -0.5   ,  1.    ,  0.5   , -0.3333,  0.5   ])}

3.12 Rescale and shift numbers so that they cover the range [0, 1]

Input: np.array([18.5, 17.0, 18.0, 19.0, 18.0])
Output: np.array([0.75, 0.0, 0.5, 1.0, 0.5])

numbers = np.array([18.5, 17.0, 18.0, 19.0, 18.0])

(numbers - numbers.min()) / (numbers.max() - numbers.min())

array([0.75, 0.  , 0.5 , 1.  , 0.5 ])

3.13 Write functions `var()` and `std()` that calculate variance and standard deviation.

NumPy’s .var() and .std() methods return population statistics (i.e., denominators of \(n\)). The pandas equivalents return sample statistics (denominators of \(n-1\)), which are more appropriate for financial data analysis where we have a sample instead of a population.

Both functions should have an argument sample that is True by default so both functions return sample statistics by default.

Use numbers to compare your functions with NumPy’s .var() and .std() methods.

((numbers - numbers.mean())**2).mean()

0.4400

numbers.var()

0.4400

def var(x, sample=True):
    mu_x = x.mean()
    return ((x - mu_x)**2).sum() / (len(x) - sample)

var(numbers)

0.5500

var(numbers) == numbers.var(ddof=1)

True

var(numbers, sample=False)

0.4400

var(numbers, sample=False) == numbers.var()

True

def std(x, sample=True):
    return np.sqrt(var(x=x, sample=sample))

std(numbers, sample=False)

0.6633

std(numbers, sample=False) == numbers.std()

True

std(numbers)

0.7416

std(numbers) == numbers.std(ddof=1)

True
