Our second DataCamp course, Intermediate Python, is due Friday, 1/26, at 11:59 PM
I will record our week 4 lecture video on McKinney chapter 5 this Thursday evening, and the week 4 pre-class quiz is due before class next Tuesday, 1/30
Team projects
Continue to join teams on Canvas > People > Team Projects
I removed the join-a-team assignment, but I will give the first project assignment in early February, so join a team by then
2 10-minute Recap
2.1 NumPy Arrays
NumPy arrays are multidimensional data structures that can store numerical data efficiently and perform fast mathematical operations on them.
The %precision magic displays floats (including in arrays) to 4 digits.
import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
The np.random.randn() function creates standard normal random variables (i.e., mean of 0 and standard deviation of 1). We generally call the np.random.seed() function first to make our random numbers repeatable.
np.random.seed(42) # from the *Hitch-Hiker's Guide to the Galaxy*
np.random.randn(2, 2)
array([[ 0.4967, -0.1383],
[ 0.6477, 1.523 ]])
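Here is a small sketch of why the seed matters. Resetting the same seed restarts the same sequence of random numbers, so two draws match exactly. The names first and second are only for illustration.

```python
import numpy as np

np.random.seed(42)
first = np.random.randn(2, 2)

np.random.seed(42)  # resetting the seed restarts the same sequence
second = np.random.randn(2, 2)

# The two arrays are identical because both draws start from seed 42
print(np.array_equal(first, second))
```

Without the second call to np.random.seed(), the second draw would continue the sequence and give different numbers.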
2.2 Vectorized Functions
Vectorized computation is the process of applying an operation to an entire array or a subset of an array without using explicit loops. NumPy supports vectorized computation using universal functions (ufuncs), which are functions that operate on arrays element-wise.
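As a quick illustration, the sketch below squares an array two ways: with an explicit loop and with vectorized operations. The array x and the variable names are made up for this example.

```python
import numpy as np

x = np.arange(5, dtype=np.int64)  # 0, 1, 2, 3, 4

# Explicit loop: square each element one at a time
squares_loop = np.array([xi**2 for xi in x])

# Vectorized: the ** operator applies element-wise to the whole array
squares_vec = x**2

# np.sqrt() is a ufunc, so it also applies element-wise
roots = np.sqrt(squares_vec)

print(squares_vec)
print(roots)
```

Both approaches give the same squares, but the vectorized version is shorter and typically much faster on large arrays.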
Indexing and slicing are techniques to access or modify specific elements or subsets of an array. NumPy also supports advanced indexing methods, such as fancy indexing and boolean indexing, which allow more flexible and complex selection of array elements.
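The sketch below shows each of these selection methods on one small array. The array a and the cutoff of 50 are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(10, dtype=np.int64) * 10  # 0, 10, ..., 90

sliced = a[2:5]        # slicing: positions 2, 3, and 4
fancy = a[[0, 3, 9]]   # fancy indexing: a list of positions
mask = a > 50          # Boolean array that is True where a > 50
selected = a[mask]     # Boolean indexing: keep the True positions

b = a.copy()           # copy so we do not modify a
b[b > 50] = 0          # Boolean indexing can also assign values

print(sliced, fancy, selected, b)
```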
On some computers, the output above is wrong because NumPy's default integer type depends on the platform and can be 32-bit, which overflows for large values! Always check your output! To avoid this problem, we can force np.arange() to use 64-bit integers with the dtype= argument.
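A minimal sketch of the dtype= fix, using 20! as a stand-in for a calculation that is too large for 32-bit integers but fits in 64-bit integers. The factorial is my example, not necessarily the calculation from class.

```python
import math
import numpy as np

# Force 64-bit integers so the result does not depend on the computer
n = np.arange(1, 21, dtype=np.int64)  # 1, 2, ..., 20
total = n.prod()                      # 20!, which fits in 64 bits but not 32

print(n.dtype)
print(total == math.factorial(20))    # compare with exact Python integers
```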
We can use the %timeit magic to time which code is faster! The %timeit magic runs the code on the same line many times and reports the mean computation time. The %%timeit magic with two percent signs runs the code in the same cell many times and reports the mean computation time.
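The %timeit and %%timeit magics only work in IPython and Jupyter. Outside a notebook, a rough equivalent is the timeit module from the standard library, sketched below. The two functions are hypothetical examples, and the timings will differ on every computer.

```python
import timeit
import numpy as np

x = np.arange(10_000, dtype=np.int64)

def loop_sum_squares():
    # Explicit loop over every element
    total = 0
    for xi in x:
        total += xi**2
    return total

def vectorized_sum_squares():
    # Same calculation with vectorized operations
    return (x**2).sum()

# timeit.timeit() returns the total seconds for `number` runs
loop_time = timeit.timeit(loop_sum_squares, number=10)
vec_time = timeit.timeit(vectorized_sum_squares, number=10)

print(f'loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s')
```

Note that timeit.timeit() reports the total time across all runs, while the %timeit magic reports the mean time per run.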
Here is a second option! NumPy’s np.where() function has the same logic as Excel’s if() function! I will recreate data so we start from the same point.
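As a stand-alone illustration of the np.where() logic, here is a short sketch with made-up data. The array returns and the labels are assumptions for this example only.

```python
import numpy as np

returns = np.array([0.05, -0.02, 0.00, 0.10, -0.07])

# np.where(condition, value_if_true, value_if_false), like Excel's IF()
labels = np.where(returns > 0, 'up', 'not up')

# Replace negative returns with zero and keep everything else
floored = np.where(returns < 0, 0.0, returns)

print(labels)
print(floored)
```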
3.8 Write a function npmts() that calculates the number of payments that generate \(x\%\) of the present value of a perpetuity.
Your npmts() should accept arguments c1, r, and g that represent \(C_1\), \(r\), and \(g\). The present value of a growing perpetuity is \(PV = \frac{C_1}{r - g}\), and the present value of a growing annuity is \(PV = \frac{C_1}{r - g}\left[ 1 - \left( \frac{1 + g}{1 + r} \right)^t \right]\).
We can use the growing annuity and perpetuity formulas to show: \(x = \left[ 1 - \left( \frac{1 + g}{1 + r} \right)^t \right]\).
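Solving the last equation for \(t\) gives \(t = \frac{\ln(1 - x)}{\ln\left( \frac{1 + g}{1 + r} \right)}\), which requires \(r > g\) and \(0 < x < 1\). Here is one possible sketch of npmts() based on that result. The x argument (a fraction, so 0.5 means 50%) and the argument order are my assumptions.

```python
import numpy as np

def npmts(x, c1, r, g):
    # c1 cancels when we divide the annuity PV by the perpetuity PV,
    # so it does not affect the answer; we keep it to match the exercise
    return np.log(1 - x) / np.log((1 + g) / (1 + r))

# Number of payments that generate 50% of the perpetuity's present value
t = npmts(x=0.5, c1=1, r=0.1, g=0.05)
print(t)

# Check: plugging t back into the formula should recover x
print(1 - ((1 + 0.05) / (1 + 0.1))**t)
```

The result is generally not a whole number, so we could wrap it in np.ceil() if we want the first whole payment that reaches \(x\).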
3.9 Write a function that calculates the internal rate of return given a NumPy array of cash flows.
Here are some data where the \(IRR\) is obvious!
c = np.array([-100, +110])
r = 0.1
First, write a function that calculates net present value (NPV) given cash flows in a NumPy array c and a discount rate in a scalar r. The calc_npv() function below uses NumPy arrays to calculate NPV as: \[NPV = \sum_{t=0}^T \frac{c_t}{(1+r)^t}\]
def calc_npv(r, c):
    t = np.arange(len(c))
    return (c / (1 + r)**t).sum()
calc_npv(r=r, c=c)
-0.0000
We can use a while loop to guess IRR values until we find an NPV close to zero. We can use the Newton-Raphson method to make smarter guesses. If we have function \(f(x)\) and guess \(x_t\), our next guess should be \(x_{t+1} = x_t - \frac{f(x_t)}{f'(x_t)}\). Here our \(f(x)\) is \(NPV(r)\), and we can approximate \(f'(x_t)\) as \(\frac{NPV(r+0.000001) - NPV(r)}{0.000001}\). We will make guesses until \(|NPV| < 0.000001\).
def calc_irr(c, guess=0, tol=1e-6, step=1e-6):
    irr = guess
    npv = calc_npv(r=irr, c=c) # I made this change after class to possibly save us an iteration
    while np.abs(npv) > tol:
        npv = calc_npv(r=irr, c=c)
        deriv = (calc_npv(r=irr+step, c=c) - npv) / step
        irr = irr - npv / deriv
        # print(f'IRR is {irr}, and NPV is {npv}')
    return irr
calc_irr(c)
0.1000
3.10 Write a function returns() that accepts NumPy arrays of prices and dividends and returns a NumPy array of returns.
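One possible sketch follows, assuming the usual total return definition \(r_t = \frac{p_t + d_t - p_{t-1}}{p_{t-1}}\) and that prices and dividends are aligned, equal-length arrays. The example prices and dividends are made up.

```python
import numpy as np

def returns(p, d):
    # r_t = (p_t + d_t - p_{t-1}) / p_{t-1} for t = 1, ..., T
    # p[1:] are current prices, and p[:-1] are lagged prices
    return (p[1:] + d[1:] - p[:-1]) / p[:-1]

prices = np.array([100.0, 110.0, 99.0])
dividends = np.array([0.0, 1.0, 0.0])

r = returns(p=prices, d=dividends)
print(r)
```

The output has one fewer element than the inputs because the first price has no lagged price to compare against.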
3.13 Write functions var() and std() that calculate variance and standard deviation.
NumPy’s .var() and .std() methods return population statistics (i.e., denominators of \(n\)). The pandas equivalents return sample statistics (denominators of \(n-1\)), which are more appropriate for financial data analysis where we have a sample instead of a population.
Both functions should have an argument sample that is True by default, so both functions return sample statistics by default.
Use numbers to compare your functions with NumPy’s .var() and .std() methods.
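Here is a minimal sketch of one way to write these functions. The values in numbers are made up for illustration, and NumPy's ddof=1 argument gives sample statistics for the comparison.

```python
import numpy as np

def var(x, sample=True):
    # Sample variance divides by n - 1; population variance divides by n
    n = len(x)
    denom = n - 1 if sample else n
    return ((x - x.mean())**2).sum() / denom

def std(x, sample=True):
    # Standard deviation is the square root of variance
    return np.sqrt(var(x, sample=sample))

numbers = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Population statistics match NumPy's defaults
print(var(numbers, sample=False), numbers.var())
print(std(numbers, sample=False), numbers.std())

# Sample statistics match NumPy with ddof=1
print(var(numbers), numbers.var(ddof=1))
print(std(numbers), numbers.std(ddof=1))
```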