# Jupyter Notebooks and Numerical Python#

## Jupyter Notebooks#

Jupyter is an interactive front-end to a rich ecosystem of python packages that support machine learning and data science tasks. We will use the core set of packages shown below for this course:

All course notebooks begin with some preamble code to import and configure some frequently used packages:

```
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np
import pandas as pd
```

You might see a warning about “building the font cache” if this is your first time using the course conda environment, but any other message probably indicates that you do not have your environment set up correctly yet.

Here are the actual package versions we are using:

```
import matplotlib
print('matplotlib', matplotlib.__version__)
print('seaborn', sns.__version__)
print('numpy', np.__version__)
print('pandas', pd.__version__)
```

You will get plenty of experience with notebooks during this course so we will start with just a few basic survival skills. For a deeper dive, start with IPython: Beyond Normal Python in the Python Data Science Handbook.

Type `command?`

to display help on `command`

. Try this now for the built-in `range`

function then dismiss the help window.

Hit the tab key to display possible completions of a partial name. Try this now by typing `np.arc`

below then hitting TAB.

Your notebook will sometimes be unresponsive while executing a long command that you don’t want to wait for. Use the *Interrupt* command in the *Kernel* menu to recover from situations like this. Try this out now by:

Uncomment the

`time.sleep(3600)`

line below, which sleeps for an hour.Execute the cell and note the

`In [*]`

to the left while it is running.Interrupt the kernel.

```
import time
#time.sleep(3600)
```

In more extreme cases you may need to use the *Restart* command, which puts your kernel in a well-defined initial state (good) but also forgets any variables or functions you have already defined (bad). Try this now by:

Re-running the sleep cell above.

Restarting the kernel. Note that this leaves

`[*]`

next to the cell, which is misleading.Comment out the

`time.sleep(3600)`

line above.Rebuild the kernel state using

*Run All Above*from the*Cell*menu. This fixes the`[*]`

.

## Numerical Python#

This section introduces the “numpy” package. For further reading, see Introduction to NumPy in the Python Data Science Handbook.

### Rationale#

The problem that numpy addresses is that python lists are very flexible, but not optimized for the special case where all list elements are numeric values of the same type, which can be very efficiently organized in memory. Numpy provides a specialized array with lots of nice features. One downside of this approach is that most of builtin math functions are duplicated (e.g., `math.sin`

and `np.sin`

) to work with numpy arrays.

In this example, we fill a list with values \(\sin(\pi k / n)\) using plain python and the builtin math package (if the use of spaces in `math.pi * k / n`

looks odd, see PEP8):

```
import math
```

```
def create_python_array(n=100):
x = []
for k in range(n):
x.append(math.sin(math.pi * k / n))
return x
```

Use an ipython magic command to time this function for a large array:

```
%time x1 = create_python_array(1000000)
```

The resulting list is very flexible so, for example, we can replace any value with a completely different type or append new values:

```
x1[3] = 'not a number'
x1.append(-1)
```

The corresponding python code uses numpy functions and does not require any python loop (which is often a bottleneck):

```
def create_numpy_array(n=100):
k = np.arange(n)
x = np.sin(np.pi * k / n)
return x
```

```
%time x2 = create_numpy_array(1000000)
```

The resulting numpy array holds only 64-bit floating point numbers and has a fixed size:

```
try:
x2[2] = 'not a number'
except ValueError as e:
print(e)
```

### Create Arrays#

The most common ways you will create arrays are:

Filled with a regular sequence of values.

Filled with random values.

Calculated as a mathematical function of other arrays.

The exercises below will give you some practice, with pointers to the `numpy functions`

to use (Use the builtin notebook help to learn more about them).

**EXAMPLE:** Use `np.arange`

to create an array of integers 0, 1, …, 9:

```
np.arange(10)
```

**EXAMPLE:** Use `np.linspace`

to create an array of 11 floats 0.0, 0.1, …, 0.9, 1.0:

```
np.linspace(0., 1., 11)
```

**EXAMPLE:** Use `np.random.RandomState`

to initialize a reproducible (pseudo)random number generator using the seed 123:

```
generator = np.random.RandomState(seed=123)
```

**EXAMPLE:** Use `np.random.uniform`

to generate an array of 10 uniformly distributed random numbers in the interval [0,1]:

```
generator.uniform(size=10)
```

**EXERCISE:** Use `np.random.normal`

to generate an array of 10 random numbers drawn from a normal (aka Gaussian) distribution with mean -1 and standard deviation 2:

```
generator.normal(loc=-1, scale=2, size=10)
```

**EXAMPLE:** Generate arrays x, y, z of length 10 with values drawn from a “unit” Gaussian distribution (mean=0, std=1):

```
x = generator.normal(size=10)
y = generator.normal(size=10)
z = generator.normal(size=10)
```

**EXAMPLE:** Adjust the values in x, y, z so that the vectors (x,y,z) are normalized. (This is a useful trick for generating random points that are uniformly distributed on a unit sphere).

```
r = np.sqrt(x ** 2 + y ** 2 + z ** 2)
x /= r
y /= r
z /= r
```

**EXAMPLE:** Calculate arrays `theta`

and `phi`

containing the polar angles **in degrees** for each vector (x, y, z). Use `np.arccos`

, `np.arctan2`

, `np.degrees`

.

```
theta = np.degrees(np.arccos(z))
phi = np.degrees(np.arctan2(y, x))
```

### Multidimensional Arrays#

The examples above all create 1D arrays, but numpy arrays can have multiple dimensions (aka “shapes”):

```
vector = np.ones(shape=(4,))
matrix = np.ones(shape=(3, 4))
tensor = np.ones(shape=(2, 3, 4))
```

Access array elements with zero-based indexing:

```
vector[1] # single element
```

```
matrix[1, :] # single row
```

```
matrix[:, 1] # single column
```

```
tensor[0, :2, :2] # submatrix
```

Use `np.dot`

to calculate inner products:

```
vector.dot(vector)
```

```
matrix.dot(vector)
```

Arrays of the same shape can be combined in “vectorized” operations (i.e., without any explicit loops):

```
matrix + np.sin(matrix ** 2)
```

Arrays of different shapes can also be combined when they are compatible according to the broadcasting rules. Note that the broadcasting rules do not depend on the operations being used so, for example, if `x + y`

is valid, then so is `x * y ** 2 + x ** 2 * y`

.

```
matrix + 0.1
```

```
vector + matrix
```

Define a function to try broadcasting two arbitrary shapes:

```
def broadcast(shape1, shape2):
array1 = np.ones(shape1)
array2 = np.ones(shape2)
try:
array12 = array1 + array2
print('shapes {} {} broadcast to {}'.format(shape1, shape2, array12.shape))
except ValueError as e:
print(e)
```

```
broadcast((1, 3), (3,))
```

```
broadcast((1, 2), (3,))
```

**EXERCISE:** Predict the results of the following:

```
broadcast((3, 1, 2), (3, 2))
broadcast((2, 1, 3), (3, 2))
broadcast((3,), (2, 1))
broadcast((3,), (1, 2))
broadcast((3,), (1, 3))
```

```
broadcast((3, 1, 2), (3, 2))
broadcast((2, 1, 3), (3, 2))
broadcast((3,), (2, 1))
broadcast((3,), (1, 2))
broadcast((3,), (1, 3))
```

Since array values are always “unrolled” into physical memory, arrays with different shapes can look identical in memory. Numpy takes advantage of this with “views” that allow the same memory to be very efficiently interpreted with different shapes:

The `np.newaxis`

method is useful to increase the dimensionality of an array, especially when used in combination with numpy’s broadcast functionality.

```
np.arange(4).shape
```

```
# make it as row vector by inserting an axis along first dimension
np.arange(4)[np.newaxis, :].shape
```

```
# make it as column vector by inserting an axis along second dimension
np.arange(4)[:, np.newaxis].shape
```

**EXAMPLE:** Use `np.reshape`

to create a view of `np.arange(12)`

with shape (2, 1, 6):

```
np.arange(12).reshape(2, 1, 6)
```

**EXAMPLE:** Use `np.newaxis`

to create a view of `np.arange(12)`

with shape (1, 12):

```
np.arange(12)[np.newaxis, :]
```

## Acknowledgements#

Initial version: Mark Neubauer

© Copyright 2023