030 Numerical Computing with NumPy
COM6018
Copyright © 2023–2025 Jon Barker, University of Sheffield. All rights reserved.
1. Introducing NumPy
1.1 What is NumPy?
NumPy is a core Python package for scientific computing that
provides a powerful N-dimensional array object,
provides highly optimised linear algebra tools,
has tight integration with C/C++ and Fortran code,
is licensed under a BSD license, i.e., it is freely reusable.
1.2 Importing NumPy
NumPy is conventionally imported using:
import numpy as np
The Python keyword as allows us to use np as a shorthand to refer to the numpy module in our code. This is a common convention, so your code will be more readable if you follow it.
2. NumPy Arrays
2.1 Generating a NumPy ndarray
The NumPy package introduces a type called numpy.ndarray that is used for representing N-dimensional arrays. An N-dimensional array is a data structure that can represent vectors, matrices, and higher-dimensional data. For example, a vector is a 1-D array, a matrix is a 2-D array, and so on.
Arrays can be generated in several ways:
from Python lists containing numeric data,
using NumPy array-generating functions, or
by reading data from a file.
2.2 Generating arrays from lists
In the example below, we generate a 1-D array from a Python list. We start with the Python list my_list and then use the NumPy function np.array to convert it to a NumPy array, which we store in the variable my_array.
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list) # create a simple 1-D array
print(my_list)
print(my_array)
print(type(my_list))
print(type(my_array))
[1, 2, 3, 4, 5]
[1 2 3 4 5]
<class 'list'>
<class 'numpy.ndarray'>
Note the slight difference in appearance when we print a NumPy array versus printing a Python list.
We will now construct a 2-D array (i.e., a matrix) from a list of lists,
my_2d_array = np.array([[1., 2, 3], [4, 5, 6], [7, 8, 9]])
print(my_2d_array)
[[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
print(type(my_2d_array))
<class 'numpy.ndarray'>
2.3 The ndarray object’s properties
The ndarray has various properties that we can access. The most important are shape, size, and dtype.
The shape property is a tuple that gives the size of each dimension of the array.
print(my_2d_array.shape)
(3, 3)
The size property gives the total number of elements in the array.
print(my_2d_array.size)
9
The dtype property gives the data type of the array’s elements (e.g., whether they are integers or floats).
print(my_2d_array.dtype)
float64
Tip: when working with NumPy arrays, it is often useful to check the shape and dtype properties to make sure you are working with the correct type of array. Printing these properties is also a useful debugging technique and is often more useful than printing the array’s contents.
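A minimal sketch of the tip above, using two made-up arrays (the names ints and mixed are illustrative and not part of the notes). It shows that a single float in the input list is enough to change the dtype of the whole array, which is why my_2d_array above has dtype float64:

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers -> an integer dtype
mixed = np.array([1., 2, 3])    # one float is enough -> float64

# Print the properties rather than the contents
for name, arr in [("ints", ints), ("mixed", mixed)]:
    print(name, arr.shape, arr.size, arr.dtype)
```

The exact integer dtype printed for ints can depend on the platform and NumPy version, so it is worth checking rather than assuming.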
2.4 N-dimensional arrays
NumPy generalises arrays to be N-dimensional. Although we are often working with 1-D or 2-D data there are applications where higher dimensions are useful. For example, 3-D data appears often when processing video (x, y, and time) and 4-D data appears in medical imaging (x, y, z and time). In mathematics, N-dimensional arrays are often called tensors. (This is why Google’s deep learning library is called TensorFlow.)
In the example below, we first create a 2-D array (x2) of shape 2 by 2 and then we make a 3-D array (x3) using a list of four of these 2-D arrays.
x2 = np.array([[1, 2], [3, 4]]) # a matrix
x3 = np.array([x2, x2, x2, x2]) # stacking four matrices
print(x3.shape)
(4, 2, 2)
The resulting array has shape (4, 2, 2), corresponding to four 2×2 matrices stacked along a new dimension.
We can then repeat this process to create a 4-D array (x4) by stacking five copies of the 3-D array.
x2 = np.array([[1, 2], [3, 4]]) # a matrix
x3 = np.array([x2, x2, x2, x2]) # stacking four matrices
x4 = np.array([x3, x3, x3, x3, x3]) # stacking 5 3-D structures
print(x4.shape)
(5, 4, 2, 2)
The array x4 has shape (5, 4, 2, 2), corresponding to five 3-D arrays stacked along a new dimension.
Then we can stack these 4-D arrays to make 5-D arrays.
x2 = np.array([[1, 2], [3, 4]]) # a matrix
x3 = np.array([x2, x2, x2, x2]) # stacking four matrices
x4 = np.array([x3, x3, x3, x3, x3]) # stacking 5 3-D structures
x5 = np.array([x4, x4]) # stacking 2 4-D structures!
print(x5.shape)
(2, 5, 4, 2, 2)
The array x5 has shape (2, 5, 4, 2, 2), corresponding to two 4-D arrays stacked along a new dimension.
In COM6018, however, we will mostly use N=1 (vectors) and N=2 (matrices).
3. Generating NumPy arrays
3.1 Basic array generating functions
NumPy provides a number of functions for generating arrays of various kinds.
For example, generating a 1-D array of consecutive integers,
x = np.arange(10)
print(x)
[0 1 2 3 4 5 6 7 8 9]
Or an array of evenly spaced numbers,
x = np.arange(100, 110, 2) # start, stop, step
print(x)
[100 102 104 106 108]
The linspace function is similar to arange but allows you to specify the number of points rather than the step size,
x = np.linspace(10, 20, 5) # start, stop, n-points
print(x)
[10. 12.5 15. 17.5 20. ]
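One difference between the two functions that is easy to miss: arange stops before the stop value, whereas linspace includes it by default. A small sketch with made-up ranges:

```python
import numpy as np

a = np.arange(0, 10, 2)      # start, stop, step: stop (10) is excluded
b = np.linspace(0, 10, 6)    # start, stop, n-points: stop (10) is included

print(a.tolist())   # [0, 2, 4, 6, 8]
print(b.tolist())   # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

Note also that arange with integer arguments returns integers, while linspace returns floats.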
We often need arrays full of zeros or ones. NumPy provides functions for this,
x = np.zeros( (3, 3, 3) ) # Note, argument is a tuple
print(x)
[[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]]
Note that the zeros() function takes the desired array shape as a single argument. This argument is a tuple. For example, in the above, the argument has the value (3, 3, 3), which means that we want a 3-D array with 3 elements in each dimension. I have placed spaces inside the parentheses to make it clear that this is a tuple. However, the spaces are not necessary and you will more often see this written as np.zeros((3,3,3)). (When you see it written like this, do not be confused into thinking that the double brackets are redundant. You cannot rewrite this as np.zeros(3,3,3). This is computer programming, not mathematics.)
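To see why the inner parentheses matter, the sketch below tries both forms. Without the tuple, NumPy interprets the second argument as the dtype parameter and raises a TypeError (the variable names ok and raised are only for illustration):

```python
import numpy as np

ok = np.zeros((3, 3, 3))    # shape passed as a single tuple
print(ok.shape)             # (3, 3, 3)

raised = False
try:
    np.zeros(3, 3, 3)       # the second 3 is taken as the dtype argument
except TypeError as err:
    raised = True
    print("TypeError:", err)
```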
Similarly, to make an array full of ones we can use,
x = np.ones((2, 5))
print(x)
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
3.2 More array generating functions
The diag function can be used to generate diagonal matrices where we specify the numbers that we want to appear on the leading diagonal. For example, a 3 x 3 matrix with 4, 5 and 3 along the diagonal can be generated with,
x = np.diag((4, 5, 3))
print(x)
[[4 0 0]
[0 5 0]
[0 0 3]]
There is an optional parameter k that allows us to instead specify a diagonal that is displaced from the leading diagonal. This is most easily explained with an example,
x = np.diag((2, 2), k=3)
print(x)
[[0 0 0 2 0]
[0 0 0 0 2]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]
By summing matrices of this form we can make any banded-diagonal matrix, e.g.,
x = np.diag((1, 1, 1)) + np.diag((2, 2), k=1) + np.diag((2, 2), k=-1)
print(x)
[[1 2 0]
[2 1 2]
[0 2 1]]
Using diag we could make an identity matrix, which has 1s along the leading diagonal. However, because the identity matrix is used so often, NumPy provides a dedicated function for generating it, eye, which in its simplest form takes a single parameter that determines the number of rows and columns,
x = np.eye(6)
print(x)
[[1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 1.]]
3.3 Arrays initialised with random numbers
NumPy provides a number of functions for generating arrays of random numbers. The most useful are rand and randn, which appear in the submodule numpy.random. rand uses random numbers that are uniformly distributed between 0 and 1. randn uses random numbers that are normally distributed with mean 0 and standard deviation 1.
For example, to generate a 2 by 4 matrix of random numbers,
np.random.rand(2, 4) # uniform distribution between 0 and 1
array([[0.57577207, 0.35606156, 0.42857438, 0.46082787],
[0.68367578, 0.55961875, 0.2127753 , 0.04693685]])
or,
np.random.randn(2, 4) # standard normal distribution
array([[-0.80463969, 0.433091 , -0.66090184, -1.8915296 ],
[-0.0403259 , -2.33267021, 1.06055345, -0.6664819 ]])
Modern note: the preferred interface in current NumPy is a Generator object created with np.random.default_rng():
rng = np.random.default_rng()
rng.standard_normal((2, 4)) # same distribution as np.random.randn(2, 4)
array([[-1.68497697, -0.05638719, 0.05909045, -0.01831279],
[ 1.46499569, 0.22413766, -0.77604908, 0.22800631]])
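A practical benefit of the generator interface is that passing a seed makes results reproducible. The sketch below is illustrative only: the seed value 42 is arbitrary, and the variable names are made up.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)   # 42 is an arbitrary seed
rng_b = np.random.default_rng(seed=42)   # second generator, same seed

u_a = rng_a.random((2, 4))               # uniform on [0, 1), like rand
u_b = rng_b.random((2, 4))
n_a = rng_a.standard_normal((2, 4))      # standard normal, like randn

print(np.array_equal(u_a, u_b))          # True: same seed, same numbers
```

Calling default_rng() with no seed, as in the note above, gives different numbers on every run.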
3.4 Reading arrays from files
Finally, we might want to generate arrays by reading data from a file. NumPy provides a number of functions for this,
genfromtxt and savetxt for reading and writing to text files.
load and save for reading and writing in NumPy’s native format.
For the example below, we will use data from a text file data/liver_data_20.txt which has 20 rows of data stored in CSV format,
cat data/liver_data_20.txt
To read this into a NumPy array we simply use,
x = np.genfromtxt("data/liver_data_20.txt", delimiter=",") # for reading a csv file
print(x)
The delimiter parameter specifies that the data are separated by commas. (The default is to assume that the data are separated by whitespace.)
Saving a NumPy array to a file is also easy,
x = np.genfromtxt("data/liver_data_20.txt", delimiter=",") # for reading a csv file
np.savetxt("data/matrix.tsv", x, delimiter="\t", fmt="%.5f")
Here we have saved the data in tab-separated format with 5 decimal places of precision.
cat data/matrix.tsv
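The following self-contained sketch does not rely on the course data files. It writes a small made-up array to a temporary directory and reads it back, first as text with savetxt and genfromtxt, then in NumPy's native format with save and load. The file names example.csv and example.npy are hypothetical.

```python
import os
import tempfile
import numpy as np

data = np.array([[1.0, 2.5], [3.0, 4.5], [5.0, 6.5]])  # made-up values

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "example.csv")   # hypothetical file names
    npy_path = os.path.join(tmp, "example.npy")

    # Text format: human readable, precision limited by fmt
    np.savetxt(csv_path, data, delimiter=",", fmt="%.5f")
    from_csv = np.genfromtxt(csv_path, delimiter=",")

    # Native binary format: exact values and dtype are preserved
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

print(np.allclose(from_csv, data), np.array_equal(from_npy, data))  # True True
```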
4. Array manipulation
4.1 Indexing and slicing
Indexing is similar to Python lists.
x = np.array([1, 2, 3, 4, 5, 6, 7])
print(x[0])
print(x[2:5])
print(x[:4])
print(x[4:])
1
[3 4 5]
[1 2 3 4]
[5 6 7]
But it is generalised to N dimensions: we give one index or slice per dimension, separated by commas.
x = np.random.rand(5, 5)
print(x[2:4, :2])
[[0.68039442 0.54998793]
[0.33444001 0.7828657 ]]
4.2 Extracting a row or column vector from a matrix
A = np.genfromtxt("data/test_matrix.txt")
print(A)
print(A[2, 1:4]) # extract elements 1..3 of row 2
print(A[2]) # extract the whole of row 2
print(A[2, 1:4].shape)
print(A[:, 2]) # extract column 2
print(A[:, 2].shape)
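The same indexing can be tried without the data file. The sketch below uses a made-up 4 by 5 matrix as a stand-in for the contents of data/test_matrix.txt (the real file's values are not assumed). It also shows that extracting a column with a single index gives a 1-D vector, whereas using a slice keeps it as a 2-D one-column matrix.

```python
import numpy as np

A = np.arange(20).reshape(4, 5)   # stand-in matrix, values 0..19

row_part = A[2, 1:4]    # elements 1..3 of row 2
col = A[:, 2]           # column 2 as a 1-D vector
col_2d = A[:, 2:3]      # column 2 kept as a 4x1 matrix

print(row_part.tolist(), row_part.shape)   # [11, 12, 13] (3,)
print(col.tolist(), col.shape)             # [2, 7, 12, 17] (4,)
print(col_2d.shape)                        # (4, 1)
```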
4.3 Some basic operations
The NumPy ndarray object has many methods, e.g., min, max, sum, prod, mean and var.
x = np.array([1, 2, 3, 4, 5, 6])
print(x.min(), x.max())
1 6
print(x.sum(), x.prod())
21 720
print(x.mean(), x.var())
3.5 2.9166666666666665
These operations can be applied to arrays with more than one dimension.
A = np.genfromtxt("data/test_matrix.txt")
print(A)
print(A.min(), A.max())
mean_values = A.mean(axis=0)
sum_values = A.sum(axis=0)
print(mean_values)
print(sum_values)
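The axis parameter controls which dimension is collapsed: axis=0 operates down the rows and gives one value per column, axis=1 operates across the columns and gives one value per row. A minimal sketch with a made-up 2 by 3 matrix standing in for the file data:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])          # stand-in 2x3 matrix

overall = A.mean()                    # over all elements
col_means = A.mean(axis=0)            # one value per column
row_means = A.mean(axis=1)            # one value per row

print(overall)                # 3.5
print(col_means.tolist())     # [2.5, 3.5, 4.5]
print(row_means.tolist())     # [2.0, 5.0]
```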
5. Working with NumPy arrays
5.1 Reshaping and resizing
It is sometimes necessary to wrap a vector into a matrix, or unwrap a matrix into a vector.
M = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(3, 3)
print(M)
[[1 2 3]
[4 5 6]
[7 8 9]]
v = M.reshape(9)
print(v)
[1 2 3 4 5 6 7 8 9]
# The following line will generate an error
# because reshape cannot change the number of elements.
v = M.reshape(8)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[45], line 4
1 # The following line will generate an error
2 # because reshape cannot change the number of elements.
----> 4 v = M.reshape(8)
ValueError: cannot reshape array of size 9 into shape (8,)
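Because the total number of elements is fixed, one dimension can be left for NumPy to work out by passing -1. A short sketch with a made-up 12-element array:

```python
import numpy as np

M = np.arange(1, 13).reshape(3, 4)   # 12 elements as a 3x4 matrix

flat = M.reshape(-1)                 # NumPy infers a length of 12
wide = M.reshape(2, -1)              # NumPy infers 6 columns

print(flat.shape, wide.shape)        # (12,) (2, 6)
```

Only one dimension may be given as -1, and the remaining dimensions must divide the number of elements exactly.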
5.2 Adding a new dimension
You can easily turn 1-D vectors into 2-D matrices.
v = np.array([1, 2, 3, 4, 5])
print(v)
print(v.shape)
[1 2 3 4 5]
(5,)
v_row = v[np.newaxis, :] # turn a vector into a 1-row matrix
print(v_row)
print(v_row.shape)
[[1 2 3 4 5]]
(1, 5)
v_col = v[:, np.newaxis] # turn a vector into a 1-column matrix
print(v_col)
print(v_col.shape)
[[1]
[2]
[3]
[4]
[5]]
(5, 1)
5.3 Stacking arrays
Arrays with compatible dimensions can be joined horizontally (hstack) or vertically (vstack).
x = np.ones((2, 3))
y = np.zeros((2, 2))
z = np.hstack((x, y, x)) # note, arrays passed as a tuple
print(x)
print(y)
print(z)
print(z.shape)
[[1. 1. 1.]
[1. 1. 1.]]
[[0. 0.]
[0. 0.]]
[[1. 1. 1. 0. 0. 1. 1. 1.]
[1. 1. 1. 0. 0. 1. 1. 1.]]
(2, 8)
x = np.ones((2, 2))
y = np.zeros((1, 2))
z = np.vstack((x, y))
print(z)
print(z.shape)
[[1. 1.]
[1. 1.]
[0. 0.]]
(3, 2)
5.4 Tiling and repeating
x = np.array([[1, 2], [3, 4]])
y = np.tile(x, 3)
print(y)
[[1 2 1 2 1 2]
[3 4 3 4 3 4]]
y = np.tile(x, (2, 4))
print(y)
[[1 2 1 2 1 2 1 2]
[3 4 3 4 3 4 3 4]
[1 2 1 2 1 2 1 2]
[3 4 3 4 3 4 3 4]]
y = np.repeat(x, 4)
print(y)
[1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4]
y = np.repeat(x, 4, axis=0)
print(y)
[[1 2]
[1 2]
[1 2]
[1 2]
[3 4]
[3 4]
[3 4]
[3 4]]
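The difference between the two functions is easiest to see on a 1-D array: tile repeats the whole array, whereas repeat repeats each element in turn. A minimal sketch (the array v is made up for illustration):

```python
import numpy as np

v = np.array([1, 2, 3])

tiled = np.tile(v, 2)        # whole array, twice
repeated = np.repeat(v, 2)   # each element, twice

print(tiled.tolist())        # [1, 2, 3, 1, 2, 3]
print(repeated.tolist())     # [1, 1, 2, 2, 3, 3]
```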
6. Copying
6.1 Shallow copy
Arrays are handled by reference.
When you write B = A you are just copying a reference, not the data itself: both names then refer to the same array.
A = np.array([1, 2, 3, 4, 5, 6])
B = A
B[0] = 10
print(A)
[10 2 3 4 5 6]
Note that this is also true for Python lists and objects.
A = [1, 2, 3, 4, 5, 6]
B = A
B[0] = 10
print(A)
[10, 2, 3, 4, 5, 6]
So, how do we make a real copy?
6.2 Deep Copy
To actually copy the data stored in the array, we use the NumPy copy method.
A = np.array([1, 2, 3, 4, 5, 6])
B = A.copy() # can also write, B = np.copy(A)
B[0] = 10
print(A)
[1 2 3 4 5 6]
Note, to deep copy Python lists, we can use the standard copy module, which must first be imported.
import copy
A = [1, 2, 3, 4, 5, 6]
B = copy.deepcopy(A)
print(B)
[1, 2, 3, 4, 5, 6]
(Don’t confuse NumPy ndarrays and Python lists…)
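For Python lists, the distinction between a shallow and a deep copy only shows up when the list contains other mutable objects. The sketch below uses a made-up nested list to illustrate this with the standard library functions copy.copy and copy.deepcopy:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, inner lists are shared
deep = copy.deepcopy(nested)     # inner lists are duplicated too

nested[0][0] = 99                # modify the original

print(shallow[0][0])   # 99: the inner list is shared with the original
print(deep[0][0])      # 1: the deep copy is fully independent
```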
7. Matrix operations
NumPy implements all common array operations,
addition, subtraction,
transpose,
multiplication,
inverse
7.1 Array addition and subtraction
X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.ones((2, 3))
print(X)
print(Y)
[[1 2 3]
[4 5 6]]
[[1. 1. 1.]
[1. 1. 1.]]
print(X + Y)
[[2. 3. 4.]
[5. 6. 7.]]
Z = X - 2 * Y # note, scalar multiplication
print(Z)
[[-1. 0. 1.]
[ 2. 3. 4.]]
Z = X + np.array([[2, 2], [2, 2]])
print(Z)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[65], line 1
----> 1 Z = X + np.array([[2, 2], [2, 2]])
2 print(Z)
ValueError: operands could not be broadcast together with shapes (2,3) (2,2)
7.2 Broadcasting
During operations, NumPy will try to repeat an array to make dimensions fit. This is called ‘broadcasting’. It is convenient but it can be confusing.
X = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([1, 1, 1])
print(X)
print(row)
[[1 2 3]
[4 5 6]]
[1 1 1]
print(X + row)
[[2 3 4]
[5 6 7]]
col = np.array([1, 2])
print(col)
print(X + col[:, np.newaxis])
[1 2]
[[2 3 4]
[6 7 8]]
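The rule NumPy applies is to compare the two shapes starting from the last dimension: each pair of dimensions must either be equal or one of them must be 1 (a missing leading dimension counts as 1). A common use is centring the columns or rows of a data matrix, sketched below with made-up values. The helper np.broadcast_shapes, which reports the resulting shape, is available in NumPy 1.20 and later.

```python
import numpy as np

X = np.array([[1., 2., 3.],
              [4., 5., 6.]])                 # shape (2, 3)

col_means = X.mean(axis=0)                   # shape (3,)
row_means = X.mean(axis=1, keepdims=True)    # shape (2, 1)

centred_cols = X - col_means                 # (2, 3) with (3,)   -> (2, 3)
centred_rows = X - row_means                 # (2, 3) with (2, 1) -> (2, 3)

print(np.broadcast_shapes((2, 3), (3,)))     # (2, 3)
print(centred_cols.tolist())   # [[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]
print(centred_rows.tolist())   # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
```

The keepdims=True argument plays the same role as col[:, np.newaxis] in the example above: it keeps the row means as a column so that they line up with the rows of X.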
7.3 Transpose
Transposing a matrix swaps the rows and columns.
A = np.array([[1, 2, 3], [4, 5, 6]])
print(A)
print(A.T)
[[1 2 3]
[4 5 6]]
[[1 4]
[2 5]
[3 6]]
A = np.array([[1, 2, 3], [4, 5, 6]])
print(A.shape)
print(A.T.shape)
(2, 3)
(3, 2)
v = np.array([1, 2, 3, 4, 5])
print(v)
print(v.T) # Vectors only have one dimension. Transpose does nothing.
[1 2 3 4 5]
[1 2 3 4 5]
v = np.array([1, 2, 3, 4, 5])
print(v.shape)
print(v.T.shape)
(5,)
(5,)
7.4 A note on vectors versus “skinny” matrices
When using NumPy, a vector is not the same as a matrix with one column.
v = np.array([1, 2, 3, 4, 5]) # A vector - has 1 dimension
print(v)
print(v.T)
print(v.shape)
[1 2 3 4 5]
[1 2 3 4 5]
(5,)
M_row = np.array([[1, 2, 3, 4, 5]]) # A matrix - has 2 dimensions
print(M_row)
print(M_row.shape)
[[1 2 3 4 5]]
(1, 5)
M_col = np.array([[1, 2, 3, 4, 5]]).T # A matrix can be transposed
print(M_col)
print(M_col.shape)
[[1]
[2]
[3]
[4]
[5]]
(5, 1)
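The distinction matters in practice because broadcasting treats the three shapes differently. In this small sketch (made-up values), adding a vector to itself gives a vector, but adding a one-column matrix to a one-row matrix gives a full matrix:

```python
import numpy as np

v = np.array([1, 2, 3])      # vector, shape (3,)
v_col = v.reshape(-1, 1)     # one-column matrix, shape (3, 1)
v_row = v.reshape(1, -1)     # one-row matrix, shape (1, 3)

same = v + v                 # elementwise, shape (3,)
grid = v_col + v_row         # broadcast to shape (3, 3)

print(same.tolist())         # [2, 4, 6]
print(grid.tolist())         # [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
```

This is one reason why checking the shape property, as suggested earlier, catches many bugs.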
7.5 Multiplication
The * operator performs ‘elementwise’ multiplication.
A = np.array([[1, 2, 3], [4, 5, 6]])
X = A * A
print(X)
[[ 1 4 9]
[16 25 36]]
Standard ‘matrix multiplication’ is performed using the dot function.
X = np.dot(A, A.T) # Multiply 2x3 matrix A and 3x2 matrix A.T (AA')
print(X)
[[14 32]
[32 77]]
or equivalently using the @ operator,
X = A @ A.T # Modern matrix multiplication operator
print(X)
[[14 32]
[32 77]]
v = np.array([1, 2, 3])
x = A @ v # Multiply 2x3 matrix A and 3-element vector (Av)
print(x)
[14 32]
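For a square matrix both operators are valid, so it is easy to use the wrong one without getting an error. A small sketch with a made-up 2 by 2 matrix shows that the results differ:

```python
import numpy as np

B = np.array([[1, 2], [3, 4]])

elementwise = B * B     # each element squared
matmul = B @ B          # rows of B times columns of B

print(elementwise.tolist())   # [[1, 4], [9, 16]]
print(matmul.tolist())        # [[7, 10], [15, 22]]
```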
7.6 Matrix inverse
The matrix determinant and the inverse function are provided by the linalg submodule of NumPy.
A = np.array([[2, 1], [3, 2]])
print(A)
[[2 1]
[3 2]]
det_A = np.linalg.det(A)
print(det_A)
0.9999999999999998
inv_A = np.linalg.inv(A)
print(inv_A)
[[ 2. -1.]
[-3. 2.]]
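The determinant above prints as 0.9999999999999998 rather than exactly 1 because of floating-point rounding, so results like these are best compared with a tolerance. The sketch below checks the inverse with np.allclose and also solves a linear system with np.linalg.solve, which is usually preferable to forming the inverse explicitly. The right-hand side b is made up for illustration.

```python
import numpy as np

A = np.array([[2., 1.], [3., 2.]])
b = np.array([1., 2.])              # made-up right-hand side

inv_A = np.linalg.inv(A)
print(np.allclose(A @ inv_A, np.eye(2)))     # True
print(np.isclose(np.linalg.det(A), 1.0))     # True

x = np.linalg.solve(A, b)           # solves A x = b directly
print(np.allclose(A @ x, b))        # True
```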
8. Advanced Indexing and Performance
8.1 Boolean indexing and masking
NumPy arrays can be indexed using Boolean conditions. This allows you to filter or modify data without writing loops.
x = np.arange(10)
mask = x % 2 == 0 # create a Boolean mask for even numbers
even_numbers = x[mask] # Use mask to extract even numbers
print(x)
print(mask)
print(even_numbers)
[0 1 2 3 4 5 6 7 8 9]
[ True False True False True False True False True False]
[0 2 4 6 8]
Boolean masks can also be used for assignment:
# If the value is less than 5 then set it to -1
x[x < 5] = -1
print(x)
[-1 -1 -1 -1 -1 5 6 7 8 9]
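Conditions can be combined with & (and) and | (or); each condition needs its own parentheses because these operators bind more tightly than the comparisons. The sketch below also uses np.where, which builds a new array instead of modifying x in place:

```python
import numpy as np

x = np.arange(10)

mid = x[(x > 2) & (x < 7)]          # parentheses are required
clipped = np.where(x < 5, -1, x)    # new array; x itself is unchanged

print(mid.tolist())       # [3, 4, 5, 6]
print(clipped.tolist())   # [-1, -1, -1, -1, -1, 5, 6, 7, 8, 9]
print(x.tolist())         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```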
8.2 Fancy indexing
NumPy also allows arrays or lists of indices to be used to select elements.
x = np.arange(10, 20)
indices = [0, 2, 5]
print(x[indices])
[10 12 15]
You can also pass arrays of row and column indices for multidimensional arrays:
A = np.arange(12).reshape(3, 4)
print(A)
rows = [0, 2]
cols = [1, 3]
print(A[rows, cols]) # selects A[0,1] and A[2,3]
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[ 1 11]
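Note that the row and column lists are paired up element by element. If what you want instead is the sub-matrix formed by every combination of the chosen rows and columns, one option is the helper np.ix_, sketched here on the same array:

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
rows = [0, 2]
cols = [1, 3]

pairs = A[rows, cols]            # A[0, 1] and A[2, 3]
block = A[np.ix_(rows, cols)]    # rows 0 and 2 crossed with columns 1 and 3

print(pairs.tolist())   # [1, 11]
print(block.tolist())   # [[1, 3], [9, 11]]
```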
8.3 Vectorisation vs loops
One of the main reasons to use NumPy is speed. Operations on whole arrays (vectorised code) are much faster than looping in Python.
import timeit
x = np.random.rand(1000000)
# Vectorised sum
print(timeit.timeit("np.sum(x)", globals=globals(), number=10))
# Python loop sum
def py_sum(arr):
    s = 0
    for v in arr:
        s += v
    return s
print(timeit.timeit("py_sum(x)", globals=globals(), number=10))
0.002135269000007156
0.7491466349999882
You should see that the NumPy version is orders of magnitude faster than the pure Python loop.
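Vectorising usually means replacing a loop over elements with arithmetic on whole arrays. As an illustrative sketch (the task, computing the Euclidean distance between two made-up random vectors, is not from the notes), here is the same calculation written both ways:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
x = rng.random(1000)
y = rng.random(1000)

# Loop version: accumulate squared differences one element at a time
total = 0.0
for xi, yi in zip(x, y):
    total += (xi - yi) ** 2
dist_loop = total ** 0.5

# Vectorised version: the same computation on whole arrays
dist_vec = np.sqrt(np.sum((x - y) ** 2))

print(np.isclose(dist_loop, dist_vec))   # True
```

The two results agree to within floating-point rounding; the vectorised form is shorter and avoids the Python-level loop.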
9. Summary
NumPy provides tools for numeric computing.
With NumPy, Python becomes a usable alternative to MATLAB.
NumPy’s basic type is the ndarray; it can represent vectors, matrices, etc.
Lots of tools for vector and matrix manipulation.
This lecture has only reviewed the most commonly used.
For the full documentation, see https://numpy.org/doc/stable/