015 Further Python

COM6018

1. List Comprehensions

A list comprehension is a compact syntactical construct for performing efficient list processing.

Consider the code below for making a list of the squared values of the numbers 1 to 10,

squared_values = []
for x in range(1, 11):
    squared_values.append(x**2)
print(squared_values)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

This can be written more compactly using a list comprehension, as follows,

squared_values = [x**2 for x in range(1, 11)]
print(squared_values)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

There are several advantages to using list comprehensions over the standard for loop approach. The list comprehension version has fewer lines and therefore there are fewer opportunities for the programmer to make an error. It is also more efficient, i.e., it will run a little faster. Finally, although it might seem a little harder to read at first, this is mostly due to unfamiliarity. With experience you will find that its compactness makes it easier to see the overall structure of the code.

List comprehensions can also be used to filter the elements in a list. This is done using an if clause. For example, if using a for loop, selecting the values greater than 5 would be written as,

my_list = [1, 7, 4, 10, 12, 3, 4, 9]
new_list = []
for x in my_list:
    if x > 5:
        new_list.append(x)
print(new_list)

[7, 10, 12, 9]

Using a list comprehension, this can be written in a single line,

my_list = [1, 7, 4, 10, 12, 3, 4, 9]
new_list = [x for x in my_list if x > 5]
print(new_list)

[7, 10, 12, 9]

Or for finding the even values in a list,

my_list = [1, 7, 4, 10, 12, 3, 4, 9]
even_values = [x for x in my_list if x % 2 == 0]
print(even_values)

[4, 10, 12, 4]

If we wanted to construct the square of all the numbers greater than 5 we would use,

my_list = [1, 7, 4, 10, 12, 3, 4, 9]
new_list = [x**2 for x in my_list if x > 5]
print(new_list)

[49, 100, 144, 81]

Remember, in Python, a string is a sequence of characters, so strings can also be processed with list comprehensions. Note that the result is a list of single-character strings, not a string,

my_string = "The quick brown fox jumps over the lazy dog"
consonants = [x for x in my_string if x not in "aeiou "]
print(consonants)

['T', 'h', 'q', 'c', 'k', 'b', 'r', 'w', 'n', 'f', 'x', 'j', 'm', 'p', 's', 'v', 'r', 't', 'h', 'l', 'z', 'y', 'd', 'g']

my_string = "The quick brown fox jumps over the lazy dog"
upper = [x.upper() for x in my_string]
print(upper)

['T', 'H', 'E', ' ', 'Q', 'U', 'I', 'C', 'K', ' ', 'B', 'R', 'O', 'W', 'N', ' ', 'F', 'O', 'X', ' ', 'J', 'U', 'M', 'P', 'S', ' ', 'O', 'V', 'E', 'R', ' ', 'T', 'H', 'E', ' ', 'L', 'A', 'Z', 'Y', ' ', 'D', 'O', 'G']
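Because a list comprehension over a string produces a list rather than a string, a common follow-up step is to join the characters back together. The following is a minimal sketch using the built-in `str.join` method; the variable name `consonant_string` is illustrative only.

```python
my_string = "The quick brown fox jumps over the lazy dog"

# A comprehension over a string yields a list of one-character strings
consonants = [x for x in my_string if x not in "aeiou "]

# str.join concatenates the list elements back into a single string
consonant_string = "".join(consonants)
print(consonant_string)  # Thqckbrwnfxjmpsvrthlzydg
```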

List comprehensions can also be nested,

new_list = [(x, y, x * y) for x in range(4) for y in range(4)]
print(new_list)

[(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 3, 0), (1, 0, 0), (1, 1, 1), (1, 2, 2), (1, 3, 3), (2, 0, 0), (2, 1, 2), (2, 2, 4), (2, 3, 6), (3, 0, 0), (3, 1, 3), (3, 2, 6), (3, 3, 9)]
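To see how the two `for` clauses are ordered, it can help to write the comprehension out as ordinary nested loops. The sketch below (with illustrative variable names) shows that the first `for` clause corresponds to the outer loop and the second to the inner loop.

```python
# Comprehension with two 'for' clauses
pairs = [(x, y, x * y) for x in range(4) for y in range(4)]

# Equivalent nested loops: the first 'for' clause is the outer loop
pairs_loop = []
for x in range(4):
    for y in range(4):
        pairs_loop.append((x, y, x * y))

print(pairs == pairs_loop)  # True
```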

List comprehensions are not only more compact, they also run a little faster than the standard loop-based approach. Compare,

def proc1():
    my_list = []
    for x in range(1000000):
        my_list.append(x * x)


import timeit
print(timeit.Timer(proc1).timeit(number=1))

0.05439621900001157

def proc2():
    my_list = [x * x for x in range(1000000)]


import timeit
print(timeit.Timer(proc2).timeit(number=1))

0.04986810599996261

In the runs shown above, the for-loop version took about 54 ms whereas the list comprehension took about 50 ms, i.e., a decrease of roughly 8%. The exact figures will vary with the machine and the Python version.
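A single timed run can be noisy, so a comparison like this is usually more trustworthy when each measurement is repeated. The sketch below is one possible way of doing that with `timeit.repeat`, keeping the fastest of several runs. The function names and the problem size are illustrative choices, and no particular timings are implied.

```python
import timeit


def loop_version():
    result = []
    for x in range(10_000):
        result.append(x * x)
    return result


def comprehension_version():
    return [x * x for x in range(10_000)]


# Repeat each measurement 5 times (10 calls per measurement) and keep the fastest
loop_time = min(timeit.repeat(loop_version, number=10, repeat=5))
comp_time = min(timeit.repeat(comprehension_version, number=10, repeat=5))
print(f"loop: {loop_time:.4f} s, comprehension: {comp_time:.4f} s")
```

Taking the minimum rather than the mean reduces the influence of other processes that happen to be running during the measurement.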

In general, if you have a function that generates a list, or which processes a list to generate a new list, always look first to see if it can be written as a list comprehension. For loops with small bodies are often best written as list comprehensions. Sometimes a for loop with a large body can be broken down into a sequence of list comprehensions. However, some judgement is required here. Nested list comprehensions that perform complex operations can be hard to read and should be avoided.
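As an illustration of breaking a loop body into a sequence of list comprehensions, the sketch below cleans a list of strings in three small steps. The data and variable names are hypothetical.

```python
raw_values = [" 12", "7 ", "", "  3", "40 ", "   "]

# Step 1: remove surrounding whitespace from every entry
stripped = [s.strip() for s in raw_values]
# Step 2: drop entries that are now empty
non_empty = [s for s in stripped if s]
# Step 3: convert the remaining strings to integers
numbers = [int(s) for s in non_empty]
print(numbers)  # [12, 7, 3, 40]
```

Each step does one thing and can be inspected on its own, which tends to be easier to read than a single comprehension that strips, filters and converts all at once.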

Note, list comprehensions generate lists. There is a similar syntax for generating sets and dictionaries, i.e., set and dictionary comprehensions.

# Modulus 3 for numbers 1 to 10
my_list = [x % 3 for x in range(1, 11)]
my_set = {x % 3 for x in range(1, 11)}
my_dictionary = {x: x % 3 for x in range(1, 11)}
print(my_list)
print(my_set)
print(my_dictionary)

[1, 2, 0, 1, 2, 0, 1, 2, 0, 1]
{0, 1, 2}
{1: 1, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0, 7: 1, 8: 2, 9: 0, 10: 1}
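Set and dictionary comprehensions accept the same `if` filter as list comprehensions. A small sketch, using an illustrative list of words:

```python
words = "the quick brown fox jumps over the lazy dog".split()

# Dictionary comprehension with a filter: word -> length, for words longer than 3 letters
long_word_lengths = {w: len(w) for w in words if len(w) > 3}
print(long_word_lengths)

# Set comprehension: the distinct word lengths that occur
distinct_lengths = {len(w) for w in words}
print(sorted(distinct_lengths))  # [3, 4, 5]
```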

2. The `enumerate` function

enumerate is a built-in function that takes any iterable, such as a list, and returns an iterator of tuples in which the first element of each tuple is an index and the second is the corresponding item, i.e., it adds an index to the elements of a list. This sounds quite obscure but it is often very useful. It allows us to rewrite some very common for-loop idioms in a more compact form. Examples will make this clear.

For example consider the bit of code below,

index = 0
for x in "ABCDEFG":
    print((index, x))
    index += 1

(0, 'A')
(1, 'B')
(2, 'C')
(3, 'D')
(4, 'E')
(5, 'F')
(6, 'G')

Using enumerate this can be written far more easily as,

for x in enumerate("ABCDEFG"):
    print(x)

(0, 'A')
(1, 'B')
(2, 'C')
(3, 'D')
(4, 'E')
(5, 'F')
(6, 'G')

If you want the index and the list element as separate variables then that is easy too,

# Print 1 'A' followed by 2 'B's followed by 3 'C's etc
for index, x in enumerate("ABCDEFG"):
    print(x * (index + 1))

A
BB
CCC
DDDD
EEEEE
FFFFFF
GGGGGGG
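In the example above we had to write `index + 1` because the count starts at 0. `enumerate` takes an optional `start` argument that removes the need for this. The sketch below uses hypothetical data to show it in a loop and in a list comprehension.

```python
runners = ["Asha", "Ben", "Chloe"]  # hypothetical data

# start=1 makes the count begin at 1 instead of the default 0
for position, name in enumerate(runners, start=1):
    print(f"{position}. {name}")

# enumerate also combines naturally with a list comprehension
labels = [f"{position}. {name}" for position, name in enumerate(runners, start=1)]
```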

3. Passing functions as parameters to other functions

Python functions can be stored in variables and passed as parameters to other functions. This allows us to use some very powerful programming techniques.

For example, below we have a function called tabulate which tabulates the values of an arbitrary function. The function to be tabulated is passed as the first parameter. We have also written functions called square, cube and times7. We can now pass any of these functions to the tabulate function,

def tabulate(f, a, b):
    """Tabulate the function f between integer values a and b"""
    for i in range(a, b + 1):
        fi = f(i)
        # An 'f-string': a string which embeds formatted variables
        print(f"f({i}) = {fi}")


def square(x):
    return x * x


def cube(x):
    return x * x * x


def times7(x):
    return 7 * x

print("cubes from 10 to 20")
tabulate(cube, 10, 20)
print("The seven times table")
tabulate(times7, 1, 12)

cubes from 10 to 20
f(10) = 1000
f(11) = 1331
f(12) = 1728
f(13) = 2197
f(14) = 2744
f(15) = 3375
f(16) = 4096
f(17) = 4913
f(18) = 5832
f(19) = 6859
f(20) = 8000
The seven times table
f(1) = 7
f(2) = 14
f(3) = 21
f(4) = 28
f(5) = 35
f(6) = 42
f(7) = 49
f(8) = 56
f(9) = 63
f(10) = 70
f(11) = 77
f(12) = 84

Note, we could also store the function we wish to tabulate in a variable, say f, and then pass f to the tabulate function,

f = cube
tabulate(f, 10, 20)

f(10) = 1000
f(11) = 1331
f(12) = 1728
f(13) = 2197
f(14) = 2744
f(15) = 3375
f(16) = 4096
f(17) = 4913
f(18) = 5832
f(19) = 6859
f(20) = 8000

Functions behave just like any other type and can therefore be stored in lists, tuples, dictionaries, etc. For example, in the cell below, the for-loop is iterating over a sequence of functions stored in a tuple,

for f in square, cube, times7:
    print(f)
    tabulate(f, 1, 5)

<function square at 0x7f95784309a0>
f(1) = 1
f(2) = 4
f(3) = 9
f(4) = 16
f(5) = 25
<function cube at 0x7f9578431300>
f(1) = 1
f(2) = 8
f(3) = 27
f(4) = 64
f(5) = 125
<function times7 at 0x7f95784313a0>
f(1) = 7
f(2) = 14
f(3) = 21
f(4) = 28
f(5) = 35
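Functions can be stored in a dictionary in the same way, which gives a simple way of selecting a function by name. The following is a minimal sketch; the dictionary `operations` and the variable `choice` are illustrative, and `square` and `cube` are redefined so that the example is self-contained.

```python
def square(x):
    return x * x


def cube(x):
    return x * x * x


# Map a name to each function; the functions are stored, not called
operations = {"square": square, "cube": cube}

choice = "cube"  # hypothetical: this could come from user input
result = operations[choice](3)  # look up the function, then call it
print(result)  # 27
```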

4. Inner functions

A function can be defined inside another function. These functions are called inner functions. An inner function will not be visible from outside the outer function in which it has been defined. This can be useful for hiding helper functions that are only used inside the outer function and which we don’t want to be used more widely.

For example,

def display_square(x):  # outer function
    def square(x):  # inner function
        return x * x

    xx = square(x)
    print("The square of", x, "is", xx)


display_square(10)

The square of 10 is 100

Beginners often over-use inner functions when they find out about them. The use cases for them are actually quite specialised and they have the downside of making code harder to read because of the additional nesting. They should be used sparingly.

There is a good discussion of the use cases for inner functions here https://realpython.com/inner-functions-what-are-they-good-for/, which includes the following example,

def factorial(number):

    # Error handling
    if not isinstance(number, int):
        raise TypeError("Sorry. 'number' must be an integer.")
    if not number >= 0:
        raise ValueError("Sorry. 'number' must be zero or positive.")

    def inner_factorial(number):
        if number <= 1:
            return 1
        return number * inner_factorial(number - 1)

    return inner_factorial(number)

Call the outer function,

print(factorial(4))

24

See how the error checking reports an error if the user tries to compute the factorial of a negative integer,

print(factorial(-4))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[23], line 1
----> 1 print(factorial(-4))

Cell In[21], line 7, in factorial(number)
      5     raise TypeError("Sorry. 'number' must be an integer.")
      6 if not number >= 0:
----> 7     raise ValueError("Sorry. 'number' must be zero or positive.")
      9 def inner_factorial(number):
     10     if number <= 1:

ValueError: Sorry. 'number' must be zero or positive.

The advantage of using an inner function here is that the argument checking is performed once in the outer function and can then be safely skipped in the inner function.

Note, inner functions inherit the scope of the outer function, i.e., variables declared in the outer function will be visible to the inner function even if they are not passed as parameters.

def outer(a):
    b = 20

    def inner(c):
        # c has been passed as a parameter, a and b are seen from the outer scope
        print(a, b, c)

    inner(30)


outer(10)

10 20 30

This is very convenient but can also be a source of bugs if you are not careful.
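One common pitfall is that an inner function can read an outer variable freely, but assigning to the same name inside the inner function creates a new local variable instead of changing the outer one. The sketch below illustrates this and shows the `nonlocal` statement, which asks Python to rebind the outer variable. All function and variable names here are illustrative.

```python
def scope_demo():
    total = 10

    def shadowing_inner():
        total = 99  # creates a NEW local variable; the outer 'total' is untouched
        return total

    def rebinding_inner():
        nonlocal total  # explicitly rebind the variable in the outer function
        total = 99
        return total

    inner_result = shadowing_inner()
    after_shadowing = total  # still 10
    rebinding_inner()
    after_rebinding = total  # now 99
    return inner_result, after_shadowing, after_rebinding


print(scope_demo())  # (99, 10, 99)
```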

5. Lambda functions

A lambda function is a function that is defined without a name. Lambda functions are also called anonymous functions. They are useful for writing short functions that are only used once, and they are often used as parameters to other functions.

For example, consider the tabulate function defined above. Say we wanted to tabulate the function \(x^2 + 2 x + 10\). We could do this by first defining a function and then passing that function to `tabulate`, as follows,

def my_function(x):
    return x * x + 2 * x + 10

tabulate(my_function, 10, 20)

f(10) = 130
f(11) = 153
f(12) = 178
f(13) = 205
f(14) = 234
f(15) = 265
f(16) = 298
f(17) = 333
f(18) = 370
f(19) = 409
f(20) = 450

This works fine but if my_function is only being used in this one place then it would be more convenient to define it as a lambda function.

The syntax for defining a lambda function is

lambda <parameters>: <expression>

So for our example this would look like,

lambda x: x * x + 2 * x + 10

<function __main__.<lambda>(x)>

We could store this in a variable and then pass it to tabulate,

f = lambda x: x * x + 2 * x + 10
tabulate(f, 10, 20)

f(10) = 130
f(11) = 153
f(12) = 178
f(13) = 205
f(14) = 234
f(15) = 265
f(16) = 298
f(17) = 333
f(18) = 370
f(19) = 409
f(20) = 450

This is a bit like giving the lambda function a name, i.e., f. It is more common to just pass the lambda function directly to the function that is going to use it, as follows,

tabulate(lambda x: x * x + 2 * x + 10, 10, 20)

f(10) = 130
f(11) = 153
f(12) = 178
f(13) = 205
f(14) = 234
f(15) = 265
f(16) = 298
f(17) = 333
f(18) = 370
f(19) = 409
f(20) = 450

Compare this to the original version where we used def to explicitly define and name the function before passing it to tabulate. The lambda function version is more compact and arguably easier to read. It has the advantage that the function definition is right next to the function call that is using it, which makes it easier to see what is going on. Also, the function does not remain in scope after the call to tabulate has finished, i.e., we can be sure that no-one else will be trying to use it elsewhere in the code. This reduces the burden on code testing and code maintenance, and ultimately reduces the chance of bugs.
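The same pattern appears with many built-in functions that accept a function as a parameter. As a minimal sketch, the built-in `sorted` takes a `key` function that returns the value to sort by, and a lambda is a convenient way to supply it. The data below is hypothetical.

```python
words = ["banana", "fig", "cherry", "kiwi"]

# key= expects a function; this lambda returns the last letter of each word
by_last_letter = sorted(words, key=lambda w: w[-1])
print(by_last_letter)  # ['banana', 'fig', 'kiwi', 'cherry']

# Sort (name, score) pairs by score, highest first
scores = [("Asha", 72), ("Ben", 88), ("Chloe", 65)]
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
print(ranked)  # [('Ben', 88), ('Asha', 72), ('Chloe', 65)]
```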