An introduction to research programming with Python

Comprehensions

The list comprehension

If you write a for loop inside a pair of square brackets, Python builds the list that the loop describes. This can make for concise but hard-to-read code, so be careful.

In [1]:
result = []
for x in range(10):
    result.append(2 ** x)

print(result)
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

is the same as

In [2]:
[2 ** x for x in range(10)]
Out[2]:
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

You can do quite weird and cool things with comprehensions:

In [3]:
[len(str(2 ** x)) for x in range(20)]
Out[3]:
[1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6]

You can write an if statement in comprehensions too:

In [4]:
[2 ** x for x in range(30) if x % 3 == 0]
Out[4]:
[1, 8, 64, 512, 4096, 32768, 262144, 2097152, 16777216, 134217728]

Consider the following, and make sure you understand why it works:

In [5]:
"".join([letter for letter in "James Hetherington" if letter.lower() not in "aeiou"])
Out[5]:
'Jms Hthrngtn'
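
One way to see why this works is to unpack the one-liner into explicit steps. The sketch below is only an illustration; the names `name`, `vowels`, `consonants` and `devowelled` are introduced here and are not part of the original example.

```python
name = "James Hetherington"
vowels = "aeiou"

# Step 1: iterating over a string yields its characters one at a time.
# Step 2: the `if` clause keeps only characters whose lowercase form
#         is not a vowel. The space survives, since " " is not in "aeiou".
consonants = [letter for letter in name if letter.lower() not in vowels]
print(consonants)

# Step 3: "".join(...) glues the remaining characters back together,
#         using the empty string as the separator.
devowelled = "".join(consonants)
print(devowelled)
```

Calling `letter.lower()` before the membership test means that capital vowels would be dropped as well, even though `vowels` only lists lowercase letters.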

Comprehensions versus building lists with append:

This code:

In [6]:
result = []
for x in range(30):
    if x % 3 == 0:
        result.append(2 ** x)
print(result)
[1, 8, 64, 512, 4096, 32768, 262144, 2097152, 16777216, 134217728]

does the same as the comprehension above. The comprehension is generally considered more readable.

Comprehensions are therefore an example of what we call 'syntactic sugar': they do not increase the capabilities of the language.

Instead, they make it possible to write the same thing in a more readable way.

Everything we learn from now on will be either syntactic sugar or interaction with something other than idealised memory, such as a storage device or the internet. Once you have variables, conditionals, and loops, your language can in principle compute anything that is computable. (And this can be proved.)

Nested comprehensions

If you write two for statements in a comprehension, you get a single list generated over all the pairs:

In [7]:
[x - y for x in range(4) for y in range(4)]
Out[7]:
[0, -1, -2, -3, 1, 0, -1, -2, 2, 1, 0, -1, 3, 2, 1, 0]

You can select on either, or on some combination:

In [8]:
[x - y for x in range(4) for y in range(4) if x >= y]
Out[8]:
[0, 1, 0, 2, 1, 0, 3, 2, 1, 0]
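
It may help to read a comprehension with several `for` clauses as nested loops written left to right, with any `if` clause innermost. A rough sketch of the loop equivalent of the filtered comprehension above (the name `differences` is just for illustration):

```python
differences = []
for x in range(4):          # first `for` clause: outer loop
    for y in range(4):      # second `for` clause: inner loop
        if x >= y:          # trailing `if`: filter applied to each pair
            differences.append(x - y)

print(differences)

# The comprehension form builds the same list.
assert differences == [x - y for x in range(4) for y in range(4) if x >= y]
```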

If you want something more like a matrix, you need to do two nested comprehensions!

In [9]:
[[x - y for x in range(4)] for y in range(4)]
Out[9]:
[[0, 1, 2, 3], [-1, 0, 1, 2], [-2, -1, 0, 1], [-3, -2, -1, 0]]

Note the subtly different square brackets.

Note that the list order for multiple or nested comprehensions can be confusing:

In [10]:
[x + y for x in ["a", "b", "c"] for y in ["1", "2", "3"]]
Out[10]:
['a1', 'a2', 'a3', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3']
In [11]:
[[x + y for x in ["a", "b", "c"]] for y in ["1", "2", "3"]]
Out[11]:
[['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'], ['a3', 'b3', 'c3']]
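
The ordering is perhaps easier to follow once the nested form is written out as loops. When one comprehension sits inside another, the outer square brackets correspond to the outer loop, so here `y` varies slowest even though it appears last. A sketch, with `grid` and `row` as illustrative names:

```python
grid = []
for y in ["1", "2", "3"]:          # outer brackets: one row per y
    row = []
    for x in ["a", "b", "c"]:      # inner brackets: one entry per x
        row.append(x + y)
    grid.append(row)

print(grid)
```

By contrast, in the flat form with two `for` clauses the leftmost clause is the outer loop, which is why `x` varies slowest there.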

Dictionary Comprehensions

You can automatically build dictionaries by using a list comprehension syntax, but with curly brackets and a colon:

In [12]:
{((str(x)) * 3): x for x in range(3)}
Out[12]:
{'000': 0, '111': 1, '222': 2}
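
As with lists, a dictionary comprehension can be read as a loop that fills an empty dictionary, and it accepts an `if` clause in the same way. A small sketch (the names `triples` and `odd_cubes` are made up for this example):

```python
# Loop equivalent of {(str(x) * 3): x for x in range(3)}
triples = {}
for x in range(3):
    # The key expression comes before the colon, the value after it.
    triples[str(x) * 3] = x

print(triples)

# An `if` clause filters entries, just as in a list comprehension.
odd_cubes = {x: x ** 3 for x in range(6) if x % 2 == 1}
print(odd_cubes)
```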

List-based thinking

Once you start to get comfortable with comprehensions, you find yourself working with containers, nested groups of lists and dictionaries, as the 'things' in your program, not individual variables.

Given a way to analyse some dataset, we'll find ourselves writing stuff like:

analysed_data = [analyse(datum) for datum in data]

or

analysed_data = map(analyse, data)

There are lots of built-in functions that provide actions on lists as a whole:

In [13]:
any([True, False, True])
Out[13]:
True
In [14]:
all([True, False, True])
Out[14]:
False
In [15]:
max([1, 2, 3])
Out[15]:
3
In [16]:
sum([1, 2, 3])
Out[16]:
6
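
These functions combine naturally with comprehensions: build a list of results, then reduce it to a single value. A brief sketch, using a made-up list called `measurements`:

```python
measurements = [3, 8, 12, 5]

# Is any measurement above 10?
any_large = any([m > 10 for m in measurements])
print(any_large)

# Are all measurements positive?
all_positive = all([m > 0 for m in measurements])
print(all_positive)

# Largest squared value, and the sum of the squares.
largest_square = max([m ** 2 for m in measurements])
sum_of_squares = sum([m ** 2 for m in measurements])
print(largest_square, sum_of_squares)
```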

One common function is map, which works like a simple list comprehension: it applies one function to every member of a list.

In [17]:
[str(x) for x in range(10)]
Out[17]:
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
In [18]:
map(str, range(10))
Out[18]:
<map at 0x7f65c83f08e0>

Its output is this strange-looking map object, which can be iterated over (with a for) or turned into a list:

In [19]:
for element in map(str, range(10)):
    print(element)
0
1
2
3
4
5
6
7
8
9
In [20]:
list(map(str, range(10)))
Out[20]:
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

So, to get a list back, we can write:

analysed_data = list(map(analyse, data))
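
As a concrete sketch of the two forms side by side, suppose `analyse` is a function of a single datum. The `analyse` and `data` below are placeholders invented for this illustration, not part of any real analysis. One thing worth knowing is that a map object can only be iterated over once, after which it is empty.

```python
def analyse(datum):
    """Placeholder analysis: square the datum (an assumption for this example)."""
    return datum ** 2

data = [1, 2, 3, 4]

# Both forms produce the same list.
by_comprehension = [analyse(datum) for datum in data]
by_map = list(map(analyse, data))
print(by_comprehension == by_map)

# A map object is consumed as it is iterated: a second pass yields nothing.
lazy = map(analyse, data)
first_pass = list(lazy)
second_pass = list(lazy)
print(first_pass, second_pass)
```

If the results are needed more than once, converting to a list straight away, as above, avoids surprises.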

Exercise: Occupancy Dictionary

Take your maze data structure. Write a program to print out a new dictionary, which gives, for each room's name, the number of people in it. Don't add in a zero value in the dictionary for empty rooms.

The output should look similar to:

In [21]:
{"bedroom": 1, "garden": 3, "kitchen": 1, "living": 2}
Out[21]:
{'bedroom': 1, 'garden': 3, 'kitchen': 1, 'living': 2}