If you write a for loop inside a pair of square brackets, you create a list built from the values the loop produces. This is called a list comprehension. It can make for concise but hard-to-read code, so be careful.
[2**x for x in range(10)]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
This is equivalent to the following code, which does not use a list comprehension:
result = []
for x in range(10):
    result.append(2**x)
print(result)
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
You can do quite weird and cool things with comprehensions:
[len(str(2**x)) for x in range(20)]
[1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6]
The example above takes each value of x from 0 to 19, raises 2 to the power x, converts the result to a string, and finally takes the length of that string, which is its number of digits.
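To see what each stage contributes, the same computation can be unrolled into separate steps. This is only an illustrative sketch; the names `powers`, `as_text` and `digit_counts` are made up here and do not appear in the original example.

```python
# Unroll [len(str(2**x)) for x in range(20)] one stage at a time.
powers = [2**x for x in range(20)]         # 1, 2, 4, ..., 524288
as_text = [str(p) for p in powers]         # '1', '2', '4', ..., '524288'
digit_counts = [len(s) for s in as_text]   # number of digits in each power

# The staged version gives the same list as the one-line comprehension.
print(digit_counts == [len(str(2**x)) for x in range(20)])  # True
```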
You can write an if clause in comprehensions too:
[2**x for x in range(30) if x % 3 == 0]
[1, 8, 64, 512, 4096, 32768, 262144, 2097152, 16777216, 134217728]
Consider the following, and make sure you understand why it works:
"".join([letter for letter in "James Hetherington" if letter.lower() not in 'aeiou'])
'Jms Hthrngtn'
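If it is not obvious why that works, it may help to write the same thing out as an explicit loop. The sketch below is one way to do it; the variable names `name`, `consonants` and `disemvowelled` are invented for illustration.

```python
name = "James Hetherington"

consonants = []
for letter in name:
    # Lower-case each letter before testing, so capital vowels are caught too.
    # The space is kept, because ' ' is not one of the characters in 'aeiou'.
    if letter.lower() not in 'aeiou':
        consonants.append(letter)

# "".join glues the surviving characters back into a single string.
disemvowelled = "".join(consonants)
print(disemvowelled)  # Jms Hthrngtn
```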
This code, which builds the list with append:
result = []
for x in range(30):
    if x % 3 == 0:
        result.append(2**x)
print(result)
[1, 8, 64, 512, 4096, 32768, 262144, 2097152, 16777216, 134217728]
This loop does the same as the earlier comprehension with the if clause, [2**x for x in range(30) if x % 3 == 0]. The comprehension is generally considered more readable.
Comprehensions are therefore an example of what we call ‘syntactic sugar’: they do not increase the capabilities of the language.
Instead, they make it possible to write the same thing in a more readable way.
Almost everything we learn from now on will be either syntactic sugar or interaction with something other than idealised memory, such as a storage device or the internet. Once you have variables, conditionals, and loops (or some other form of branching), your language can in principle compute anything that is computable. (And this can be proved.)
If you write two for clauses in a comprehension, you get a single list generated over all the pairs:
[x - y for x in range(4) for y in range(4)]
[0, -1, -2, -3, 1, 0, -1, -2, 2, 1, 0, -1, 3, 2, 1, 0]
You can select on either, or on some combination:
[x - y for x in range(4) for y in range(4) if x>=y]
[0, 1, 0, 2, 1, 0, 3, 2, 1, 0]
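Written as ordinary loops, the comprehension above looks roughly like the sketch below: the first for becomes the outer loop, the second the inner loop, and the if filters each pair. The name `differences` is made up for this illustration.

```python
differences = []
for x in range(4):        # first "for" in the comprehension: outer loop
    for y in range(4):    # second "for": inner loop, runs fully for each x
        if x >= y:        # trailing "if": keeps only the pairs we want
            differences.append(x - y)

print(differences)  # [0, 1, 0, 2, 1, 0, 3, 2, 1, 0]
```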
If you want something more like a matrix, you need to do two nested comprehensions!
[[x - y for x in range(4)] for y in range(4)]
[[0, 1, 2, 3], [-1, 0, 1, 2], [-2, -1, 0, 1], [-3, -2, -1, 0]]
Note the subtly different square brackets.
Note that the list order for multiple or nested comprehensions can be confusing:
[x+y for x in ['a','b','c'] for y in ['1','2','3']]
['a1', 'a2', 'a3', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3']
[[x+y for x in ['a','b','c']] for y in ['1','2','3']]
[['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'], ['a3', 'b3', 'c3']]
Hint: look very carefully at the positioning of []
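One way to untangle the ordering is to unroll both forms into plain loops. In the sketch below (the names `letters`, `digits`, `flat`, `nested` and `row` are invented for illustration), the flat form loops over x on the outside, while the nested form loops over y on the outside and builds one complete inner list per pass.

```python
letters = ['a', 'b', 'c']
digits = ['1', '2', '3']

# Flat form: [x+y for x in letters for y in digits]
# The leftmost "for" (x) is the outer loop.
flat = []
for x in letters:
    for y in digits:
        flat.append(x + y)

# Nested form: [[x+y for x in letters] for y in digits]
# The "for" outside the inner brackets (y) is the outer loop.
nested = []
for y in digits:
    row = []
    for x in letters:
        row.append(x + y)
    nested.append(row)

print(flat)
print(nested)
```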
You can build dictionaries automatically by using the list comprehension syntax, but with curly brackets and a colon separating each key from its value:
{ (str(x))*3: x for x in range(3) }
{'000': 0, '111': 1, '222': 2}
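As with lists, the dictionary comprehension is shorthand for a loop that fills a container one entry at a time. The sketch below shows that loop, followed by a second, made-up example (`even_squares`) to show that an if clause works here too.

```python
# Loop equivalent of { (str(x))*3: x for x in range(3) }
tripled = {}
for x in range(3):
    tripled[str(x) * 3] = x   # key: the digit repeated three times
print(tripled)  # {'000': 0, '111': 1, '222': 2}

# An "if" clause filters entries just as it does in a list comprehension.
even_squares = {x: x**2 for x in range(6) if x % 2 == 0}
print(even_squares)  # {0: 0, 2: 4, 4: 16}
```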
Once you start to get comfortable with comprehensions, you find yourself working with containers, nested groups of lists and dictionaries, as the ‘things’ in your program, not individual variables.
Given a way to analyse some dataset, we’ll find ourselves writing stuff like:
analysed_data = [analyse(datum) for datum in data]
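To make that pattern concrete, here is a small runnable sketch. Both the `analyse` function and the `data` values are hypothetical stand-ins, chosen only so that the line above has something to run on.

```python
def analyse(datum):
    """Hypothetical analysis step: the mean of one record."""
    return sum(datum) / len(datum)

# Made-up dataset: a list of records, each a list of measurements.
data = [[1.0, 2.0, 3.0], [4.0, 6.0], [10.0]]

analysed_data = [analyse(datum) for datum in data]
print(analysed_data)  # [2.0, 5.0, 10.0]
```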
There are lots of built-in functions that act on lists as a whole:
any([True, False, True])
True
all([True, False, True])
False
max([1, 2, 3])
3
sum([1, 2, 3])
6
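These functions combine naturally with comprehensions. The sketch below uses a made-up list called `values`; it also shows that the square brackets can be dropped when the comprehension is the only argument to a function, which passes a generator expression instead of building an intermediate list.

```python
values = [3, 8, 15, 22]   # made-up data for illustration

# Is at least one value even? Are all of the values positive?
has_even = any([v % 2 == 0 for v in values])
all_positive = all([v > 0 for v in values])

# Without the square brackets this is a generator expression:
# sum consumes the squares one at a time, with no temporary list.
total_of_squares = sum(v**2 for v in values)

print(has_even, all_positive, total_of_squares)  # True True 782
```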
One common function is map, which does the same job as a simple list comprehension that applies one function to every member of a list.
In Python 3, map returns a lazy iterator rather than a list, so to access the individual items we wrap the result in list.
[str(x) for x in range(10)]
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
list(map(str, range(10)))
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
So I can write:
analysed_data = map(analyse, data)
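One thing to watch for: in Python 3 the result of map is an iterator that can only be consumed once. The sketch below illustrates this with a hypothetical `analyse` function and made-up `data`; if you need the results more than once, convert them to a list straight away.

```python
def analyse(datum):
    """Hypothetical analysis step, used only for illustration."""
    return datum * 2

data = [1, 2, 3]   # made-up data

analysed_data = map(analyse, data)
first_pass = list(analysed_data)    # consumes the iterator
second_pass = list(analysed_data)   # nothing is left to consume

print(first_pass)   # [2, 4, 6]
print(second_pass)  # []
```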