Posted on March 29, 2019 at 11:00 PM
In this tutorial we look into list comprehensions, dictionary comprehensions and generators in Python.
Comprehensions let you write a for loop that builds an object, such as a list or dictionary, in a single line of code. For example, say we have a list of numbers and we want a list of those numbers squared. We could write a for loop and append to a new list, like so:
>>> lst = [1,2,3,4,5,6,7,8,10]
>>> squared_lst = []
>>> for no in lst:
...     squared_no = no * no
...     squared_lst.append(squared_no)
...
>>> squared_lst
[1, 4, 9, 16, 25, 36, 49, 64, 100]
Using a comprehension (here, a list comprehension), we can write all of the above code on a single line, which is quicker to read and typically runs a little faster.
>>> lst = [1,2,3,4,5,6,7,8,10]
>>> squared_lst = [no*no for no in lst]
>>> squared_lst
[1, 4, 9, 16, 25, 36, 49, 64, 100]
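To check the speed claim on your own machine, a rough sketch using the standard library's timeit module is shown here. The function names square_with_loop and square_with_comprehension are made up for this illustration, and the timings will vary with your hardware and Python version, so treat the result as indicative only.

```python
import timeit

lst = [1, 2, 3, 4, 5, 6, 7, 8, 10]


def square_with_loop(numbers):
    """Build the squared list with an explicit for loop (hypothetical helper)."""
    result = []
    for no in numbers:
        result.append(no * no)
    return result


def square_with_comprehension(numbers):
    """Build the same list with a list comprehension (hypothetical helper)."""
    return [no * no for no in numbers]


# Both approaches produce the same list.
loop_result = square_with_loop(lst)
comprehension_result = square_with_comprehension(lst)

# Run each version 10,000 times; the totals are in seconds.
loop_time = timeit.timeit(lambda: square_with_loop(lst), number=10000)
comprehension_time = timeit.timeit(lambda: square_with_comprehension(lst), number=10000)
print(f"loop: {loop_time:.4f}s, comprehension: {comprehension_time:.4f}s")
```

On a list this small the difference is tiny, so readability is usually the stronger reason to prefer the comprehension.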
The basic premise of a list comprehension is to create a list using a single line of code. This is especially useful when combined with a conditional statement, for example taking one list and creating a new list with only the data that is required. The syntax for a list comprehension is ['expression output' for iterator in 'object'], like so:
>>> list_of_numbers = [no for no in range(50)]
>>> list_of_numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
As noted above, you can also add a condition to a list comprehension as a way to filter data. For example, below we create a new list that only includes a number if it is divisible by 5. Note the syntax is ['expression output' for iterator in 'object' if 'condition'].
>>> numbers = list(range(50))
>>> numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
>>> divisable_by_5 = [no for no in numbers if no % 5 == 0]
>>> divisable_by_5
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45]
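To use an if/else condition, the syntax changes to ['output if true' if 'condition' else 'output if false' for iterator in 'object']. Unlike the filtering form, every item is kept; the condition only decides which value is output for it.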
>>> numbers = list(range(50))
>>> numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
>>> divisable_by_5 = [no if no % 5 == 0 else 0 for no in numbers]
>>> divisable_by_5
[0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 10, 0, 0, 0, 0, 15, 0, 0, 0, 0, 20, 0, 0, 0, 0, 25, 0, 0, 0, 0, 30, 0, 0, 0, 0, 35, 0, 0, 0, 0, 40, 0, 0, 0, 0, 45, 0, 0, 0, 0]
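The two conditional forms are easy to mix up, so here is a small side-by-side sketch on a shorter range. The variable names filtered, mapped and labels are chosen only for this illustration. The last line shows that the two forms can also be combined in one comprehension.

```python
numbers = list(range(10))

# Trailing `if` filters: items that fail the test are dropped.
filtered = [no for no in numbers if no % 5 == 0]

# Leading `if/else` maps: every item is kept, but its value may change.
mapped = [no if no % 5 == 0 else 0 for no in numbers]

# Combined: the trailing `if` filters first, then if/else picks the output.
labels = ['even' if no % 2 == 0 else 'odd' for no in numbers if no < 4]

print(filtered)  # [0, 5]
print(mapped)    # [0, 0, 0, 0, 0, 5, 0, 0, 0, 0]
print(labels)    # ['even', 'odd', 'even', 'odd']
```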
Lastly, the output expression is flexible, so you can do arithmetic such as addition or multiplication, apply a built-in function like len(), or call a function you have written yourself.
# using arithmetic (multiplication)
>>> lst = [1, 2, 3, 4, 5, 6, 7, 8, 10]
>>> squared_lst = [no*no for no in lst]
>>> squared_lst
[1, 4, 9, 16, 25, 36, 49, 64, 100]
# using a built-in function
>>> names = ['charlie', 'scott', 'david', 'graihagh']
>>> len_of_names = [len(name) for name in names]
>>> len_of_names
[7, 5, 5, 8]
# using a user-defined function
>>> def append_names(name):
...     full_name = name + ' jackson'
...     return full_name
...
>>> full_name = [append_names(name) for name in names]
>>> full_name
['charlie jackson', 'scott jackson', 'david jackson', 'graihagh jackson']
As with nested for loops, you can also nest list comprehensions. Because there are two loops you typically have two outputs, which you can multiply together, pass to a function or return as a tuple. The syntax is [(output1, output2) for iterator1 in object1 for iterator2 in object2]. For example, say we have two 6-sided dice and we want to know the chance (expressed as a percentage) of throwing higher than 10.
>>> possible_throws = [(throw1, throw2) for throw1 in range(1,7) for throw2 in range(1,7) if throw1 + throw2 > 10]
>>> possible_throws
[(5, 6), (6, 5), (6, 6)]
>>> (len(possible_throws) / (6*6)) * 100
8.333333333333332
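It can help to see the nested comprehension unrolled into ordinary loops. The sketch below assumes the same dice example; the name comprehension_throws and the use of fractions.Fraction for an exact probability are additions made for this illustration. The first `for` in the comprehension corresponds to the outer loop.

```python
from fractions import Fraction

# Equivalent nested for loops: the first `for` is the outer loop.
possible_throws = []
for throw1 in range(1, 7):
    for throw2 in range(1, 7):
        if throw1 + throw2 > 10:
            possible_throws.append((throw1, throw2))

# The same logic as a nested list comprehension, split over lines for readability.
comprehension_throws = [
    (throw1, throw2)
    for throw1 in range(1, 7)
    for throw2 in range(1, 7)
    if throw1 + throw2 > 10
]

# Exact probability as a fraction, which avoids floating-point rounding.
probability = Fraction(len(possible_throws), 6 * 6)

print(possible_throws)  # [(5, 6), (6, 5), (6, 6)]
print(probability)      # 1/12
```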
Dictionary comprehensions are much like list comprehensions (except they return a dictionary), so I will not go into the same amount of detail beyond the syntax. The main differences are that you use curly braces {} instead of square brackets [] and you must specify both the 'key' and the 'value'; the syntax is {'key' : 'value' for 'iterator' in 'object'}. Conditional logic follows the same syntax and rules as in a list comprehension.
>>> names = ['charlie', 'scott', 'david', 'graihagh']
>>> names_dict = {name : len(name) for name in names}
>>> names_dict
{'charlie': 7, 'scott': 5, 'david': 5, 'graihagh': 8}
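Since conditional logic works the same way in a dictionary comprehension, here is a brief sketch of both forms using the same list of names. The variable names long_names and name_sizes, and the cut-off of 5 characters, are arbitrary choices for this illustration.

```python
names = ['charlie', 'scott', 'david', 'graihagh']

# Trailing `if` filters which keys make it into the dictionary.
long_names = {name: len(name) for name in names if len(name) > 5}

# An if/else expression on the value side keeps every key but varies the value.
name_sizes = {name: 'long' if len(name) > 5 else 'short' for name in names}

print(long_names)  # {'charlie': 7, 'graihagh': 8}
print(name_sizes)  # {'charlie': 'long', 'scott': 'short', 'david': 'short', 'graihagh': 'long'}
```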
Generator expressions are closely related to list and dictionary comprehensions, but the big difference is that a generator expression returns a generator object. Where a list or dictionary holds all of its items in memory, a generator does not: each value is produced on the fly when it is needed. This is also known as lazy evaluation. The syntax for generator expressions is similar to that of list and dictionary comprehensions, except this time we use parentheses (): ('expression output' for iterator in 'object').
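To make lazy evaluation a little more concrete, the sketch below pulls values out of a small generator one at a time with the built-in next() function. The variable names are made up for this illustration.

```python
# Nothing is computed when the generator expression is created.
squares = (no * no for no in range(3))

# Each call to next() computes and returns exactly one value.
first = next(squares)
second = next(squares)

# list() consumes whatever is left; here only the last value remains.
remaining = list(squares)

# Once exhausted, next() returns the supplied default instead of raising StopIteration.
after_end = next(squares, None)

print(first, second, remaining, after_end)  # 0 1 [4] None
```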
Do not run the list comprehension version of this code. In the example below we compare a list comprehension with a generator expression. Because a list is held entirely in memory, the list we would create is so large that it would exhaust the computer's memory. The generator, however, produces its values on the fly, so it has no problem with the statement and shows a generator object when evaluated.
# list comprehension - do not run
>>> # large_number = [num for num in range(10 ** 10000)]
# generator expression
>>> large_list = (num for num in range(10 ** 10000))
>>> large_list
<generator object <genexpr> at 0x03C193F0>
>>> for no in large_list:
...     if no > 5:
...         break
...     print(no)
...
0
1
2
3
4
5
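For a safer way to see the memory difference, the following sketch compares a list and a generator over a much smaller range using sys.getsizeof. The range of 100,000 and the variable names are assumptions chosen for this illustration, and the exact byte counts depend on your Python version and platform.

```python
import sys

# A list comprehension stores a reference to every element up front.
squares_list = [num * num for num in range(100000)]

# A generator expression only keeps enough state to produce the next value.
squares_gen = (num * num for num in range(100000))

# getsizeof measures the container itself, not the integers it refers to,
# but the gap between the two is still large.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# Functions such as sum() can consume a generator directly,
# so the full list never needs to exist.
total = sum(squares_gen)
print(total)
```

If you need to loop over the values more than once or index into them, a list is still the better fit; a generator suits a single pass over a large or unbounded sequence.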