Posted on March 28, 2019 at 13:00 PM
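In this tutorial we will look into Python functions in depth. It will start with the basics and gradually get more complex. Topics covered include function basics, nested functions, scope, *args and **kwargs, lambdas with map() and filter(), and a final worked example.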
Before diving into creating functions, it's worth thinking about what a function actually is. In essence, it's a block of code that does a very specific task. So instead of creating a long script that does lots of things, you want to break it up into lots of functions that each do a specific job simply and efficiently.
In its simplest form, you can create a function with no parameters.
>>> def hello():
        print('hello')
>>> hello()
hello
The code above simply prints out 'hello'. You can also 'return' some data and store it in a variable, so that you can use what the function returned later on in your program.
>>> def hello():
        return 'hello'
>>> greeting = hello()
>>> greeting
'hello'
Moving on from the simplest function, we can create a function which takes parameters and uses the data that is passed to it.
>>> def hello(name):
        return 'hello ' + name + ' you ledge'
>>> hello('james')
'hello james you ledge'
>>> hello('scott')
'hello scott you ledge'
When you create a function you can use as many parameters as you want. In addition you can also return as much data as required.
>>> def hello(fname, lname):
        nice_greeting = 'hello ' + fname + ' ' + lname + ' you ledge'
        unpleasant_greeting = 'hello ' + fname + ' ' + lname + ' you idiot'
        responses = (nice_greeting, unpleasant_greeting) # this creates a tuple named responses
        return responses
>>> nice, unpleasant = hello('james', 'bond')
>>> nice
'hello james bond you ledge'
>>> unpleasant
'hello james bond you idiot'
Lastly, you can give parameters default values, which are used if nothing is passed for them when the function is called.
>>> def hello(fname, repeat=5):
        name_repeat = fname * repeat
        return name_repeat
>>> hello('james')
'jamesjamesjamesjamesjames'
>>> hello('james', repeat=10)
'jamesjamesjamesjamesjamesjamesjamesjamesjamesjames'
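As a rough sketch of how default values interact with the different ways of calling a function, the snippet below uses a hypothetical function `repeat_name` with an extra `sep` parameter. Neither name comes from the tutorial; they are assumptions made purely for illustration.

```python
def repeat_name(fname, repeat=5, sep=''):
    """Repeat fname `repeat` times, joined by `sep` (sep is an illustrative extra)."""
    return sep.join([fname] * repeat)

# No second argument: the default repeat=5 is used.
default_call = repeat_name('james')
# The default can be overridden positionally...
positional_call = repeat_name('james', 2)
# ...or by keyword, in which case the order of the keywords does not matter.
keyword_call = repeat_name('james', sep='-', repeat=3)

print(default_call)
print(positional_call)
print(keyword_call)
```

Parameters with defaults have to come after the parameters without them in the function definition.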
Nesting a function means defining one function inside another function.
>>> def hello(fname, lname):
        def nice(name):
            return "alright " + name + " you ledge"
        return nice(fname + " " + lname)
>>> hello('james', 'bond')
'alright james bond you ledge'
When functions start getting nested and/or require data that is not within the function itself, we are going to hit problems with scope. Scope essentially means where a variable can be accessed from. Python will look within each scope in the order: local, enclosing, global and then built-in, abbreviated to LEGB.
Where a variable is in the main body of the program it can be accessed from anywhere, including from within functions. In the example below, we can see the variable 'x' is created in the main body of the script and then used within the function 'add()'.
>>> x = 10
>>> def add(number):
        add_numbers = number + x
        return add_numbers
>>> add(5)
15
In the function above, the program first looked in the local scope (i.e. within the function) and then in the global scope to find the variable 'x'. Because the local scope is searched first, a local 'x' takes precedence over the global one, as the following example shows; note that the global 'x' is left unchanged.
>>> x = 50
>>> def add(number):
        x = 10
        add_numbers = number + x
        return add_numbers
>>> add(5)
15
>>> x
50
As noted above, a variable in the local scope is accessible only from within the function. Using the variable outside of the function will raise a 'NameError'.
>>> def local_variable():
        y = 10
        return y
>>> y + 10
Traceback (most recent call last):
  File "<pyshell#111>", line 1, in <module>
    y + 10
NameError: name 'y' is not defined
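You can make a function assign to a global variable by using the keyword 'global'. The name then refers to the variable in the global scope, so assigning to it inside the function changes the global value once the function is called.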
>>> x = 50
>>> def add():
        global x
        x = 10
        return x
>>> x
50
>>> add()
10
>>> x
10
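To see why 'global' is sometimes needed, here is a minimal sketch using a hypothetical module-level `counter` (not part of the original examples). Without the keyword, assigning to the name inside the function makes it local, so reading it first fails.

```python
counter = 0
failure = None

def increment_without_global():
    # Assigning to counter makes it a local name in this function,
    # so reading it on the right-hand side fails: it has no value yet.
    counter = counter + 1
    return counter

def increment_with_global():
    global counter  # assignments now target the module-level name
    counter = counter + 1
    return counter

try:
    increment_without_global()
except UnboundLocalError as error:
    failure = type(error).__name__
    print('failed with', failure)

print(increment_with_global())
print(counter)
```

Reading a global inside a function works without the keyword, as the earlier 'add()' example showed; it is only assignment that needs it.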
Built-in is the final place Python looks when resolving a name. Typically you will not have any issues with built-in scope, but on occasion you may accidentally write a function which shadows a built-in function. Take the following example where I have named my function 'min', which subsequently means I can't use the built-in function 'min()'.
>>> lst = [1,2,3,4,5]
>>> min_lst = min(lst)
>>> min_lst
1
>>> def min():
        pass
>>> min()
>>> min_lst = min(lst)
Traceback (most recent call last):
  File "<pyshell#151>", line 1, in <module>
    min_lst = min(lst)
TypeError: min() takes 0 positional arguments but 1 was given
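If you do shadow a built-in by accident, it is not lost. The sketch below, run as a script at module level, shows two ways to get back to the original 'min()': through the standard `builtins` module, or by deleting the global name that hides it.

```python
import builtins

def min():
    """Function that accidentally shadows the built-in, as in the example above."""
    pass

lst = [1, 2, 3, 4, 5]

# The original function is still reachable through the builtins module.
smallest = builtins.min(lst)
print(smallest)

# Deleting the global name removes the shadow, so the lookup falls
# through to the built-in scope again.
del min
print(min(lst))
```

The simplest fix is still to pick a different name for your own function in the first place.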
Enclosing scope comes into play when there are nested functions. Python will first look in the local scope; if it cannot find a local variable, it will look in the enclosing scope, which is the function that the current function is nested within. Let's look at the following example:
>>> def outer():
        x = 'outer x'
        def inner():
            x = 'inner x'
            print(x)
        inner()
        print(x)
>>> outer()
inner x
outer x
Both functions have a variable named 'x' with local scope, so you see both the inner and the outer value printed. If we remove the inner x, Python falls back to the enclosing scope and 'inner()' prints 'outer x', as shown below.
>>> def outer():
        x = 'outer x'
        def inner():
            print(x)
        inner()
        print(x)
>>> outer()
outer x
outer x
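Using the keyword 'nonlocal' inside a nested function tells Python that the name refers to the variable in the enclosing scope (but not the global scope), so assigning to it changes the enclosing variable rather than creating a new local one: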
>>> def outer():
        x = 'outer x'
        def inner():
            nonlocal x
            x = 'inner x'
            print(x)
        inner()
        print(x)
>>> outer()
inner x
inner x
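A common place where 'nonlocal' earns its keep is a nested function that needs to update a value held by the outer function. The following is only a sketch; `make_counter` and `increment` are hypothetical names chosen for illustration.

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        # Rebind the enclosing variable instead of creating a new local one.
        nonlocal count
        count += 1
        return count

    return increment

counter = make_counter()
first = counter()
second = counter()
print(first)
print(second)
```

Without the 'nonlocal' line, `count += 1` would raise an UnboundLocalError, because the assignment would make `count` local to `increment()`.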
Remember Python will always look up names in the LEGB sequence (local, enclosing, global, built-in). The final example shows you scope across local, enclosing and global.
>>> x = 'global x'
>>> def outer():
        x = 'outer x'
        def inner():
            x = 'inner x'
            print(x)
        inner()
        print(x)
>>> outer()
inner x
outer x
>>> x
'global x'
*args allows you to pass an unspecified number of positional arguments to the function. Note, the term 'args' is not a Python keyword but a convention, so it can actually be named whatever you like, although it's best to stick to convention. The asterisk tells Python to collect any extra positional arguments into a tuple.
>>> def add_numbers(*args):
        total = 0
        for number in args:
            total = total + number
        return total
>>> add_numbers(1,2,3)
6
>>> add_numbers(10,50,6)
66
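The asterisk also works in the other direction: when calling a function, it unpacks a list or tuple into separate arguments. The sketch below reuses 'add_numbers' from above and adds a hypothetical `scale_sum` function (an assumption for illustration) to show *args sitting after a normal parameter.

```python
def add_numbers(*args):
    total = 0
    for number in args:
        total = total + number
    return total

numbers = [1, 2, 3]
unpacked_total = add_numbers(*numbers)  # the list is unpacked into three arguments
empty_total = add_numbers()             # no arguments: args is an empty tuple

def scale_sum(factor, *args):
    """Hypothetical variant: one required parameter, then any number of extras."""
    return factor * sum(args)

scaled = scale_sum(2, 1, 2, 3)

print(unpacked_total)
print(empty_total)
print(scaled)
```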
**kwargs, as you might have guessed, is very similar to *args except it allows you to pass an undefined number of keyword arguments to the function. kwargs is a dictionary, so it can be looped through and accessed using the dictionary syntax.
>>> def football_player(**kwargs):
        for key, value in kwargs.items():
            print(key + " : " + value)
>>> football_player(name='Harry Kane', club="Tottenham Hotspur FC")
name : Harry Kane
club : Tottenham Hotspur FC
>>> football_player(name='Harry Kane', club="Tottenham Hotspur FC", position="Striker")
name : Harry Kane
club : Tottenham Hotspur FC
position : Striker
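Two things are worth a quick sketch here. First, a dictionary can be unpacked into keyword arguments with `**` when calling the function. Second, the string concatenation used above only works when every value is a string, so this variant formats with an f-string instead. The function name `describe_player` and the 'Example Player' data are made up for illustration.

```python
def describe_player(**kwargs):
    """Return one 'key : value' line per keyword argument."""
    lines = []
    for key, value in kwargs.items():
        # An f-string copes with non-string values such as integers.
        lines.append(f"{key} : {value}")
    return lines

player = {'name': 'Harry Kane', 'club': 'Tottenham Hotspur FC'}
unpacked_lines = describe_player(**player)  # same as name=..., club=...
for line in unpacked_lines:
    print(line)

# Hypothetical data with a non-string value.
mixed_lines = describe_player(name='Example Player', shirt_number=9)
print(mixed_lines)
```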
Lambdas allow small functions to be written quickly on the fly. They should be used with caution, as a normal function will often be clearer. To write a lambda you specify the 'lambda' keyword, followed by the names of the parameters, then a colon and finally the single expression whose value the function returns.
>>> multiply_numbers = (lambda x, y: x * y)
>>> multiply_numbers(5,6)
30
>>> multiply_numbers(10, 5)
50
Notice the syntax; first you specify the lambda keyword, then the variables and finally what should be returned.
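The lambda syntax described above maps directly onto a normal function definition. As a small sketch, the two functions below behave the same, and the last lines show a lambda passed inline as the `key` argument of the built-in 'sorted()'. The names `multiply`, `multiply_lambda` and the list of names are chosen for illustration.

```python
def multiply(x, y):
    return x * y

# The same function written as a lambda: parameters, colon, expression to return.
multiply_lambda = lambda x, y: x * y

same_result = multiply(5, 6) == multiply_lambda(5, 6)
print(same_result)

# A lambda used inline: sort names by their length.
names = ['scott', 'james', 'al']
by_length = sorted(names, key=lambda name: len(name))
print(by_length)
```

Because 'sorted()' is stable, names of equal length keep their original order.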
Lambdas are most useful when they are anonymously embedded within larger expressions, for example with 'map()'. 'map()' allows you to apply a function to every item of an iterable, such as a list. You call 'map' with the function first (here a lambda, with its parameters and the expression to return) and then the object the function should be mapped over.
>>> lst = [1,2,3,4,5]
>>> square = map(lambda x: x*x, lst)
>>> square
<map object at 0x03664E50>
>>> list(square)
[1, 4, 9, 16, 25]
When the expression is evaluated it returns a map object, which can then be converted to a list using the built-in function 'list()'.
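One detail that can trip you up: in Python 3 the map object is an iterator, so it can only be consumed once. A minimal sketch of what that means in practice:

```python
lst = [1, 2, 3, 4, 5]
squares = map(lambda x: x * x, lst)

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing is left to yield

print(first_pass)
print(second_pass)
```

If you need the results more than once, convert to a list straight away and reuse the list.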
The 'filter()' function allows you to filter out data that does not satisfy a certain criterion. You call 'filter' with the function first (here a lambda that returns True for the items to keep) and then the object you would like to filter.
>>> lst = [1,2,3,4,5,6,7,8,9,10]
>>> greater_than_four = filter(lambda x: x > 4, lst)
>>> list(greater_than_four)
[5, 6, 7, 8, 9, 10]
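For comparison, the same mapping and filtering can be written as list comprehensions, which many people find easier to read than 'map()' and 'filter()' with a lambda. This is offered as an alternative sketch, not as the approach used in the examples above.

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Equivalent to list(map(lambda x: x * x, lst))
squares = [x * x for x in lst]

# Equivalent to list(filter(lambda x: x > 4, lst))
greater_than_four = [x for x in lst if x > 4]

# Filtering and mapping in one step
large_squares = [x * x for x in lst if x > 4]

print(squares)
print(greater_than_four)
print(large_squares)
```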
For this final example we will count all the countries the Tottenham players originate from. First we must create the DataFrame.
>>> import pandas as pd
>>> players = ['Hugo Lloris', 'Kieran Trippier', 'Toby Alderweireld', 'Jan Vertonghen', 'Davinson Sanchez', 'Heung-Min Son', 'Harry Winks', 'Harry Kane', 'Erik Lamela', 'Victor Wanyama', 'Michel Vorm', 'Georges-Kévin Nkoudou', 'Eric Dier', 'Kyle Walker-Peters', 'Moussa Sissoko', 'Fernando Llorente', 'Dele Alli', 'Juan Foyth', 'Paulo Gazzaniga', 'Christian Eriksen', 'Serge Aurier', 'Josh Onomah', 'Lucas Moura', 'Ben Davies', 'Cameron Carter-Vickers', 'Vincent Janssen']
>>> locations = ['Nice, France', 'Bury, England', 'Wilrijk, Belgium', 'Sint-Niklaas, Belgium', 'Caloto, Colombia', 'Chuncheon, South Korea', 'Hemel Hempstead, England', 'London, England', 'Carapachay, Argentina','Nairobi, Kenya', 'Nieuwegein, Netherlands','Versailles, France', 'Cheltenham, England', 'Edmonton, England', 'Le Blanc-Mesnil, France', 'Pamplona, Spain', 'Milton Keynes, England', 'La Plata, Argentina', 'Murphy, Argentina', 'Middelfart, Denmark', 'Ouragahio, Cote dIvoire', 'London, England', 'São Paulo, Brazil', 'Neath, Wales', 'Southend-on-Sea, England', 'Heesch, Netherlands']
>>> data_combined = {'players': players, 'locations': locations}
>>> tottenham_squad = pd.DataFrame(data_combined)
>>> tottenham_squad
                   players                 locations
0              Hugo Lloris              Nice, France
1          Kieran Trippier             Bury, England
2        Toby Alderweireld          Wilrijk, Belgium
3           Jan Vertonghen     Sint-Niklaas, Belgium
4         Davinson Sanchez          Caloto, Colombia
5            Heung-Min Son    Chuncheon, South Korea
6              Harry Winks  Hemel Hempstead, England
7               Harry Kane           London, England
8              Erik Lamela     Carapachay, Argentina
9           Victor Wanyama            Nairobi, Kenya
10             Michel Vorm   Nieuwegein, Netherlands
11   Georges-Kévin Nkoudou        Versailles, France
12               Eric Dier       Cheltenham, England
13      Kyle Walker-Peters         Edmonton, England
14          Moussa Sissoko   Le Blanc-Mesnil, France
15       Fernando Llorente           Pamplona, Spain
16               Dele Alli    Milton Keynes, England
17              Juan Foyth       La Plata, Argentina
18         Paulo Gazzaniga         Murphy, Argentina
19       Christian Eriksen       Middelfart, Denmark
20            Serge Aurier   Ouragahio, Cote dIvoire
21             Josh Onomah           London, England
22             Lucas Moura         São Paulo, Brazil
23              Ben Davies              Neath, Wales
24  Cameron Carter-Vickers  Southend-on-Sea, England
25         Vincent Janssen       Heesch, Netherlands
Now that the DataFrame is created, we create a function which accepts a DataFrame and has a default parameter, set to 'locations', for the DataFrame column. The function takes the column, splits each location on ',' to get only the country, and adds it to a dictionary that counts the occurrences of each country. Note, I also applied a try and except so that data in the wrong format is skipped rather than breaking the code.
>>> def count_countries(df, col_name='locations'):
        """return all the countries and count for the tottenham first team"""
        country_count = {}
        countries = df[col_name]
        for location in countries:
            try:
                # keep the part after the comma and drop the surrounding spaces
                country = location.split(',')[1].strip()
            except (AttributeError, IndexError):
                # not a string, or no comma in the value: skip it
                continue
            country_count[country] = country_count.get(country, 0) + 1
        return country_count
>>> count_countries(tottenham_squad)
{'France': 3, 'England': 8, 'Belgium': 2, 'Colombia': 1, 'South Korea': 1, 'Argentina': 3, 'Kenya': 1, 'Netherlands': 2, 'Spain': 1, 'Denmark': 1, 'Cote dIvoire': 1, 'Brazil': 1, 'Wales': 1}
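The counting step can also be handed to `collections.Counter` from the standard library. The sketch below works on a plain list rather than a DataFrame so that it runs without pandas; the function name `count_countries_from_list` and the deliberately malformed last entry are assumptions added for illustration.

```python
from collections import Counter

# A few locations from the squad list, plus one hypothetical malformed entry.
sample_locations = ['Nice, France', 'Bury, England', 'London, England',
                    'not a valid location']

def count_countries_from_list(locations):
    """Count the country part of 'town, country' strings."""
    countries = []
    for location in locations:
        parts = location.split(',')
        if len(parts) < 2:
            continue  # skip entries without a 'town, country' shape
        countries.append(parts[1].strip())
    return Counter(countries)

country_count = count_countries_from_list(sample_locations)
print(country_count)
```

A Counter behaves like a dictionary, so the result can be used in the same way as the dictionary returned by 'count_countries()'.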
import pandas as pd
# tottenham squad data
players = ['Hugo Lloris', 'Kieran Trippier', 'Toby Alderweireld', 'Jan Vertonghen', 'Davinson Sanchez', 'Heung-Min Son', 'Harry Winks', 'Harry Kane', 'Erik Lamela', 'Victor Wanyama', 'Michel Vorm', 'Georges-Kévin Nkoudou', 'Eric Dier', 'Kyle Walker-Peters', 'Moussa Sissoko', 'Fernando Llorente', 'Dele Alli', 'Juan Foyth', 'Paulo Gazzaniga', 'Christian Eriksen', 'Serge Aurier', 'Josh Onomah', 'Lucas Moura', 'Ben Davies', 'Cameron Carter-Vickers', 'Vincent Janssen']
locations = ['Nice, France', 'Bury, England', 'Wilrijk, Belgium', 'Sint-Niklaas, Belgium', 'Caloto, Colombia', 'Chuncheon, South Korea', 'Hemel Hempstead, England', 'London, England', 'Carapachay, Argentina','Nairobi, Kenya', 'Nieuwegein, Netherlands','Versailles, France', 'Cheltenham, England', 'Edmonton, England', 'Le Blanc-Mesnil, France', 'Pamplona, Spain', 'Milton Keynes, England', 'La Plata, Argentina', 'Murphy, Argentina', 'Middelfart, Denmark', 'Ouragahio, Cote dIvoire', 'London, England', 'São Paulo, Brazil', 'Neath, Wales', 'Southend-on-Sea, England', 'Heesch, Netherlands']
# combine data within a dictionary
data_combined = {'player': players, 'location': locations}
# create DataFrame
tottenham_squad = pd.DataFrame(data_combined)
def count_countries(df, col_name='location'):
    """return all the countries and count for the tottenham first team"""
    country_count = {}
    countries = df[col_name]
    for location in countries:
        try:
            # keep the part after the comma and drop the surrounding spaces
            country = location.split(',')[1].strip()
        except (AttributeError, IndexError):
            # not a string, or no comma in the value: skip it
            continue
        country_count[country] = country_count.get(country, 0) + 1
    return country_count

print(count_countries(tottenham_squad))