Python functions

by Charlie Jackson


Posted on March 28, 2019 at 13:00 PM


A function used to output a Python dictionary

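In this tutorial we will look into Python functions in depth, starting with the basics and gradually getting more complex. Topics covered include basic functions, nested functions, scope (the LEGB rule), *args, **kwargs, lambdas, map(), filter() and using functions with pandas.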

Before diving into creating functions, it's worth thinking about what a function actually is. In essence, it's a named block of code that does one specific task. So instead of writing one long script that does lots of things, you break it up into functions that each do a specific job simply and efficiently.

Basic functions

In its simplest form, a function takes no parameters.

>>> def hello():
      print('hello')
>>> hello()
hello
            

The code above simply prints 'hello'. A function can also 'return' data, which you can store in a variable and use later on in your program.

>>> def hello():
      return 'hello'

>>> greeting = hello()
>>> greeting
'hello'
            

Moving on from the simplest function, we can create a function which takes parameters and uses the data that is passed to it.

>>> def hello(name):
      return 'hello ' + name + ' you ledge'

>>> hello('james')
'hello james you ledge'
>>> hello('scott')
'hello scott you ledge'
            

When you create a function you can use as many parameters as you want. A function can also return more than one value, for example by packing the values into a tuple and unpacking them where the function is called.

>>> def hello(fname, lname):
      nice_greeting = 'hello ' + fname + ' ' + lname + ' you ledge'
      unpleasant_greeting = 'hello ' + fname + ' ' + lname + ' you idiot'
      responses = (nice_greeting, unpleasant_greeting) # this creates a tuple named responses
      return responses

>>> nice, unpleasant = hello('james', 'bond')
>>> nice
'hello james bond you ledge'
>>> unpleasant
'hello james bond you idiot'
            

Lastly, a parameter can have a default value, which is used when no argument is passed for it when the function is called.

>>> def hello(fname, repeat=5):
      name_repeat = fname * repeat
      return name_repeat

>>> hello('james')
'jamesjamesjamesjamesjames'
>>> hello('james', repeat=10)
'jamesjamesjamesjamesjamesjamesjamesjamesjamesjames'
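
As a small sketch of how defaults interact with the way a function is called, the hypothetical greet() below (it is not one of the original examples) shows that a default can be left alone, overridden by position, or overridden by keyword.

```python
def greet(fname, greeting='hello', repeat=1):
    """Build a greeting; 'greeting' and 'repeat' fall back to their defaults."""
    single = greeting + ' ' + fname
    return ' '.join([single] * repeat)


# only the required argument is passed, so both defaults are used
default_call = greet('james')

# the second parameter is overridden by position
positional_call = greet('james', 'alright')

# a later parameter is overridden by keyword, leaving 'greeting' at its default
keyword_call = greet('james', repeat=2)

print(default_call)
print(positional_call)
print(keyword_call)
```

Passing later parameters by keyword, as in the last call, tends to be easier to read than relying on their position.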
            

Nested functions

Nesting a function means defining one function inside another function.

>>> def hello(fname, lname):
      def nice(name):
        return "alright " + name + " you ledge"
      return nice(fname + " " + lname)

>>> hello('james', 'bond')
'alright james bond you ledge'
            

Scope - LEGB rule

When functions are nested, or need data that is not defined within the function itself, we run into questions of scope. Scope essentially means where a variable can be accessed from. Python looks for a name in each scope in the order local, enclosing, global and then built-in - abbreviated to LEGB.

  • Local scope - created within a function
  • Enclosing scope - the scope of an outer function that contains a nested function
  • Global scope - created in the body of a script
  • Built-in scope - the names that come with Python, like print() and len()
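
The four scopes listed above can be seen in one place with a short sketch. The names x, y and z are illustrative only; each lives in a different scope, and inner() finds all of them plus the built-in len().

```python
x = 'global x'          # global scope: defined in the body of the script


def outer():
    y = 'enclosing y'   # enclosing scope, as seen from inner()

    def inner():
        z = 'local z'   # local scope: only exists inside inner()
        # z is found locally, y in the enclosing scope, x in the global
        # scope, and len last of all in the built-in scope
        return z, y, x, len(z)

    return inner()


result = outer()
print(result)
```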

Global scope

When a variable is created in the main body of the program, it can be read from anywhere, including from within functions. In the example below, we can see the variable 'x' is created in the main body of the script and then used within the function 'add()'.

>>> x = 10
>>> def add(number):
    	add_numbers = number + x
    	return add_numbers

>>> add(5)
15
            

In the function above, Python first looked in the local scope (i.e. within the function) and, not finding 'x' there, looked in the global scope. In the following example 'x' is also assigned in the local scope, so the local value is used and the global 'x' is left unchanged.

>>> x = 50
>>> def add(number):
      x = 10
      add_numbers = number + x
      return add_numbers

>>> add(5)
15
>>> x
50
            

Local scope

As noted above, a variable with local scope is accessible only from within the function. Referencing the variable outside of the function raises a 'NameError'.

>>> def local_variable():
      y = 10
      return y

>>> y + 10
Traceback (most recent call last):
  File "<pyshell#111>", line 1, in <module>
    y + 10
NameError: name 'y' is not defined
            

You can let a function assign to a global variable by using the keyword 'global'. Any assignment to that name inside the function then changes the global value rather than creating a local one.

>>> x = 50
>>> def add():
    	global x
    	x = 10
    	return x

>>> x
50
>>> add()
10
>>> x
10
            

Built-in scope

The built-in scope is the final place Python looks when resolving a name. Typically you will not have any issues with built-in scope, but occasionally you may accidentally write a function whose name shadows a built-in function. Take the following example, where I have named my function 'min', which subsequently means I can't use the built-in function 'min()'.

>>> lst = [1,2,3,4,5]
>>> min_lst = min(lst)
>>> min_lst
1
>>> def min():
      pass

>>> min()
>>> min_lst = min(lst)
Traceback (most recent call last):
  File "<pyshell#151>", line 1, in <module>
    min_lst = min(lst)
TypeError: min() takes 0 positional arguments but 1 was given
>>>
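
If a built-in has been shadowed like this, the original is still reachable. The sketch below assumes the standard library 'builtins' module, which exposes the built-in scope directly; the shadowing function is the same do-nothing 'min' as in the example above.

```python
import builtins

lst = [1, 2, 3, 4, 5]


def min():
    """A function that shadows the built-in min()."""
    pass


# the built-in itself is untouched and can be reached through builtins
smallest = builtins.min(lst)
print(smallest)

# In an interactive session, 'del min' would remove the shadowing function,
# after which the plain name min resolves to the built-in again.
```

In practice the simplest fix is to pick a different name for your own function.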
            

Enclosing scope

Enclosing scope applies when functions are nested. Python will first look in the local scope; if it cannot find a local variable, it will look in the enclosing scope, which is the scope of the function that the current function is nested in. Let's look at the following example:

>>> def outer():
      x = 'outer x'
      def inner():
        x = 'inner x'
        print(x)
      inner()
      print(x)


>>> outer()
inner x
outer x
            

Both functions have their own local variable named 'x', so you see 'inner x' and then 'outer x' printed. If we remove the inner 'x', the enclosing scope is used and inner() prints 'outer x', as shown below.

>>> def outer():
      x = 'outer x'
      def inner():
        print(x)
      inner()
      print(x)

>>> outer()
outer x
outer x
              

Using the keyword 'nonlocal' lets the nested function assign to a variable in the enclosing scope (but not the global scope), as follows:

>>> def outer():
      x = 'outer x'
      def inner():
        nonlocal x
        x = 'inner x'
        print(x)
      inner()
      print(x)

>>> outer()
inner x
inner x
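
One place 'nonlocal' tends to be useful is keeping a running value between calls. The make_counter() function below is a hypothetical illustration rather than part of the original examples: increment() updates 'count' in the enclosing scope each time it is called.

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        # rebind the enclosing variable rather than create a new local one
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)
```

Without the 'nonlocal' line, 'count += 1' would be treated as an assignment to a local variable that has no value yet, and the call would fail.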
              

Summary of scope

Remember, Python always resolves a name by following the LEGB sequence (local, enclosing, global, built-in). The final example shows scope across local, enclosing and global.

>>> x = 'global x'
>>> def outer():
    x = 'outer x'
    def inner():
      x = 'inner x'
      print(x)
    inner()
    print(x)

>>> outer()
inner x
outer x
>>> x
'global x'
>>>
              

*args

*args allows you to pass an unspecified number of positional arguments to the function. Note, the term 'args' is not a Python keyword but a convention, so it can be named whatever you like, although it's best to stick to convention. The asterisk tells Python to collect any extra positional arguments into a tuple.

>>> def add_numbers(*args):
		total = 0
		for number in args:
			total = total + number
		return total
>>> add_numbers(1,2,3)
6
>>> add_numbers(10,50,6)
66
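
The asterisk also works in the other direction. As a rough sketch, the code below unpacks an existing list into separate arguments at the call site, and shows *args sitting after an ordinary parameter; describe() and its 'label' parameter are made up for illustration.

```python
def add_numbers(*args):
    total = 0
    for number in args:
        total = total + number
    return total


def describe(label, *args):
    """'label' takes the first argument; everything else is collected in args."""
    return label + ': ' + str(add_numbers(*args))


numbers = [1, 2, 3]

# the asterisk unpacks the list, so this is the same as add_numbers(1, 2, 3)
unpacked_total = add_numbers(*numbers)

described = describe('total', 10, 50, 6)
print(unpacked_total)
print(described)
```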
						

**kwargs

**kwargs, as you might have guessed, is very similar to *args except it allows you to pass an undefined number of keyword arguments to the function. kwargs is a dictionary, so it can be looped through and accessed using the dictionary syntax.

>>> def football_player(**kwargs):
	for key, value in kwargs.items():
		print(key + " : " + value)

>>> football_player(name='Harry Kane', club="Tottenham Hotspur FC")
name : Harry Kane
club : Tottenham Hotspur FC
>>> football_player(name='Harry Kane', club="Tottenham Hotspur FC", position="Striker")
name : Harry Kane
club : Tottenham Hotspur FC
position : Striker
>>>
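
The function above joins strings with '+', so it would raise a TypeError if a keyword argument held a number. The variation below is a sketch only: it makes 'name' a required parameter, converts each value with str(), and unpacks an existing dictionary with ** at the call site. The 'shirt_number' entry is an illustrative value.

```python
def football_player(name, **kwargs):
    """'name' is required; any other keyword arguments end up in kwargs."""
    lines = ['name : ' + name]
    for key, value in kwargs.items():
        # str() lets non-string values such as numbers be included
        lines.append(key + ' : ' + str(value))
    return lines


# illustrative details; ** unpacks the dictionary into keyword arguments
details = {'club': 'Tottenham Hotspur FC', 'shirt_number': 10}
player_lines = football_player('Harry Kane', **details)

for line in player_lines:
    print(line)
```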
						

Lambdas

Lambdas allow small functions to be written quickly on the fly. They should be used with caution, as a normal function will often be clearer. To write a lambda function you specify the 'lambda' keyword, followed by the names of the parameters, then a colon and a single expression whose value the function returns.

>>> multiply_numbers = (lambda x, y: x * y)
>>> multiply_numbers(5,6)
30
>>> multiply_numbers(10, 5)
50
						

Notice the syntax; first you specify the lambda keyword, then the variables and finally what should be returned.
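
To make the comparison with a normal function concrete, the sketch below sorts a short list of names by length twice: once with a named function and once with an equivalent lambda passed straight to sorted(). The name_length() helper is made up for illustration.

```python
players = ['Harry Kane', 'Dele Alli', 'Christian Eriksen']


def name_length(name):
    """Ordinary function: return the number of characters in a name."""
    return len(name)


# a named function passed as the sort key
sorted_with_def = sorted(players, key=name_length)

# the same key written inline as a lambda that is never given a name
sorted_with_lambda = sorted(players, key=lambda name: len(name))

print(sorted_with_def)
print(sorted_with_lambda)
```

Both calls give the same ordering; the lambda just avoids defining a function that is only used once.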

Anonymous functions (map() and lambda functions)

Lambdas are most useful when they are embedded anonymously within larger expressions, for example with 'map()'. 'map()' allows you to apply a function to every item of an iterable, such as a list. You call 'map' with the function first (here a 'lambda' with its parameters and what should be returned) and then the object the function should be mapped over.

>>> lst = [1,2,3,4,5]
>>> square = map(lambda x: x*x, lst)
>>> square
<map object at 0x03664E50>
>>> list(square)
[1, 4, 9, 16, 25]
						

When the expression is evaluated it returns a map object, which can then be converted to a list using the built-in function 'list()'.
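
Two details are worth a quick sketch here. A map object is an iterator, so it can only be consumed once, and a list comprehension is a common alternative that builds the list directly. The variable names below are illustrative.

```python
lst = [1, 2, 3, 4, 5]

squares_map = map(lambda x: x * x, lst)
first_pass = list(squares_map)    # consumes the map object
second_pass = list(squares_map)   # nothing left, so this is an empty list

# a list comprehension that produces the same squares in one step
squares_comprehension = [x * x for x in lst]

print(first_pass)
print(second_pass)
print(squares_comprehension)
```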

filter()

The 'filter()' function allows you to filter out data that does not satisfy a certain criterion. You call 'filter' with the function first (here a lambda that returns True for the items to keep) and then the object you would like to filter.

>>> lst = [1,2,3,4,5,6,7,8,9,10]
>>> filtered = filter(lambda x: x > 4, lst)
>>> list(filtered)
[5, 6, 7, 8, 9, 10]
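
'filter()' does not have to be given a lambda. As a sketch, the code below passes a named function instead (is_even() is a made-up helper) and shows a list comprehension that applies the same 'x > 4' test as the example above.

```python
def is_even(number):
    """Return True for the items filter() should keep."""
    return number % 2 == 0


lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# a named function used as the test
evens = list(filter(is_even, lst))

# the same test as the lambda example, written as a list comprehension
greater_than_four = [x for x in lst if x > 4]

print(evens)
print(greater_than_four)
```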
						

Using functions with Pandas

For this final example we will count all the countries the Tottenham players originate from. First we must create the DataFrame.

>>> import pandas as pd
>>> players = ['Hugo Lloris', 'Kieran Trippier', 'Toby Alderweireld', 'Jan Vertonghen', 'Davinson Sanchez', 'Heung-Min Son', 'Harry Winks', 'Harry Kane', 'Erik Lamela', 'Victor Wanyama', 'Michel Vorm', 'Georges-Kévin Nkoudou', 'Eric Dier', 'Kyle Walker-Peters', 'Moussa Sissoko', 'Fernando Llorente', 'Dele Alli', 'Juan Foyth', 'Paulo Gazzaniga', 'Christian Eriksen', 'Serge Aurier', 'Josh Onomah', 'Lucas Moura', 'Ben Davies', 'Cameron Carter-Vickers', 'Vincent Janssen']
>>> locations = ['Nice, France', 'Bury, England', 'Wilrijk, Belgium', 'Sint-Niklaas, Belgium', 'Caloto, Colombia', 'Chuncheon, South Korea', 'Hemel Hempstead, England', 'London, England', 'Carapachay, Argentina','Nairobi, Kenya', 'Nieuwegein, Netherlands','Versailles, France', 'Cheltenham, England', 'Edmonton, England', 'Le Blanc-Mesnil, France', 'Pamplona, Spain', 'Milton Keynes, England', 'La Plata, Argentina', 'Murphy, Argentina', 'Middelfart, Denmark', 'Ouragahio, Cote dIvoire', 'London, England', 'São Paulo, Brazil', 'Neath, Wales', 'Southend-on-Sea, England', 'Heesch, Netherlands']
>>> data_combined = {'players': players, 'locations': locations}
>>> tottenham_squad = pd.DataFrame(data_combined)
>>> tottenham_squad
                   players                 locations
0              Hugo Lloris              Nice, France
1          Kieran Trippier             Bury, England
2        Toby Alderweireld          Wilrijk, Belgium
3           Jan Vertonghen     Sint-Niklaas, Belgium
4         Davinson Sanchez          Caloto, Colombia
5            Heung-Min Son    Chuncheon, South Korea
6              Harry Winks  Hemel Hempstead, England
7               Harry Kane           London, England
8              Erik Lamela     Carapachay, Argentina
9           Victor Wanyama            Nairobi, Kenya
10             Michel Vorm   Nieuwegein, Netherlands
11   Georges-Kévin Nkoudou        Versailles, France
12               Eric Dier       Cheltenham, England
13      Kyle Walker-Peters         Edmonton, England
14          Moussa Sissoko   Le Blanc-Mesnil, France
15       Fernando Llorente           Pamplona, Spain
16               Dele Alli    Milton Keynes, England
17              Juan Foyth       La Plata, Argentina
18         Paulo Gazzaniga         Murphy, Argentina
19       Christian Eriksen       Middelfart, Denmark
20            Serge Aurier   Ouragahio, Cote dIvoire
21             Josh Onomah           London, England
22             Lucas Moura         São Paulo, Brazil
23              Ben Davies              Neath, Wales
24  Cameron Carter-Vickers  Southend-on-Sea, England
25         Vincent Janssen       Heesch, Netherlands
						

Now that the DataFrame is created, we write a function which accepts a DataFrame and has a default parameter for the DataFrame column, set to 'locations'. The function takes the column, splits each location on ',' to get only the country, and adds it to a dictionary to count the occurrences of each country. Note, I also wrapped this in a try and except so that a value in the wrong format is skipped rather than breaking the code.

>>> def count_countries(df, col_name='locations'):
    """return all the countries and count for the tottenham first team"""
    country_count = {}

    countries = df[col_name]
    for country in countries:
        try:
            # keep the part after the comma and drop the leading space
            country = country.split(',')[1].strip()
        except (AttributeError, IndexError):
            # skip values that are not 'city, country' strings
            continue
        country_count[country] = country_count.get(country, 0) + 1

    return country_count

>>> count_countries(tottenham_squad)
{'France': 3, 'England': 8, 'Belgium': 2, 'Colombia': 1, 'South Korea': 1, 'Argentina': 3, 'Kenya': 1, 'Netherlands': 2, 'Spain': 1, 'Denmark': 1, 'Cote dIvoire': 1, 'Brazil': 1, 'Wales': 1}
>>>
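
The counting step could also be sketched with collections.Counter from the standard library, which keeps the tally for you. To stay self-contained the example below uses a short plain list rather than the DataFrame, and the 'no comma here' entry is a made-up value standing in for badly formatted data.

```python
from collections import Counter

# a small sample in the same 'city, country' format, plus one bad entry
locations = ['Nice, France', 'Bury, England', 'London, England', 'no comma here']

countries = []
for location in locations:
    parts = location.split(',')
    if len(parts) > 1:
        # keep the country and drop the space after the comma
        countries.append(parts[1].strip())

country_count = Counter(countries)
print(country_count)
```

A Counter behaves like a dictionary, so a single country can still be looked up with country_count['England'].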
						

Code takeaway

import pandas as pd

# tottenham squad data
players = ['Hugo Lloris', 'Kieran Trippier', 'Toby Alderweireld', 'Jan Vertonghen', 'Davinson Sanchez', 'Heung-Min Son', 'Harry Winks', 'Harry Kane', 'Erik Lamela', 'Victor Wanyama', 'Michel Vorm', 'Georges-Kévin Nkoudou', 'Eric Dier', 'Kyle Walker-Peters', 'Moussa Sissoko', 'Fernando Llorente', 'Dele Alli', 'Juan Foyth', 'Paulo Gazzaniga', 'Christian Eriksen', 'Serge Aurier', 'Josh Onomah', 'Lucas Moura', 'Ben Davies', 'Cameron Carter-Vickers', 'Vincent Janssen']
locations = ['Nice, France', 'Bury, England', 'Wilrijk, Belgium', 'Sint-Niklaas, Belgium', 'Caloto, Colombia', 'Chuncheon, South Korea', 'Hemel Hempstead, England', 'London, England', 'Carapachay, Argentina','Nairobi, Kenya', 'Nieuwegein, Netherlands','Versailles, France', 'Cheltenham, England', 'Edmonton, England', 'Le Blanc-Mesnil, France', 'Pamplona, Spain', 'Milton Keynes, England', 'La Plata, Argentina', 'Murphy, Argentina', 'Middelfart, Denmark', 'Ouragahio, Cote dIvoire', 'London, England', 'São Paulo, Brazil', 'Neath, Wales', 'Southend-on-Sea, England', 'Heesch, Netherlands']

# combine data within a dictionary
data_combined = {'player': players, 'location': locations}

# create DataFrame
tottenham_squad = pd.DataFrame(data_combined)

def count_countries(df, col_name='location'):
    """return all the countries and count for the tottenham first team"""
    country_count = {}

    countries = df[col_name]
    for country in countries:
        try:
            # keep the part after the comma and drop the leading space
            country = country.split(',')[1].strip()
        except (AttributeError, IndexError):
            # skip values that are not 'city, country' strings
            continue
        country_count[country] = country_count.get(country, 0) + 1

    return country_count


print(count_countries(tottenham_squad))