Writing Functions

Overview

Teaching: 10 min
Exercises: 15 min
Questions
  • How can I create my own functions?

Objectives
  • Explain and identify the difference between function definition and function call.

  • Write a function that takes a small, fixed number of arguments and produces a single result.

Break programs down into functions to make them easier to understand.

Define a function using def with a name, parameters, and a block of code.

def print_greeting():
    print('Hello!')

Defining a function does not run it.

print_greeting()
Hello!

Arguments in call are matched to parameters in definition.

def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date(1871, 3, 19)
1871/3/19
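Because this matching is done by position, the order of the arguments matters. The short sketch below reuses print_date from above and swaps the first and last arguments to show the effect; the second call is added here only for illustration.

```python
def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

# Arguments are matched to parameters by position:
# 1871 -> year, 3 -> month, 19 -> day
print_date(1871, 3, 19)

# Swapping the first and last arguments changes which parameter
# receives which value. Python does not check that the result
# is a sensible date.
print_date(19, 3, 1871)
```

The first call prints 1871/3/19 and the second prints 19/3/1871.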

Functions may return a result to their caller using return.

def average(values):
    if len(values) == 0:
        return None
    return sum(values) / len(values)

a = average([1, 3, 4])
print('average of actual values:', a)
average of actual values: 2.6666666666666665

print('average of empty list:', average([]))
average of empty list: None

Every function returns something: a function that does not explicitly return a value automatically returns None.

result = print_date(1871, 3, 19)
print('result of call is:', result)
1871/3/19
result of call is: None
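A returned value, unlike printed text, can be stored and reused in later calculations. The sketch below repeats the average function from above and applies it to made-up sample data; the names heights, mean_height, and deviations are hypothetical and not part of the lesson.

```python
def average(values):
    if len(values) == 0:
        return None
    return sum(values) / len(values)

# made-up sample data for illustration
heights = [150, 160, 170]
mean_height = average(heights)

# The returned value can be reused in further calculations.
deviations = []
for h in heights:
    deviations.append(h - mean_height)
print(deviations)

# An empty list yields None, so check the result before doing
# arithmetic with it.
empty_mean = average([])
if empty_mean is None:
    print('no data to average')
```

This prints [-10.0, 0.0, 10.0] followed by the message no data to average.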

Identifying Syntax Errors

  1. Read the code below and try to identify what the errors are without running it.
  2. Run the code and read the error message. Is it a SyntaxError or an IndentationError?
  3. Fix the error.
  4. Repeat steps 2 and 3 until you have fixed all the errors.

def another_function
  print("Syntax errors are annoying.")
   print("But at least python tells us about them!")
  print("So they are usually not too hard to fix.")

Solution

def another_function():
  print("Syntax errors are annoying.")
  print("But at least Python tells us about them!")
  print("So they are usually not too hard to fix.")

Definition and Use

What does the following program print?

def report(pressure):
    print('pressure is', pressure)

print('calling', report, 22.5)

Solution

calling <function report at 0x7fd128ff1bf8> 22.5

A function call always needs parentheses. Without them, the name refers to the function object itself, and printing it shows a description of that object, including its memory address (the address will differ from run to run). To call the function named report and give it the value 22.5 to report on, we could write:

print("calling")
report(22.5)
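The difference between naming a function and calling it can be checked directly. In this minimal sketch, same_function is a hypothetical second name introduced only to show that the bare name refers to an object that can be passed around without running it.

```python
def report(pressure):
    print('pressure is', pressure)

# Without parentheses, the name refers to the function object;
# nothing is executed by this assignment.
same_function = report

print(callable(report))          # True: the object can be called
print(same_function is report)   # True: both names refer to one object

# With parentheses, the function body actually runs.
same_function(22.5)
```

The last line prints pressure is 22.5, exactly as report(22.5) would.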

Order of Operations

The example above:

result = print_date(1871, 3, 19)
print('result of call is:', result)

printed:

1871/3/19
result of call is: None

Explain why the two lines of output appeared in the order they did.

Solution

The first line called our previously defined function, print_date(). We passed it the values 1871, 3, and 19. When our function ran, it printed the expected output:

result = print_date(1871, 3, 19)
1871/3/19

The function has no return statement, so it implicitly returned None, which was assigned to our newly defined variable, result.

Next, we print our string and the value that is stored in the variable result, which is None:

print('result of call is:', result)
result of call is: None

Encapsulation

Fill in the blanks to create a function that takes a single filename as an argument, loads the data in the file named by the argument, and returns the minimum value in that data.

import pandas

def min_in_data(____):
    data = ____
    return ____

Solution
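
One possible way to fill in the blanks, sketched under the assumption that the file is a CSV that pandas.read_csv can load with its default settings. Note that for a table with several columns, data.min() returns the minimum of each column rather than a single number.

```python
import pandas

def min_in_data(filename):
    # load the file named by the argument (assumed to be a CSV file)
    data = pandas.read_csv(filename)
    # minimum of each column in the loaded table
    return data.min()
```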

Find the First

Fill in the blanks to create a function that takes a list of numbers as an argument and returns the first negative value in the list. What does your function do if the list is empty?

def first_negative(values):
    for v in ____:
        if ____:
            return ____

Solution
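
One way to fill in the blanks is sketched below. The sample lists in the calls are made up for illustration.

```python
def first_negative(values):
    for v in values:
        if v < 0:
            return v

print(first_negative([3, 5, -2, 7, -9]))  # -2
print(first_negative([]))                 # None
```

If the list is empty, or contains no negative value, the loop finishes without ever reaching return v, so the function returns None.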

Calling by Name

  1. What does this short program print?
  2. When have you seen a function call like this before?
  3. When and why is it useful to call functions this way?

def print_date(year, month, day):
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date(day=1, month=2, year=2003)

Solution

  1. 2003/2/1
    
  2. We used this style of function call when we set the index column while calling pandas.read_csv() in the data frames episode:
    data = pandas.read_csv('data/gapminder_gdp_europe.csv', index_col='country')

  3. If a function has optional parameters, calling by name lets you pass only the options you need. Naming the arguments also makes the call easier to read, because the reader does not have to remember the order of the parameters.
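
To illustrate the point about optional parameters, here is a minimal sketch of a variant of print_date in which month and day have default values. The name print_date_with_defaults and the defaults themselves are assumptions made for this example; the lesson's print_date requires all three arguments.

```python
def print_date_with_defaults(year, month=1, day=1):
    # month and day are optional in this hypothetical variant
    joined = str(year) + '/' + str(month) + '/' + str(day)
    print(joined)

print_date_with_defaults(2003)                         # 2003/1/1
print_date_with_defaults(2003, day=15)                 # 2003/1/15
print_date_with_defaults(day=1, month=2, year=2003)    # 2003/2/1
```

The second call skips month entirely and sets only day, which is not possible with purely positional arguments.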

Encapsulation of an If/Print Block

The code below will run on a label printer for chicken eggs. A digital scale will report a chicken egg's mass (in grams) to the computer, and the computer will then print a label.

Rewrite the code so that the if-block is folded into a function.

import random
for i in range(10):

    # simulating the mass of a chicken egg
    # the (random) mass will be 70 +/- 20 grams
    mass = 70 + 20.0 * (2.0 * random.random() - 1.0)

    print(mass)

    # egg sizing machinery prints a label
    if(mass >= 85):
        print("jumbo")
    elif(mass >= 70):
        print("large")
    elif(mass < 70 and mass >= 55):
        print("medium")
    else:
        print("small")

The simplified program follows. What function definition will make it functional?

# revised version
import random
for i in range(10):

    # simulating the mass of a chicken egg
    # the (random) mass will be 70 +/- 20 grams
    mass = 70 + 20.0 * (2.0 * random.random() - 1.0)

    print(mass, print_egg_label(mass))

  1. Create a function definition for print_egg_label() that will work with the revised program above.
  2. A dirty egg might have a mass of more than 90 grams, and a spoiled or broken egg will probably have a mass of less than 50 grams. Modify your print_egg_label() function to account for these error conditions. Sample output could be "25 too light, probably spoiled".

Solution

  1. def print_egg_label(mass):
        #egg sizing machinery prints a label
        if(mass >= 85):
            return("jumbo")
        elif(mass >= 70):
            return("large")
        elif(mass < 70 and mass >= 55):
            return("medium")
        else:
            return("small")
    
  2. def print_egg_label(mass):
        #egg sizing machinery prints a label
        if(mass > 90):
            return("jumbo but might be dirty")
        elif(mass >= 85):
            return("jumbo")
        elif(mass >= 70):
            return("large")
        elif(mass < 70 and mass >= 55):
            return("medium")
        elif(mass < 55 and mass >= 50):
            return("small")
        else:
            return("small, too light, probably spoiled or broken")
    

Encapsulating Data Analysis

Assume that the following code has been executed:

import pandas

df = pandas.read_csv('gapminder_gdp_asia.csv', index_col=0)
japan = df.loc['Japan']

  1. Complete the statements below to obtain the average GDP per capita for Japan across the years reported for the 1980s.

     year = 1983
     gdp_decade = 'gdpPercap_' + str(year // ____)
     avg = (japan.loc[gdp_decade + ____] + japan.loc[gdp_decade + ____]) / 2
    
  2. Abstract the code above into a single function.

     def avg_gdp_in_decade(country, continent, year):
         df = pandas.read_csv('gapminder_gdp_' + ____ + '.csv', delimiter=',', index_col=0)
         ____
         ____
         ____
         return avg
    
  3. How would you generalize this function if you did not know beforehand which specific years occurred as columns in the data? For instance, what if we also had data from years ending in 1 and 9 for each decade? (Hint: use the columns to filter out the ones that correspond to the decade, instead of enumerating them in the code.)

Solution

  1. year = 1983
    gdp_decade = 'gdpPercap_' + str(year // 10)
    avg = (japan.loc[gdp_decade + '2'] + japan.loc[gdp_decade + '7']) / 2
    
  2. def avg_gdp_in_decade(country, continent, year):
        df = pandas.read_csv('gapminder_gdp_' + continent + '.csv', index_col=0)
        c = df.loc[country]
        gdp_decade = 'gdpPercap_' + str(year // 10)
        avg = (c.loc[gdp_decade + '2'] + c.loc[gdp_decade + '7']) / 2
        return avg
    
  3. We need to loop over the reported years to obtain the average for the relevant ones in the data.

    def avg_gdp_in_decade(country, continent, year):
        df = pandas.read_csv('gapminder_gdp_' + continent + '.csv', index_col=0)
        c = df.loc[country]
        gdp_decade = 'gdpPercap_' + str(year // 10)
        total = 0.0
        num_years = 0
        for yr_header in c.index:  # c's index contains reported years
            if yr_header.startswith(gdp_decade):
                total = total + c.loc[yr_header]
                num_years = num_years + 1
        return total / num_years
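
The hint in part 3 suggests filtering the columns instead of enumerating them. A minimal sketch of that idea follows, operating on a row that has already been selected. The helper name avg_gdp_in_decade_from_row and the sample values are hypothetical and are not Gapminder data.

```python
import pandas

def avg_gdp_in_decade_from_row(row, year):
    # prefix shared by every column of the decade, e.g. 'gdpPercap_198'
    gdp_decade = 'gdpPercap_' + str(year // 10)
    # Boolean mask: True for index labels that start with the prefix
    in_decade = row.index.str.startswith(gdp_decade)
    # mean of the selected values (NaN if no label matches)
    return row[in_decade].mean()

# made-up values for illustration only
row = pandas.Series({
    'gdpPercap_1977': 100.0,
    'gdpPercap_1982': 200.0,
    'gdpPercap_1987': 300.0,
    'gdpPercap_1992': 400.0,
})
print(avg_gdp_in_decade_from_row(row, 1983))  # 250.0
```

Unlike the loop version, this sketch returns NaN rather than raising a ZeroDivisionError when no column belongs to the requested decade.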
    

Key Points

  • Break programs down into functions to make them easier to understand.

  • Define a function using def with a name, parameters, and a block of code.

  • Defining a function does not run it.

  • Arguments in call are matched to parameters in definition.

  • Functions may return a result to their caller using return.