Tweet
Share
Share
Last Updated on December 19, 2021
Python is a fantastic programming language. It is likely to be your first
4.7k
By Nick Cotes
Last Updated on December 19, 2021
Python is a fantastic programming language. It is likely to be your first choice for developing a machine learning or data science application. Python is interesting because it is a multi-paradigm programming language that can be used for both object-oriented and imperative programming. It has a simple syntax that is easy to read and comprehend.
In computer science and mathematics, the solution of many problems can be more easily and naturally expressed using the functional programming style. In this tutorial, we’ll discuss Python’s support for the functional programming paradigm, and Python’s classes and modules that help you program in this style.
After completing this tutorial, you will know:
Basic idea of functional programming
The itertools library
The functools library
Map-reduce design pattern and its possible implementation in Python
Let’s get started.
Functional Programming In Python Photo by Abdullah_Shakoor, some rights reserved
Tutorial Overview
This tutorial is divided into 5 parts; they are:
The idea of functional programming
High order functions: Filter, map, and reduce
Itertools
Functools
Map-reduce pattern
The idea of functional programming
If you have programming experience, likely you learned imperative programming. It is built with statements and manipulating variables. Functional programming is a declarative paradigm. It is different from imperative paradigm that programs are built by applying and composing functions. The functions here are supposed to be closer to the definition of a mathematical function, which there are no side effects, or simply, no access to external variables and when you call them with the same argument, they always give you the same result.
The benefit of functional programming is to make your program less error-prone. Without the side effects, it is more predictable and easier to see the outcome. We also no need to worry about one part of the program is interfering another part.
Many libraries adopted a functional programming paradigm. For example the following using pandas and pandas-datareader:
The pandas-datareader is a useful library that helps you download data from the Internet in the realtime. The above example is to download population data from the World Bank. The result is a pandas dataframe with countries and years as index and a single column named “SP.POP.TOTL” for the population. Then we manipulate the dataframe step by step, and at the end, we find the average population of all countries across the years.
We can write in this way because in pandas, most functions on the dataframe are not changing the dataframe, but to produce a new dataframe to reflect the result of the function. We call this behavior immutable because the input dataframe never changed. The consequence is that, we can chain up the functions to manipulate the dataframe step by step. If we have to break it into using the style of imperative programming, the above program is same as the following:
Python is not a strictly functional programming language. But it is trivial to write Python in a functional style. There are three basic functions on iterables that allows us to write powerful program in a very trivial way: filter, map, and reduce.
Filter is to select some of the elements in an iterable, such as a list. Map is to transform elements one by one. Finally, reduce is to convert the entire iterable into a different form, such as the sum of all elements or concatenating substrings in a list into a longer string. To illustrate their use, let’s consider a simple task: Given a log file from the Apache web server, find the IP address that sent most requests with error code 404. If you have no idea what a log file from Apache web server looks like, the following is an example:
The above is from a bigger file located here. These are a few lines from the log. Each line begins with the IP address of the client (i.e., the browser) and the code after “HTTP/1.1” is the response status code. Normally it is 200 if the request is fulfilled. But if the browser requested something not exists in the server, the code will be 404. To find the IP address that corresponds to the most 404 requests, we can simply scan the log file line by line, find those with 404, and count the IP address to find the one with the most occurrences.
In Python code, we can do the following. First we see how we can read the log file and extract the IP address and status code from a line:
then we can use a couple map() and filter() and some other functions to find the IP address:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
...
import collections
def is404(pair):
returnpair[1]=="404"
def getIP(pair):
returnpair[0]
def count_ip(count_item):
ip,count=count_item
return(count,ip)
# transform each line into (IP address, status code) pair
ipcodepairs=map(ip_and_code,lines)
# keep only those with status code 404
pairs404=filter(is404,ipcodepairs)
# extract the IP address part from each pair
ip404=map(getIP,pairs404)
# count the occurrences, the result is a dictionary of IP addresses map to the count
ipcount=collections.Counter(ip404)
# convert the (IP address, count) tuple into (count, IP address) order
countip=map(count_ip,ipcount.items())
# find the tuple with the maximum on the count
print(max(countip))
Here we did not use reduce() function because we have some specialized reduce operation built-in, such as max(). But indeed, we can make a simpler program with list comprehension notation:
...
ipcodepairs=[ip_and_code(x)forxinlines]
ip404=[ip forip,code inipcodepairs ifcode=="404"]
ipcount=collections.Counter(ip404)
countip=[(count,ip)forip,count inipcount.items()]
print(max(countip))
or even write it in a single statement (but less readable):
The above example on filter, map, and reduce illustrates the ubiquity of iterables in Python. This includes lists, tuples, dictionaries, sets, and even generators as all of them can be iterated using a for-loop. In Python, we have a module named itertools that brings in more functions to manipulate (but not mutate) iterables. From Python’s official documentation:
The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.
We’ll discuss a few functions of itertools in this tutorial. When trying out the examples given below, be sure to import itertools and operator as:
import itertools
import operator
Infinite Iterators
Infinite iterators help you create sequences of infinite length as shown below.
Construct + Example
Output
count()
start=0
step=100
foriinitertools.count(start,step):
print(i)
ifi>=1000:
break
1
2
3
4
5
6
7
8
9
10
11
0
100
200
300
400
500
600
700
800
900
1000
cycle()
counter=0
cyclic_list=[1,2,3,4,5]
foriinitertools.cycle(cyclic_list):
print(i)
counter=counter+1
ifcounter>10:
break
1
2
3
4
5
6
7
8
9
10
11
1
2
3
4
5
1
2
3
4
5
1
repeat()
foriinitertools.repeat(3,5):
print(i)
Combinatoric iterators
You can create permutations, combinations, etc. with these iterators.
There are other iterators that stop at the end of the shorter of the two lists passed as arguments. Some of them are described below. This is not an exhaustive list and you can see the complete list here.
Accumulate()
Automatically creates an iterator that accumulates the result of a given operator or function, and returns the result. You can choose an operator from Python’s operator library or write your own customized operator.
In most programming languages, passing function as arguments or a function returning another function might be confusing or hard to work with. Python includes the functools library that makes it easy to work with these functions. From Python’s official functools documentation:
The functools module is for higher-order functions: functions that act on or return other functions. In general, any callable object can be treated as a function
Here we explain a few nice features of this library. You can look at the complete list of functools functions here.
Using lru_cache
In imperative programming languages, recursion is very expensive. Every time a function is invoked, it is evaluated, even if it is called with the same set of arguments. In Python, the lru_cache is a decorator that can be used to cache the results of function evaluations. When the function is invoked again with the same set of arguments, the stored result is used, avoiding the extra overhead related to recursion.
Let’s look at the following example. We have the same implementation of the computation of the nth Fibonacci number with and without lru_cache. We can see that fib(30) has 31 function evaluations just as we expect because of lru_cache. The fib() function is invoked only for n=0,1,2…30 and the result is stored in memory and used later. This is significantly less as compared to fib_slow(30), with 2692537 evaluations.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
import functools
@functools.lru_cache
def fib(n):
globalcount
count=count+1
returnfib(n-2)+fib(n-1)ifn>1else1
def fib_slow(n):
globalslow_count
slow_count=slow_count+1
returnfib_slow(n-2)+fib_slow(n-1)ifn>1else1
count=0
slow_count=0
fib(30)
fib_slow(30)
print('With lru_cache total function evaluations: ',count)
print('Without lru_cache total function evaluations: ',slow_count)
With lru_cache total function evaluations: 31
Without lru_cache total function evaluations: 2692537
It is worth to note that the lru_cache decorator is particularly useful when you’re experimenting with machine learning problems in Jupyter notebooks. If you have a function that downloads data from the Internet, wrapping it with lru_cache can keep your download in memory and avoid downloading the same file again even if you invoked the download function multiple times.
Using reduce()
Reduce is similar to the itertools.accumulate(). It applies a function repeatedly to the elements of a list and returns the result. Here are a few examples with comments to explain the working of this functions.
# Evaluates ((1+2)+3)+4
list_sum=functools.reduce(operator.add,[1,2,3,4])
print(list_sum)
# Evaluates (2^3)^4
list_pow=functools.reduce(operator.pow,[2,3,4])
print(list_pow)
The reduce() function can accept any “operators” and optionally an initial value. For example, the collections.Counter function in the previous example can be implemented as follows
1
2
3
4
5
6
7
8
9
10
11
12
13
import functools
def addcount(counter,element):
ifelement notincounter:
counter[element]=1
else:
counter[element]+=1
returncounter
items=["a","b","a","c","d","c","b","a"]
counts=functools.reduce(addcount,items,{})
print(counts)
{'a': 3, 'b': 2, 'c': 2, 'd': 1}
Using partial()
There are situations when you have a function that takes multiple arguments and some of its arguments are repeated again and again. The function partial() returns a new version of the same function with a reduced number of arguments.
For example, if you have to compute the power of 2 repeatedly, you can create a new version of numpy’s power() function as shown below:
import numpy
power_2=functools.partial(np.power,2)
print('2^4 =',power_2(4))
print('2^6 =',power_2(6))
2^4 = 16
2^6 = 64
Map-Reduce Pattern
We have mentioned the filter, map, and reduce functions as high order functions in a previous section. The use of map-reduce design pattern is indeed a way to help us easily make a highly scalable program. The map-reduce pattern is an abstract representation of many types of computations that manipulate lists or collections of objects. The map stage takes the input collection and maps it to an intermediate representation. The reduce step takes this intermediate representation and computes a single output from it. This design pattern is very popular in functional programming languages. Python also provides constructs to implement this design pattern in an efficient manner.
Map-Reduce In Python
As an illustration of the map-reduce design pattern, let’s take a simple example. Suppose we want to count the numbers divisible by 3 in a list. We’ll use lambda to define an anonymous function and use it to map() all items of a list to 1 or 0 depending upon whether they pass our divisibility test or not. The function map() takes as argument a function and an iterable. Next we’ll use reduce() to accumulate the overall result.
The previous example, while being very simple, illustrates how easy it is to implement map-reduce design pattern in Python. You can solve complex and lengthy problems using the surprisingly simple and easy constructs in Python.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Books
Python Official Documentation
Summary
In this tutorial, you discovered features of Python that support functional programming
Specifically, you learned:
The iterables returning finite or infinite sequences in Python using itertools
The higher order functions supported by functools
The map-reduce design pattern’s implementation in Python
Do you have any questions about Python discussed in this post? Ask your questions in the comments below and I will do my best to answer.