A pivot table is a table that displays grouped data from a larger data set, running a function to get a summary for a set of variables in a column. A popular feature in Excel, Python makes it easy to create the same with your dataframes.
This Extra Time tutorial shows how to create pivot tables, with the example of a dataset of matches from a full season. Import your modules, inspect the data and let’s go!
import pandas as pd
import numpy as np
df = pd.read_csv("1617.csv")
len(df)
As you’ll see below, we have a record for each game of the season. The columns contain lots of data points, including goals, cards and fouls.
df.head()
We create a pivot table through pandas’ ‘.pivot_table’ function. It takes two arguments at its most basic:
- The dataframe to pivot – df in this case
- Index – the column that we will pivot around
Let’s pivot around HomeTeam and see what happens:
pd.pivot_table(df,index=["HomeTeam"])
This pivot table gives us the mean for each team in each of their home games – for every variable. Useful, but difficult to read. Let’s choose a couple of columns to look at more closely. We do this by passing a list of columns into the values argument:
pd.pivot_table(df,index=["HomeTeam"], values=["FTHG","FTAG"])
Much easier – we can now spot the teams scoring and conceding the most at home really really easily – great job!
By default, we get the average in our pivots – but we can pass an ‘aggfunc’ argument to change this. Pass one function to apply it to all columns, or a list of functions to return multiple calculations in your pivot table.
Here are some common functions that you might pass:
- np.sum – Add the figures
- len – Give a count of values
- np.std – Find the standard deviation
Let’s look at a couple of examples:
pd.pivot_table(df,index=["HomeTeam"], values=["FTHG","FTAG"], aggfunc=np.sum)
pd.pivot_table(df,index=["HomeTeam"], values=["FTHG","FTHG"], aggfunc=[np.sum,len])
There we go, an intro to pivots in 5 minutes! For further information, read the docs.
Give it a go, and if you’re looking to add more skills to your data analysis toolbox, check out our crash course!