matplotlib

Joyplots in Python with Joypy

Joyplots let us show many density plots in one chart, stacked by a category that we want to compare. They are quite fashionable at present and have allowed for some beautiful graphics. Python’s joypy library, building on matplotlib, lets us create our very own joyplots in just a few lines of code. In this article, we’ll walk through creating and customising these plots, using the top 50 transfer values of each year since 1991. Hopefully we’ll get a small insight into the trends of the biggest moves in the modern game.

Let’s get our modules in place and take a look at our dataset:

In [1]:
from __future__ import unicode_literals
import joypy
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import cm
%matplotlib inline

df = pd.read_csv("top50.csv")

df.head()
Out[1]:
Player Value Year
0 David Platt 7 1991
1 Robert Prosinecki 7 1991
2 Thomas Häßler 6 1991
3 Jürgen Kohler 6 1991
4 Thomas Doll 6 1991

Our dataset has 1350 observations, each containing one of the most expensive 50 transfers for the years 1991 through 2017.

When we create a joyplot, we should specify what category we want to differentiate. In this case, we will categorise by the ‘Year’ column, and we use the ‘by’ argument in ‘joypy.joyplot()’ to do this.

We should also tell it which column we want the density plots to draw. Here, it will be the ‘Value’ column, passed through the ‘column’ argument.

Let’s take a look at the default:

In [2]:
fig, axes = joypy.joyplot(df, by="Year", column="Value",figsize=(5,8))
plt.show()

It’s a start! Certainly not the best-looking plot, but we’ve got something.

As you can see, each plot-line is a year, with the graphic showing the values. The plot appears needlessly wide due to Neymar’s transfer forcing us to draw beyond 200m€. Is there anything in football oil money won’t affect?!

What can we learn from the chart? Obviously, players have gotten more expensive as time goes on – you certainly don’t need a chart to tell you that. But we can also see that the variation between values in the top 50 of each year has become much more spread out – we even see some years where the trend hasn’t been growth.
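If we want to put a number on that widening spread, one option is to summarise the fees within each year. The sketch below uses a tiny hand-made dataframe with the same ‘Year’ and ‘Value’ columns as ours. The figures in it are placeholders rather than real transfer fees, so swap in the real df to get meaningful results.

```python
import pandas as pd

# Placeholder data with the same columns as our dataset (values are illustrative, not real fees)
sample = pd.DataFrame({
    "Year": [1991, 1991, 1991, 2017, 2017, 2017],
    "Value": [7, 6, 6, 40, 60, 105],
})

# Summarise the spread of values within each year
spread = sample.groupby("Year")["Value"].agg(["min", "median", "max", "std"])
spread["range"] = spread["max"] - spread["min"]

print(spread)
```

A growing ‘range’ or ‘std’ from year to year is the numeric counterpart of the density plots getting wider in the chart.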

I don’t doubt that there are better ways to plot this, but hopefully we can make something fairly good-looking to justify it.

Let’s throw in some customisation:

In [9]:
fig, axes = joypy.joyplot(df[df.Player != 'Neymar'], by="Year", column="Value",figsize=(5,8),
             linewidth=0.05,overlap=3,colormap=cm.summer_r,x_range=[0,110])

plt.text(40, 0.8, "Top 50 transfer values (€m) \n 1991-2017",fontsize=12)

plt.show()

In this plot, we have used a subset of our dataset – excluding Neymar, to try and get a better handle of the rest of the shapes in the chart.

Let’s run through the other changes we’ve made:

  • We’ve set a very thin line-width, allowing us to see the odd outlying transfer fee – notice Ronaldo in 2009, Zidane in 2001 or Shearer in 1996.
  • The plots overlap and are much more condensed. This makes for a smaller plot, and the overlapping plots are a bit more interesting to look at.
  • A colourmap is applied, changing the colour for each year. As we have overlapped the plots, we need to set a colour difference to tell our years apart.
  • We’ve set custom limits to the x axis, stopping the density curves from spilling over into negative numbers.

There are still some changes that we would like to make. We should really add axis titles, annotate some interesting transfers and so on – but this is a great start and illustrates how we can make joyplots quickly and easily.

It would be remiss of any Joyplot tutorial not to plot the Unknown Pleasures cover that inspires the plot name. Borrowing code from the docs, and using the data above, let’s give it a go:

In [4]:
fig, axes = joypy.joyplot(df[df.Player != 'Neymar'],by="Year", column="Value", ylabels=False, xlabels=False, 
                          grid=False, fill=False, background='k', linecolor="w", linewidth=1, x_range=[-60,110],
                          legend=False, overlap=0.5, figsize=(6,5),kind="counts", bins=80)

Interestingly, we can see outliers much more easily with this style! I quite like the aesthetics, but as a data visualisation it is maybe not as eye-catching as the previous plot.

Summary

This article has taken you through the steps of creating and editing your first joyplots. These overlapping density plots can make really beautiful charts that show a lot of information in a novel way. These charts use time to differentiate each row on the y axis, but you may also want to find a way to plot time along the x axis to show changes over time, rather than changes in density.

Next up, take a look at plotting heatmaps for correlation, or our other visualisation articles.

Posted by FCPythonADMIN in Visualisation

Making Better Python Visualisations

FC Python recently received a tweet from @fitbawnumbers applying our lollipop chart code to Pep’s win percentage. It was great to see this application of the chart, and especially interesting because Philip then followed up with another chart showing the same data from Excel. To be blunt, the Excel chart was much cleaner/better than our lollipop charts – Philip had done a great job with it.

This has inspired us to put together a post exploring some of matplotlib’s customisation options and principles that underpin them. Hopefully this will give us a better looking and more engaging chart!

As a reminder, this is the chart that we are looking to improve, and you can find the tutorial for lollipop charts here.

Step One – Remove everything that adds nothing

There is clearly lots that we can improve on here. Let’s start with the basics – if you can remove something without damaging your message, remove it. We have lots of ugly lines here, let’s remove the box needlessly around our data, along with those ticks. Likewise the axes labels, we know that the y axis shows teams – so let’s bin that too. We’ll do this with the following code:

In [ ]:
#For every side of the box, set to invisible

for side in ['right','left','top','bottom']:
    ax.spines[side].set_visible(False)
    
#Remove the tick marks on the x and y axes (the labels stay in place)
#tick_params is used here because the older tick1On/tick2On attributes are deprecated in matplotlib

ax.tick_params(axis='both', which='both', length=0)

Step Two – Where appropriate, change the defaults

Philip’s Excel chart looked great because it didn’t look like an Excel chart. He had changed all of the defaults: the colours, the font, the label location. As a result, it doesn’t look like the charts that have bored us to death in presentations for decades. So let’s change our title locations and fonts to make it look like we’ve put some effort in beyond the defaults. Code below:

In [ ]:
#Change font
plt.rcParams["font.family"] = "DejaVu Sans"

#Instead of using plt.title, we'll use plt.text to fully customise it
#First two arguments are x/y location
plt.text(55, 19,"Premier League 16/17", size=18, fontweight="normal")

Step Three – Add labels if they are clean and give detail

While the lollipop chart makes it easy to understand the differences between teams, our original chart requires users to look all the way down to the axis if they want the value. Even then, the audience has to make a rough estimation. Why not add values to make everything a bit cleaner?

We can easily iterate through our values in the dataframe and plot them alongside the charts. The code below uses ‘enumerate()’ to count through each of the values in the points column of our table. For each value, it writes text at location v,i (nudged a bit with the sums below). Take a look at the for loop:

In [ ]:
for i, v in enumerate(table['Pts']):
    ax.text(v+2, i+0.8, str(v), color=teamColours[i], size = 13)
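
If the loop is hard to picture, it can help to print what ‘enumerate()’ hands us on each pass. This small sketch uses a made-up list of points in place of table['Pts'] and applies the same nudges as the loop above, without drawing anything.

```python
# Hypothetical points totals standing in for table['Pts']
points = [90, 70, 50]

# enumerate() pairs each value with its position in the list, starting from 0
label_positions = []
for i, v in enumerate(points):
    # Same nudges as above: 2 to the right of the bar end,
    # and i + 0.8 so the text sits just under the bar drawn at y = i + 1
    label_positions.append((v + 2, i + 0.8, str(v)))

for x, y, text in label_positions:
    print(x, y, text)
```

Each tuple is exactly what gets passed to ‘ax.text()’ as the x location, y location and label.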

Step Four – Improve aesthetics with strong colour against off-white background

Our lollipop sticks are very, very thin. We can improve the look of these by giving them a decent thickness and a block of bold colour. Underneath this colour, we should add an off-white colour. This differentiates the plot from the rest of the page, and makes it look a lot more professional. Next time you see a great plot, take note of the base colour and try to understand the effect that this has on the plot and article as a whole!

Our code for doing these two things is below:

In [ ]:
#Set a linewidth in our hlines argument
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours,linewidths=10)

#Set a background colour to the data area background and the plot as a whole
ax.set_facecolor('#f7f4f4')
fig.patch.set_facecolor('#f7f4f4')

Fitting it all together

Putting all of these steps together, we get something like the following. Follow along with the comments and see what fits in where:

In [1]:
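#Import the libraries that this block relies on
import numpy as np
import matplotlib.pyplot as plt

#'table' is assumed to be the league table dataframe from the lollipop tutorial, with 'Team' and 'Pts' columns

#Set our plot and desired size
fig = plt.figure(figsize=(10,7))
ax = plt.subplot()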

#Change our font
plt.rcParams["font.family"] = "DejaVu Sans"

#Each value is the hex code for the team's colours, in order of our chart
teamColours = ['#034694','#001C58','#5CBFEB','#D00027',
              '#EF0107','#DA020E','#274488','#ED1A3B',
               '#000000','#091453','#60223B','#0053A0',
               '#E03A3E','#1B458F','#000000','#53162f',
               '#FBEE23','#EF6610','#C92520','#BA1F1A']

#Plot our thicker lines and team names
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours,linewidths=10)
plt.yticks(np.arange(1,21), table['Team'])

#Label our axes as needed and title the plot
plt.xlabel("Points")
plt.text(55, 19,"Premier League 16/17", size=18, fontweight="normal")

#Add the background colour
ax.set_facecolor('#f7f4f4')
fig.patch.set_facecolor('#f7f4f4')

for side in ['right','left','top','bottom']:
    ax.spines[side].set_visible(False)

#Hide the tick marks on both axes (tick1On/tick2On are deprecated in matplotlib)
ax.tick_params(axis='both', which='both', length=0)
    
for i, v in enumerate(table['Pts']):
    ax.text(v+2, i+0.8, str(v), color=teamColours[i], size = 13)

plt.show()

Without doubt, this is a much better looking chart than the lollipop. Not only does it look better, but it gives us more information and communicates better than our former effort. Thank you Philip for the inspiration!
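
The full block relies on a ‘table’ dataframe that was built in the lollipop tutorial. If you don’t have it to hand, the sketch below runs the same four steps on a three-row stand-in. The team names and points are placeholders, and the three colours are simply borrowed from the list above, so treat it as a template rather than the finished chart.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the league table (placeholder teams and points)
table = pd.DataFrame({"Team": ["Team A", "Team B", "Team C"],
                      "Pts": [80, 65, 50]})
teamColours = ["#034694", "#D00027", "#274488"]

fig, ax = plt.subplots(figsize=(6, 3))
positions = np.arange(1, len(table) + 1)

# Step four: thick, boldly coloured bars on an off-white background
ax.hlines(y=positions, xmin=0, xmax=table["Pts"], colors=teamColours, linewidths=10)
ax.set_yticks(positions)
ax.set_yticklabels(table["Team"])
ax.set_facecolor("#f7f4f4")
fig.patch.set_facecolor("#f7f4f4")

# Step one: remove the box and the tick marks
for side in ["right", "left", "top", "bottom"]:
    ax.spines[side].set_visible(False)
ax.tick_params(axis="both", which="both", length=0)

# Step three: write each value next to the end of its bar
for i, v in enumerate(table["Pts"]):
    ax.text(v + 2, i + 0.8, str(v), color=teamColours[i], size=13)
```

Call ‘plt.show()’ in a notebook, or ‘fig.savefig()’ in a script, to see the result.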

Summary

This article has looked at a few ways to tidy our charts. The rules that we introduced throughout should be applied to any visualisation that you’re looking to communicate with. Ensure that your charts are as clean as possible, are clearly labelled and move away from the defaults. Follow these, and you’ll be well on your way to creating great plots!

Why not apply these rules to some of the other basic examples in our visualisation series and let us know how you improve on our articles!

Posted by FCPythonADMIN in Blog, Visualisation

Python Treemaps with Squarify & Matplotlib

Treemaps are visualisations that split the area of our chart to display the value of our datapoints. At their simplest, they display shapes in sizes appropriate to their value, so bigger rectangles represent higher values. Python allows us to create these charts quite easily, as it will calculate the size of each rectangle for us and plot it in a way that fits. In addition to this, we can combine our treemap with the matplotlib library’s ability to scale colours against variables to make good looking and easy to understand plots with Python.

Let’s fire up our libraries (make sure you install squarify!) and take a look at our data:

In [1]:
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import squarify
In [2]:
data = pd.read_csv("Data/ManCityATT.csv")
data.head()
Out[2]:
Player Pos GP GS MP G A SOG S YC RC
0 Agüero, Sergio F 19 17 1518 16 5 32 77 1 0
1 Sterling, Raheem M 22 18 1668 14 4 23 55 2 1
2 Gabriel Jesus F 18 12 1016 8 2 22 35 3 0
3 Sané, Leroy M 22 17 1548 7 10 14 36 4 0
4 De Bruyne, Kevin M 24 24 2060 6 10 28 61 1 0

Our dataframe has a record for each player in Manchester City’s squad, with their games/minutes played alongside goals, assists, shots and card data.

Facing Manchester City next week, we would like to visualise where their threat comes from and who they rely on for goals and assists. We’ll do this with our treemap!

We will create our treemap in a few key steps:

  1. Create a new dataframe that contains only players that have scored.
  2. Utilise matplotlib to create a colour map that assigns each player a colour according to how many goals they have scored.
  3. Set up a new, rectangular plot for our treemap
  4. Plot our data & title
  5. Show the plot, with no axes

The commented code below will show you exactly how you can do this:

In [3]:
# New dataframe, containing only players with more than 0 goals.
dataGoals = data[data["G"]>0]

#Utilise matplotlib to scale our goal numbers between the min and max, then assign this scale to our values.
norm = matplotlib.colors.Normalize(vmin=min(dataGoals.G), vmax=max(dataGoals.G))
colors = [matplotlib.cm.Blues(norm(value)) for value in dataGoals.G]

#Create our plot and resize it.
fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(16, 4.5)

#Use squarify to plot our data, label it and add colours. We add an alpha layer to ensure black labels show through
squarify.plot(label=dataGoals.Player,sizes=dataGoals.G, color = colors, alpha=.6)
plt.title("Man City Goals",fontsize=23,fontweight="bold")

#Remove our axes and display the plot
plt.axis('off')
plt.show()
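
To see what the normalisation step is doing, the sketch below runs it on the goal tallies of the five players shown in data.head() above. Normalize rescales each tally to a number between 0 and 1, and the colour map then turns that number into an RGBA colour. In the real code the minimum and maximum come from the whole filtered dataframe, so the scaled values will differ slightly.

```python
import matplotlib.cm
import matplotlib.colors

# Goal tallies for the five players shown in data.head()
goals = [16, 14, 8, 7, 6]

# Rescale so the lowest tally maps to 0 and the highest to 1
norm = matplotlib.colors.Normalize(vmin=min(goals), vmax=max(goals))
scaled = [float(norm(g)) for g in goals]

# Each scaled value picks a shade of blue, returned as an (R, G, B, A) tuple
colors = [matplotlib.cm.Blues(s) for s in scaled]

for g, s in zip(goals, scaled):
    print(g, round(s, 2))
```

The higher the scaled value, the darker the blue, which is why the top scorer gets the deepest rectangle in the treemap.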

Let’s take another look, this time creating a treemap for assists. See if you can understand what we are doing without the comments above!

In [4]:
dataAssists = data[data["A"]>0]

norm = matplotlib.colors.Normalize(vmin=min(dataAssists.A), vmax=max(dataAssists.A))
colors = [matplotlib.cm.Blues(norm(value)) for value in dataAssists.A]

fig = plt.gcf()
fig.set_size_inches(16, 4.5)

squarify.plot(label=dataAssists.Player,sizes=dataAssists.A, color = colors, alpha=.6)
plt.title("Man City Assists",fontsize=23,fontweight="bold")

plt.axis('off')
plt.show()

Summary

Awesome, we now have a couple of simple charts that show the dangerous players in City’s lineups, and the code to reproduce these for other teams. This should make for a quick, easy and impactful addition for your pre-match reports!

Next up, why not learn more about Python visualisations, like violin plots or lollipop charts?

Posted by FCPythonADMIN in Visualisation

Creating Personal Football Heatmaps in Python

Tracking technology has been a part of football analysis for the past 20 years, giving access to data on physical performance and to heatmap visualisations that show how much of the pitch a player covers. As this technology becomes cheaper and more accessible, it has become easy for anyone to get this data from their Sunday morning games. This article runs through how you can create your own heatmaps for a game, with nothing more than a GPS tracking device (running watch, phone, GPS unit) and Python.

To get your hands on your own data, you can extract your gpx file through Strava. While Strava is great for runs, it isn’t built for football or running in tight spaces. So let’s build our own!

Let’s import our necessary modules and data, then get started!

In [1]:
#GPXPY makes using .gpx files really easy
import gpxpy

#Visualisation libraries
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

#Opens our .gpx file, then parses it into a format that is easy for us to run through
#Using 'with' makes sure that the file is closed again once it has been parsed
with open('5aside.gpx', 'r') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

The .gpx file type, put simply, is a markup file that records the time and your location on each line. With location and time, we can calculate distance between locations and, subsequently, speed. We can also visualise this data, as we’ll show here.

Let’s take a look at what one of these lines looks like:

In [2]:
gpx.tracks[0].segments[0].points[0]
Out[2]:
GPXTrackPoint(51.5505, -0.3048, elevation=44, time=datetime.datetime(2018, 1, 19, 12, 14, 26))

The first two values are our latitude and longitude, followed by elevation and time. This gives us a lot of freedom to calculate variables and plot our data, and is the foundation of a lot of the advanced metrics that you will find on Strava.
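
As an example of the kind of variable we can calculate, here is a minimal sketch of distance and speed between two points. It uses the haversine formula with an approximate Earth radius. The function names are our own, and the second point and the 5-second gap are made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (an approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def speed_m_per_s(distance_m, seconds):
    """Average speed over an interval; returns 0 when no time has passed."""
    return distance_m / seconds if seconds > 0 else 0.0

# First point is the one printed above; the second is a made-up point slightly further north
d = haversine_m(51.5505, -0.3048, 51.5506, -0.3048)
print(round(d, 1), "metres")
print(round(speed_m_per_s(d, 5), 2), "m/s over a hypothetical 5 seconds")
```

Running this over each consecutive pair of points in a track, using the difference between their ‘time’ values, would give total distance and a speed profile for the game.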

In our example, we want to plot our latitude and longitude, so let’s use a for loop to add these to a list:

In [3]:
lat = []
lon = []

for track in gpx.tracks:
    for segment in track.segments:
        for point in segment.points:
            lat.append(point.latitude)
            lon.append(point.longitude)

Our location is now extracted into a handy x and y format. Let’s plot it. We’ve borrowed Andy Kee’s Strava plotting aesthetic here; take a read of his article for more information on plotting your cycle/run data!

In [4]:
fig = plt.figure(facecolor = '0.1')
ax = plt.Axes(fig, [0., 0., 1., 1.], )
ax.set_aspect('equal')
ax.set_axis_off()
fig.add_axes(ax)
plt.plot(lon, lat, color = 'deepskyblue', lw = 0.3, alpha = 0.9)
plt.show()

The lines are great, and make for a beautiful plot, but let’s try and create a Prozone-esque heatmap on our pitch.

To do this, we can plot on the actual pitch that we played on, using the gmplot module. GM stands for Google Maps, and will import its functionality for our plot. Let’s take a look at how this works:

In [5]:
#Import the module first
import gmplot

#Start an instance of our map, with three arguments: lat/lon centre point of map - in this case,
#We'll use the first location in our data. The last argument is the default zoom level of the map
gmap = gmplot.GoogleMapPlotter(lat[0], lon[0], 20)

#Create our heatmap using our lat/lon lists for x and y coordinates
gmap.heatmap(lat, lon)

#Draw our map and save it to the html file named in the argument
gmap.draw("Player1.html")

This code will spit out an HTML file that we can then open to get our heatmap plotted on a Google Maps background. Something like the below:

Figure: Football heatmap created in Python
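
If you would rather stay inside matplotlib, a rough alternative is to count how many recorded positions fall into each cell of a grid and colour the cells by that count. This is a different technique to gmplot’s heatmap, with no map background. The sketch below uses randomly generated positions around the first point in our file in place of the real lat and lon lists, and the grid size is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up positions scattered around the first point in our file (not real tracking data)
rng = np.random.default_rng(0)
lat = 51.5505 + rng.normal(0, 0.0001, 500)
lon = -0.3048 + rng.normal(0, 0.0002, 500)

# Count the positions in each cell of a grid: 30 cells across (lon) by 20 cells up (lat)
counts, lon_edges, lat_edges = np.histogram2d(lon, lat, bins=(30, 20))

fig, ax = plt.subplots(figsize=(6, 4))
# histogram2d puts its first argument on axis 0, so transpose to get longitude along x
ax.imshow(counts.T, origin="lower", cmap="hot",
          extent=[lon_edges[0], lon_edges[-1], lat_edges[0], lat_edges[-1]],
          aspect="auto")
ax.set_axis_off()
```

Replace the generated arrays with the lat and lon lists from earlier, then call ‘plt.show()’, to see where you spent most of your game.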

Summary

Similar visualisations of professional football matches set clubs and leagues back a pretty penny, and you can do this with entirely free software and increasingly affordable kit. While this won’t improve FC Python’s exceedingly poor on-pitch performances, we definitely think it is pretty cool!

Simply export your gpx data from Strava and extract the lat/long data, before plotting it as a line or as a heatmap on a map background for some really engaging visualisation.

Next up, learn about plotting this on a pitchmap, rather than satellite imagery.

Posted by FCPythonADMIN in Blog

Drawing a Pass Map in Python

Pass maps are an established visualisation in football analysis, used to show the area of the pitch where a player made their passes. You’ll find examples across the Football Manager series, TV coverage, and pretty much all formats of football journalism. Similar plots are used to show shots or other events in a game, and multiple other sports make use of similar maps of what goes on during a game. This article runs through one way to create these in Python, making use of the Matplotlib library. Let’s fire up our modules, open our dataset and take a look at what we are working with:

In [20]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Arc

%matplotlib inline

data = pd.read_csv("EventData/passes.csv")

data.head()
Out[20]:
Half Time Event Player Team Xstart Ystart Xend Yend
0 First Half 67.0 Pass Zeedayne France 12 3 118 65
1 First Half 70.2 Pass Zeedayne France 82 30 72 26
2 First Half 78.5 Pass Zeedayne France 1 3 69 73
3 First Half 106.5 Pass Zeedayne France 41 46 117 60
4 First Half 115.7 Pass Zeedayne France 34 24 4 20

Plotting Lines

Our dataset contains Zeedayne’s passes from her match. We have the time at which each pass happened, in addition to the starting and ending X and Y locations. With this information, matplotlib makes it easy to draw lines. We can use the ‘.plot()’ function to draw lines if we give it two lists:

  • List one must contain the start and end X locations
  • List two gives the start and end Y locations

For example, plt.plot([0,1],[2,3]) will plot a line from location (0,2) to (1,3).

We could write this line to plot each of Zeedayne’s passes, but we hate repeating ourselves and are a little bit lazy, so let’s use a for loop to do this. Take a look at our code below to see it in action:

In [25]:
fig, ax = plt.subplots()
fig.set_size_inches(7, 5)

for i in range(len(data)):
    plt.plot([int(data["Xstart"][i]),int(data["Xend"][i])],
             [int(data["Ystart"][i]),int(data["Yend"][i])], 
             color="blue")
    
plt.show()

Great job on plotting all of the passes! Unfortunately, we do not know where they happened on the pitch, or the direction, or much else, but we will get there!

Let’s start with adding a circle at the starting point of each pass to understand the direction. This is as easy as before, we just plot the start data, like below:

In [29]:
fig, ax = plt.subplots()
fig.set_size_inches(7, 5)

for i in range(len(data)):
    plt.plot([int(data["Xstart"][i]),int(data["Xend"][i])],
             [int(data["Ystart"][i]),int(data["Yend"][i])], 
             color="blue")
    
    plt.plot(int(data["Xstart"][i]),int(data["Ystart"][i]),"o", color="green")
    
plt.show()

Another massive and easy improvement would be to add a pitch map – as our article here explains. Let’s steal the code and add the pitch here – obviously feel free to steal the pitch too!

In [27]:
#Create figure
fig=plt.figure()
fig.set_size_inches(7, 5)
ax=fig.add_subplot(1,1,1)

#Pitch Outline & Centre Line
plt.plot([0,0],[0,90], color="black")
plt.plot([0,130],[90,90], color="black")
plt.plot([130,130],[90,0], color="black")
plt.plot([130,0],[0,0], color="black")
plt.plot([65,65],[0,90], color="black")

#Left Penalty Area
plt.plot([16.5,16.5],[65,25],color="black")
plt.plot([0,16.5],[65,65],color="black")
plt.plot([16.5,0],[25,25],color="black")

#Right Penalty Area
plt.plot([130,113.5],[65,65],color="black")
plt.plot([113.5,113.5],[65,25],color="black")
plt.plot([113.5,130],[25,25],color="black")

#Left 6-yard Box
plt.plot([0,5.5],[54,54],color="black")
plt.plot([5.5,5.5],[54,36],color="black")
plt.plot([5.5,0],[36,36],color="black")

#Right 6-yard Box
plt.plot([130,124.5],[54,54],color="black")
plt.plot([124.5,124.5],[54,36],color="black")
plt.plot([124.5,130],[36,36],color="black")

#Prepare Circles
centreCircle = plt.Circle((65,45),9.15,color="black",fill=False)
centreSpot = plt.Circle((65,45),0.8,color="black")
leftPenSpot = plt.Circle((11,45),0.8,color="black")
rightPenSpot = plt.Circle((119,45),0.8,color="black")

#Draw Circles
ax.add_patch(centreCircle)
ax.add_patch(centreSpot)
ax.add_patch(leftPenSpot)
ax.add_patch(rightPenSpot)

#Prepare Arcs
leftArc = Arc((11,45),height=18.3,width=18.3,angle=0,theta1=310,theta2=50,color="black")
rightArc = Arc((119,45),height=18.3,width=18.3,angle=0,theta1=130,theta2=230,color="black")

#Draw Arcs
ax.add_patch(leftArc)
ax.add_patch(rightArc)

#Tidy Axes
plt.axis('off')

for i in range(len(data)):
    plt.plot([int(data["Xstart"][i]),int(data["Xend"][i])],[int(data["Ystart"][i]),int(data["Yend"][i])], color="blue")
    plt.plot(int(data["Xstart"][i]),int(data["Ystart"][i]),"o", color="green")

#Display Pitch
plt.show()

Awesome, now we can see Zeedayne’s pass locations – seems to cover just about everywhere!

Summary

Plotting simple pass maps is pretty easy – we just need to use matplotlib’s ‘.plot’ functionality to draw our lines, and a for loop to run through X/Y origin and destination data to plot each line.

On their own, they do not offer much information, but once we add start location and a pitch map, we start to see where a player played their passes, where they ended up and the range that they employed in the match.

To develop on this, we can look to colour code our lines for success, or another variable. We could even look to plot a heatmap to show where a player was active. Watch out for a further article on these!
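
As a taste of colour coding, here is a minimal sketch using the five passes shown in data.head() above. Our dataset has no success column, so we colour by pass length instead. The ‘Length’ and ‘Colour’ columns and the 30-unit threshold are our own choices for illustration.

```python
import numpy as np
import pandas as pd

# The five passes shown in data.head() above
passes = pd.DataFrame({
    "Xstart": [12, 82, 1, 41, 34], "Ystart": [3, 30, 3, 46, 24],
    "Xend": [118, 72, 69, 117, 4], "Yend": [65, 26, 73, 60, 20],
})

# Straight-line length of each pass, in the same units as the pitch coordinates
passes["Length"] = np.hypot(passes["Xend"] - passes["Xstart"],
                            passes["Yend"] - passes["Ystart"])

# Hypothetical rule: passes longer than 30 units are drawn in red, the rest in blue
passes["Colour"] = np.where(passes["Length"] > 30, "red", "blue")

print(passes[["Length", "Colour"]])
```

Inside the plotting loop, swapping color="blue" for color=passes["Colour"][i] would then draw each pass in its own colour.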

Posted by FCPythonADMIN in Visualisation

Drawing a Pitchmap – Adding Lines & Circles in Matplotlib

There are lots of reasons why we might want to draw a line or circle on our charts. We could look to add an average line, highlight a key data point or even draw a picture. This article will show how to add lines, circles and arcs with the example of a football pitch map that could then be used to show heatmaps, passes or anything else that happens during a match.

This example works with FIFA’s official pitch sizes, but you might want to change them according to your data/sport/needs. Let’s import matplotlib as normal, in addition to its Arc functionality.

In [1]:
import matplotlib.pyplot as plt
from matplotlib.patches import Arc

Drawing Lines

It is easiest for us to start with our lines around the outside of the pitch. Once we create our plot with the first two lines of our code, drawing a line is pretty easy with ‘.plot’. You have probably already seen ‘.plot’ used to display scatter points, but to draw a line, we just need to provide two lists as arguments and matplotlib will do the thinking for us:

  • List one: starting and ending X locations
  • List two: starting and ending Y locations

Take a look at the code and plot below to understand our outlines. Use the colour guides to see how they are plotted with start and end point lists.

In [2]:
fig=plt.figure()
ax=fig.add_subplot(1,1,1)

plt.plot([0,0],[0,90], color="blue")
plt.plot([0,130],[90,90], color="orange")
plt.plot([130,130],[90,0], color="green")
plt.plot([130,0],[0,0], color="red")
plt.plot([65,65],[0,90], color="pink")

plt.show()

Great job! Matplotlib makes drawing lines very easy, it just takes some clear thinking with start and end locations to get them plotted.
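
The same two-list idea covers the average line mentioned at the start of the article. In this sketch the values are made up, and the line simply runs from the first point to the last at the height of the average.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up values to plot (purely illustrative)
values = [3, 5, 4, 8, 6]
average = sum(values) / len(values)

fig, ax = plt.subplots()
ax.plot(range(len(values)), values, "o", color="blue")

# List one: X runs from the first point to the last
# List two: Y stays at the average for both ends of the line
avg_line, = ax.plot([0, len(values) - 1], [average, average], color="red")
```

Because both Y values are the same, the line is flat, and the points above and below the average are easy to pick out.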

Drawing Circles

Next up, we’re going to draw some circles on the pitch. Primarily, we need a centre circle, but we also need markers for the centre and penalty spots.

Adding circles is slightly different to lines. Firstly, we need to assign our circles to a variable. We use ‘plt.Circle’ to do this, passing it two essential arguments:

  • X/Y coordinates of the middle of the circle
  • Radius of the circle

For our circles, we’ll also assign colour and fill, but these are optional.

With these circles assigned, we then use ‘ax.add_patch’ to draw each circle onto our plot.

Take a look at our code below:

In [3]:
#Create figure
fig=plt.figure()
ax=fig.add_subplot(1,1,1)

#Pitch Outline & Centre Line
plt.plot([0,0],[0,90], color="black")
plt.plot([0,130],[90,90], color="black")
plt.plot([130,130],[90,0], color="black")
plt.plot([130,0],[0,0], color="black")
plt.plot([65,65],[0,90], color="black")

#Assign circles to variables - do not fill the centre circle!
centreCircle = plt.Circle((65,45),9.15,color="red",fill=False)
centreSpot = plt.Circle((65,45),0.8,color="blue")

#Draw the circles to our plot
ax.add_patch(centreCircle)
ax.add_patch(centreSpot)


plt.show()

Drawing Arcs

Now that you can create circles, arcs will be just as easy – we’ll need them for the lines outside the penalty area. While they take a few more arguments, they follow the same pattern as before. Let’s go through the arguments:

  • X/Y coordinates of the centrepoint of the arc, assuming the arc was a complete shape.
  • Width – we must pass width and height as the arc might not be a circle, it might instead be from an oval shape
  • Height – as above
  • Angle – degree rotation of the shape (anti-clockwise)
  • Theta1 – start location of the arc, in degrees
  • Theta2 – end location of the arc, in degrees

That’s a few more arguments than for the circle and lines, but don’t let that make you think that this is too much more complicated. Our code will look like this for one arc:

leftArc = Arc((11,45),height=18.3,width=18.3,angle=0,theta1=310,theta2=50)

All that we need to do after this is draw the arc to our plot, just like with the circles:

ax.add_patch(leftArc)

You can see this in action below:

In [4]:
#Demo Arcs
 
#Create figure
fig=plt.figure()
ax=fig.add_subplot(1,1,1)

#Pitch Outline & Centre Line
plt.plot([0,0],[0,90], color="black")
plt.plot([0,130],[90,90], color="black")
plt.plot([130,130],[90,0], color="black")
plt.plot([130,0],[0,0], color="black")
plt.plot([65,65],[0,90], color="black")

#Left Penalty Area
plt.plot([16.5,16.5],[65,25],color="black")
plt.plot([0,16.5],[65,65],color="black")
plt.plot([16.5,0],[25,25],color="black")

#Centre Circle/Spot
centreCircle = plt.Circle((65,45),9.15,fill=False)
centreSpot = plt.Circle((65,45),0.8)
ax.add_patch(centreCircle)
ax.add_patch(centreSpot)

#Create Arc and add it to our plot
leftArc = Arc((11,45),height=18.3,width=18.3,angle=0,theta1=310,theta2=50,color="red")

ax.add_patch(leftArc)

plt.show()
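
If you are wondering where the theta values come from, a little trigonometry gets us there. The arc is part of a circle of radius 9.15 around the penalty spot, and we only want the part that pokes out beyond the edge of the penalty area. The sketch below works out the angle at which that circle crosses the edge of the box, using the same measurements as our pitch.

```python
from math import acos, degrees

spot_x = 11        # distance of the penalty spot from the goal line
box_edge_x = 16.5  # depth of the penalty area
radius = 9.15      # every point on the arc is this far from the spot

# The circle crosses the edge of the box where the horizontal offset from the
# spot is (box_edge_x - spot_x); the cosine of the angle is offset / radius
half_angle = degrees(acos((box_edge_x - spot_x) / radius))

# Arc() takes the full width and height of the shape, i.e. twice the radius
width = height = 2 * radius

print(round(half_angle, 1))
```

The result is a little over 50 degrees either side of the horizontal, so the 310 and 50 used above are a close, rounded version of it. The right-hand arc is the mirror image, which is where 130 and 230 come from.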

Bringing everything together

The code below wraps the lines, circles and arcs above in a function for quick and easy use. The only new line removes our axes:

plt.axis('off')

Take a look through our function below and follow what we are doing. Feel free to take this and use it as the base for your own plots!

In [5]:
def createPitch():
    
    #Create figure
    fig=plt.figure()
    ax=fig.add_subplot(1,1,1)

    #Pitch Outline & Centre Line
    plt.plot([0,0],[0,90], color="black")
    plt.plot([0,130],[90,90], color="black")
    plt.plot([130,130],[90,0], color="black")
    plt.plot([130,0],[0,0], color="black")
    plt.plot([65,65],[0,90], color="black")
    
    #Left Penalty Area
    plt.plot([16.5,16.5],[65,25],color="black")
    plt.plot([0,16.5],[65,65],color="black")
    plt.plot([16.5,0],[25,25],color="black")
    
    #Right Penalty Area
    plt.plot([130,113.5],[65,65],color="black")
    plt.plot([113.5,113.5],[65,25],color="black")
    plt.plot([113.5,130],[25,25],color="black")
    
    #Left 6-yard Box
    plt.plot([0,5.5],[54,54],color="black")
    plt.plot([5.5,5.5],[54,36],color="black")
    plt.plot([5.5,0],[36,36],color="black")
    
    #Right 6-yard Box
    plt.plot([130,124.5],[54,54],color="black")
    plt.plot([124.5,124.5],[54,36],color="black")
    plt.plot([124.5,130],[36,36],color="black")
    
    #Prepare Circles
    centreCircle = plt.Circle((65,45),9.15,color="black",fill=False)
    centreSpot = plt.Circle((65,45),0.8,color="black")
    leftPenSpot = plt.Circle((11,45),0.8,color="black")
    rightPenSpot = plt.Circle((119,45),0.8,color="black")
    
    #Draw Circles
    ax.add_patch(centreCircle)
    ax.add_patch(centreSpot)
    ax.add_patch(leftPenSpot)
    ax.add_patch(rightPenSpot)
    
    #Prepare Arcs
    leftArc = Arc((11,45),height=18.3,width=18.3,angle=0,theta1=310,theta2=50,color="black")
    rightArc = Arc((119,45),height=18.3,width=18.3,angle=0,theta1=130,theta2=230,color="black")

    #Draw Arcs
    ax.add_patch(leftArc)
    ax.add_patch(rightArc)
    
    #Tidy Axes
    plt.axis('off')
    
    #Display Pitch
    plt.show()
    
createPitch()
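
One thing to be aware of is that createPitch() calls plt.show() itself, so nothing else can be drawn on the pitch afterwards. A possible variation, sketched below with a cut-down pitch, is to return the figure and axes instead. The function name and its length/width parameters are our own invention for this example.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

def draw_pitch_outline(length=130, width=90):
    """Draw a cut-down pitch and return (fig, ax) so more layers can be added."""
    fig, ax = plt.subplots()

    # Each segment is ([x_start, x_end], [y_start, y_end]), as in the sections above
    segments = [
        ([0, 0], [0, width]),                    # left goal line
        ([0, length], [width, width]),           # top touchline
        ([length, length], [width, 0]),          # right goal line
        ([length, 0], [0, 0]),                   # bottom touchline
        ([length / 2, length / 2], [0, width]),  # halfway line
    ]
    for xs, ys in segments:
        ax.plot(xs, ys, color="black")

    # Centre circle and centre spot
    ax.add_patch(plt.Circle((length / 2, width / 2), 9.15, color="black", fill=False))
    ax.add_patch(plt.Circle((length / 2, width / 2), 0.8, color="black"))

    ax.axis("off")
    return fig, ax

# Because nothing is shown inside the function, we can keep drawing on the same axes
fig, ax = draw_pitch_outline()
ax.plot([12, 118], [3, 65], color="blue")  # an example pass drawn on top
```

Call ‘plt.show()’ once everything has been added. The penalty areas, six-yard boxes and arcs from createPitch() can be added to the function in the same way.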

Summary

In our article, we’ve seen how to draw lines, arcs and circles in Matplotlib. You’ll find this useful when trying to add the finishing touches with annotations to any plot. These tools are equally important when drawing a map on which we will plot our data – like our pitchmap example here.

Take a look at our other visualisation articles here and be sure to get in touch with us on Twitter!

Posted by FCPythonADMIN in Visualisation

Radar Charts in Matplotlib

In football analysis and video games, radar charts have been popularised in a number of places, from the FIFA series, to Ted Knutson’s innovative ways of displaying player data.

Radar charts are an engaging way to show data and typically attract more attention than a bar chart, although you can often use either to show the same data.

This article runs through the creation of basic radar charts in Matplotlib, plotting the FIFA Ultimate Team data of a couple of players, before creating a function to streamline the process. To start, let’s get our libraries and data pulled together.

In [1]:
import pandas as pd
from math import pi
import matplotlib.pyplot as plt
%matplotlib inline

#Create a data frame from Messi and Ronaldo's 6 Ultimate Team data points from FIFA 18
Messi = {'Pace':89,'Shooting':90,'Passing':86,'Dribbling':95,'Defending':26,'Physical':61}
Ronaldo = {'Pace':90,'Shooting':93,'Passing':82,'Dribbling':90,'Defending':33,'Physical':80}

data = pd.DataFrame([Messi,Ronaldo], index = ["Messi","Ronaldo"])
data
Out[1]:
Defending Dribbling Pace Passing Physical Shooting
Messi 26 95 89 86 61 90
Ronaldo 33 90 90 82 80 93

Plotting data in a radar has lots of similarities to plotting along a straight line (like a bar chart). We still need to provide data on where our line goes, we need to label our axes and so on. However, as it is a circle, we will also need to provide the angle at which the lines run. This is much easier than it sounds with Python.

Firstly, let’s do the easy bits and take a list of Attributes for our labels, along with a basic count of how many there are.

In [2]:
Attributes =list(data)
AttNo = len(Attributes)

We then take a list of the values that we want to plot, then copy the first value to the end. When we plot the data, this will be the line that the radar follows – take a look below:

In [3]:
values = data.iloc[1].tolist()
values += values [:1]
values
Out[3]:
[33, 90, 90, 82, 80, 93, 33]

So these are the points that we will draw on our radar, but we will need to find the angles between each point for our line to follow. The formula below finds these angles and assigns them to ‘angles’. Then, just as above, we copy the first value to the end of our array to complete the line.

In [4]:
angles = [n / float(AttNo) * 2 * pi for n in range(AttNo)]
angles += angles [:1]
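If it helps to see both steps in isolation, here is a small sketch that wraps them in two helper functions. The names `radar_angles` and `close_loop` are hypothetical and are not part of Matplotlib; they just repeat the list arithmetic from the cells above.

```python
from math import pi


def radar_angles(n_axes):
    """Return evenly spaced angles (in radians), with the first repeated at the end."""
    angles = [n / float(n_axes) * 2 * pi for n in range(n_axes)]
    return angles + angles[:1]


def close_loop(values):
    """Return a copy of `values` with the first item repeated at the end."""
    values = list(values)
    return values + values[:1]


angles = radar_angles(6)
ronaldo = close_loop([33, 90, 90, 82, 80, 93])
```

With six attributes the angles are 60 degrees (pi/3 radians) apart, and both lists end up with seven entries so that the plotted line returns to where it started.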

Now that we have our values to plot, and the angles between them, drawing the radar is pretty simple.

Follow along with the comments below, but note the ‘polar=True’ in our subplot – this changes our chart from a more traditional chart with x and y axes to the circular radar chart that we are looking for.

In [5]:
ax = plt.subplot(111, polar=True)

#Add the attribute labels to our axes
plt.xticks(angles[:-1],Attributes)

#Plot the line around the outside of the filled area, using the angles and values calculated before
ax.plot(angles,values)

#Fill in the area plotted in the last line
ax.fill(angles, values, 'teal', alpha=0.1)

#Give the plot a title and show it
ax.set_title("Ronaldo")
plt.show()

Comparing two sets of data in a radar chart

One additional benefit of the radar chart is the ability to compare two observations (or players, in this case), quite easily.

The example below repeats the above process for finding angles for Messi’s data points, then plots them both together.

In [6]:
#Find the values and angles for Messi - from the table at the top of the page
values2 = data.iloc[0].tolist()
values2 += values2 [:1]

angles2 = [n / float(AttNo) * 2 * pi for n in range(AttNo)]
angles2 += angles2 [:1]


#Create the chart as before, but with both Ronaldo's and Messi's angles/values
ax = plt.subplot(111, polar=True)

plt.xticks(angles[:-1],Attributes)

ax.plot(angles,values)
ax.fill(angles, values, 'teal', alpha=0.1)

ax.plot(angles2,values2)
ax.fill(angles2, values2, 'red', alpha=0.1)

#Rather than use a title, individual text points are added
plt.figtext(0.2,0.9,"Messi",color="red")
plt.figtext(0.2,0.85,"v")
plt.figtext(0.2,0.8,"Ronaldo",color="teal")
plt.show()

Creating a function to plot individual players

This is a lot of code if we want to create multiple charts. We can easily turn these charts into a function, which will do all the heavy lifting for us – all we will have to do is provide it with a player name and data that we want to plot:

In [7]:
def createRadar(player, data):
    Attributes = ["Defending","Dribbling","Pace","Passing","Physical","Shooting"]
    
    #Copy the first value to the end, without changing the list that was passed in
    data = data + data[:1]
    
    #Dividing by 6.0 keeps this as float division in Python 2 as well
    angles = [n / 6.0 * 2 * pi for n in range(6)]
    angles += angles [:1]
    
    ax = plt.subplot(111, polar=True)

    plt.xticks(angles[:-1],Attributes)
    ax.plot(angles,data)
    ax.fill(angles, data, 'blue', alpha=0.1)

    ax.set_title(player)
    plt.show()

In [8]:
createRadar("Dybala",[24,91,86,81,67,85])

And how about we do the same thing to compare two players?

In [9]:
def createRadar2(player, data, player2, data2):
    Attributes = ["Defending","Dribbling","Pace","Passing","Physical","Shooting"]
    
    #Copy the first value to the end, without changing the lists that were passed in
    data = data + data[:1]
    data2 = data2 + data2[:1]
    
    #Both players share the same six attributes, so one set of angles is enough
    angles = [n / 6.0 * 2 * pi for n in range(6)]
    angles += angles [:1]
    
    #Create the chart as before, but with both players' values
    ax = plt.subplot(111, polar=True)

    plt.xticks(angles[:-1],Attributes)

    ax.plot(angles,data)
    ax.fill(angles, data, 'teal', alpha=0.1)

    ax.plot(angles,data2)
    ax.fill(angles, data2, 'red', alpha=0.1)

    #Rather than use a title, individual text points are added
    plt.figtext(0.2,0.9,player,color="teal")
    plt.figtext(0.2,0.85,"v")
    plt.figtext(0.2,0.8,player2,color="red")
    plt.show()

In [10]:
createRadar2("Henderson", [76,76,62,82,81,70],"Wilshere", [62,82,71,80,72,69])
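Going one step further, we could let a single function handle any number of players by looping over the rows of a dataframe. The sketch below is one possible way to do that, not the only one: `radar_from_frame` is a hypothetical name, it assumes one row per player and one column per attribute, and it swaps the figtext labels for a legend.

```python
from math import pi

import matplotlib.pyplot as plt
import pandas as pd


def radar_from_frame(frame):
    """Plot one closed radar line per row of `frame` (hypothetical helper)."""
    attributes = list(frame.columns)
    n = len(attributes)

    # Evenly spaced angles, with the first repeated to close the shape
    angles = [i / float(n) * 2 * pi for i in range(n)]
    angles += angles[:1]

    fig = plt.figure()
    ax = fig.add_subplot(111, polar=True)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(attributes)

    for player, row in frame.iterrows():
        values = row.tolist()
        values += values[:1]
        # Reuse the line's colour for the fill so each player matches
        line, = ax.plot(angles, values, label=player)
        ax.fill(angles, values, color=line.get_color(), alpha=0.1)

    ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.1))
    return fig, ax


# The same Messi and Ronaldo numbers used at the top of the article
players = pd.DataFrame(
    [[26, 95, 89, 86, 61, 90], [33, 90, 90, 82, 80, 93]],
    index=["Messi", "Ronaldo"],
    columns=["Defending", "Dribbling", "Pace", "Passing", "Physical", "Shooting"],
)
fig, ax = radar_from_frame(players)
```

Because the attribute names come from the dataframe's columns, nothing is hard-coded to six attributes, so the same function should cope with a longer or shorter list.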

Summary

Radar charts are an interesting way to display data and allow us to compare two observations quite nicely. In this article, we have used them to compare players’ ratings from the FIFA video game, but analysts have used this format very innovatively to display actual performance data in an engaging format.

Take a look at Statsbomb’s use of radar charts with real data, or learn more about visualisation in Python here.

Posted by FCPythonADMIN in Visualisation

Creating Pie Charts in Matplotlib

I tend to think that pie charts should be avoided in 99% of the cases that they are used in. Unless your goal is to mislead (which is sometimes the case!), or you have a strict use case for them, you can normally find a better way to communicate your point.

That being said, just because we won’t do something, doesn’t mean we don’t need to know how it is done. As such, this article is going to take us through a simple example of creating a pie chart in Matplotlib.

As ever, let’s get our modules and data ready to go.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

Our pie chart is going to display the share of Premier League wins, as shown in our data below:

In [2]:
leagueWins = {'Team':['Manchester United','Blackburn Rovers','Arsenal',
                     'Chelsea','Manchester City','Leicester City'],
             'Championships':[13,1,3,4,2,1]}

df = pd.DataFrame(leagueWins, columns=['Team','Championships'])
df
Out[2]:
Team Championships
0 Manchester United 13
1 Blackburn Rovers 1
2 Arsenal 3
3 Chelsea 4
4 Manchester City 2
5 Leicester City 1

So we want the pie chart to plot the numbers in our Championships column. ‘plt.pie()’ will do exactly that:

In [3]:
plt.pie(df['Championships'])

#This next line just makes the plot look a little cleaner in this notebook
plt.tight_layout()

So we have a pie chart! It doesn’t tell us a great deal without labels, except that there is a big blue lump that takes up over half of the pie.

As with all of its other plot types, Matplotlib gives good customisation options. Let’s use some of these to add a title, labels and colours in our arguments:

In [4]:
#Create a list of the colours used for the teams, in order.
teamColours=['#f40206','#0560b5','#ce0000','#1125ff','#28cdff','#091ebc']

plt.pie(df['Championships'],
        #Data labels are the team names in the dataFrame
        labels = df['Team'],
        #Assign our colours list
        colors = teamColours,
        #Start the first slice at 90 degrees, which gives a tidier layout
        startangle = 90
       )

#Add a title
plt.title("Premier League Titles")
plt.tight_layout()
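One more argument worth knowing about is ‘autopct’, which prints each slice’s share of the total onto the chart. The sketch below uses the same championship numbers typed in as plain lists, so that it runs on its own; the variable names are ours.

```python
import matplotlib.pyplot as plt

teams = ['Manchester United', 'Blackburn Rovers', 'Arsenal',
         'Chelsea', 'Manchester City', 'Leicester City']
championships = [13, 1, 3, 4, 2, 1]

fig, ax = plt.subplots()

# With autopct set, pie() also returns the percentage text objects
wedges, labels, percentages = ax.pie(
    championships,
    labels=teams,
    startangle=90,
    # Format each share to one decimal place, followed by a % sign
    autopct='%1.1f%%',
)

ax.set_title("Premier League Titles")
```

Manchester United’s 13 titles out of 24 should be labelled as 54.2%, which backs up the earlier observation that one slice takes up over half of the pie.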

Summary

I strongly recommend not using pie charts. We just struggle to process circular space in comparison to bar charts or even a table – especially when the numbers are relatively simple.

However, just in case it is ever needed, we have seen in this article how easy it is to create a pie chart in Matplotlib with the ‘.pie()’ command. It is also clear that we need to make use of Matplotlib’s customisation features to tidy things up, add a bit of relevant colour and titles. Passing these as arguments into the earlier command makes this easy.

Next up, read up on some different (better!) visualisation types!

Posted by FCPythonADMIN in Visualisation

Lollipop Charts in Matplotlib

Matplotlib’s chart functions are quite simple and allow us to create graphics to our exact specification.

The example below will plot the Premier League table from the 16/17 season, taking you through the basics of creating a lollipop chart and customising some of its features. First of all, let’s get our modules loaded and data in place.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#This next line makes our charts show up in the notebook
%matplotlib inline

table = pd.read_csv("../../data/1617table.csv")
table.head()
Out[1]:
Pos Team Pld W D L GF GA GD Pts
0 1 Chelsea 38 30 3 5 85 33 52 93
1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86
2 3 Manchester City 38 23 9 6 80 39 41 78
3 4 Liverpool 38 22 10 6 78 42 36 76
4 5 Arsenal 38 23 6 9 77 44 33 75

The ‘.hlines()’ function plots our data as horizontal lines. At its simplest, it needs three arguments: y, xmin and xmax.

  • y – The y coordinate for each line. We will most often want evenly spaced lines, so we provide a sequence from 1-20 for a 20-line chart. ‘np.arange’ provides this sequence easily.
  • xmin – Where does each line start? In this example, every line starts at 0.
  • xmax – Where does each line end? Or, what is the value of each line? In this example, we will provide the points column of the table.

In [2]:
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'])
Out[2]:
<matplotlib.collections.LineCollection at 0x19771e0b048>

Top work, you’ve created the lines of a lollipop chart! It shows team points evenly spaced and looks great.

To turn these lines into lollipops that a reader can follow, however, we need to add a few more things. Notably, a circle on the end of each line and a team label for each line. You can see which commands do this for us in the code below:

In [3]:
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color="skyblue")
plt.plot(table['Pts'], np.arange(1,21), "o")
plt.yticks(np.arange(1,21), table['Team'])
plt.show()

That is so much better! Now we can pass this chart to anyone and they can understand it.

It is still a bit… blue, though. Let’s give the lines their team’s colour. First of all, we will need to create an array of team colours using hex codes. We will then pass this array to the ‘color’ argument, and add axis labels and a title while we are at it. Take a look how below:

In [4]:
#Create an array of equal length to our lines
#Each value is the hex code for the team's colours, in order of our chart
teamColours = ['#034694','#001C58','#5CBFEB','#D00027',
               '#EF0107','#DA020E','#274488','#ED1A3B',
               '#000000','#091453','#60223B','#0053A0',
               '#E03A3E','#1B458F','#000000','#53162f',
               '#FBEE23','#EF6610','#C92520','#BA1F1A']


plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours)
plt.plot(table['Pts'], np.arange(1,21), "o")
plt.yticks(np.arange(1,21), table['Team'])

plt.ylabel("Team")
plt.xlabel("Points")

plt.title("Premier League 16/17")

plt.show()

And now we have a beautiful, in colour, chart. Exceptional work!
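If you don’t have the csv to hand, the sketch below rebuilds a smaller version of the chart from the five rows shown in the table preview. It also flips the y axis so that first place sits at the top, which is our own tweak rather than part of the walkthrough above.

```python
import numpy as np
import matplotlib.pyplot as plt

# The top five rows from the table preview, typed in by hand
teams = ["Chelsea", "Tottenham Hotspur", "Manchester City", "Liverpool", "Arsenal"]
points = [93, 86, 78, 76, 75]
positions = np.arange(1, len(teams) + 1)

fig, ax = plt.subplots()

# The stick of each lollipop: a horizontal line from 0 to the points total
stems = ax.hlines(y=positions, xmin=0, xmax=points, colors="skyblue")

# The head of each lollipop: a circle marker at the end of the line
ax.plot(points, positions, "o")

# Label each line with its team, then flip the axis so position 1 is at the top
ax.set_yticks(positions)
ax.set_yticklabels(teams)
ax.invert_yaxis()

ax.set_xlabel("Points")
ax.set_title("Premier League 16/17 (top five)")
```

Without ‘invert_yaxis()’, position 1 is drawn at the bottom of the chart, which reads upside down for a league table.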

Summary

Lollipop charts are essentially horizontal bar charts, with a circle dotted on the end. As we can see above, fully labelled and properly drawn up charts can make for a nice-looking change from a typical bar chart.

Next up, take a look through some other visualisation types – like radar charts!

Posted by FCPythonADMIN in Visualisation

Simple Bar Charts in Matplotlib

Matplotlib’s chart functions are quite simple and allow us to create graphics to our exact specification.

The example below will plot the Premier League table from the 16/17 season, taking you through the basics of creating a bar chart and customising some of its features. First of all, let’s get our modules loaded and data in place.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#This next line makes our charts show up in the notebook
%matplotlib inline

table = pd.read_csv("../../data/1617table.csv")
table.head()
Out[1]:
Pos Team Pld W D L GF GA GD Pts
0 1 Chelsea 38 30 3 5 85 33 52 93
1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86
2 3 Manchester City 38 23 9 6 80 39 41 78
3 4 Liverpool 38 22 10 6 78 42 36 76
4 5 Arsenal 38 23 6 9 77 44 33 75