Scatter plots are a great way to show the relationship between two variables. They plot two series of data, one along each axis, which makes it quick to check for any relationship.

Seaborn allows us to make really nice-looking visuals with little effort once our data is ready. Let’s get our modules and data fired up and kick off.

In [1]:

import seaborn as sns
import pandas as pd
%matplotlib inline

df = pd.read_csv("../../Data/FIFAPlayers.csv")

df.head(2)

Out[1]:

	player_api_id	overall_rating	potential	preferred_foot	attacking_work_rate	defensive_work_rate	crossing	finishing	heading_accuracy	short_passing	…	gk_diving	gk_handling	gk_kicking	gk_positioning	gk_reflexes	player_name	birthday	p_id	height	weight
0	307224	64	68	right	medium	low	44	63	73	49	…	12	12	7	11	12	Kevin Koubemba	23/03/1993 00:00	307224	193.04	198
1	512726	63	72	right	medium	medium	51	66	55	57	…	11	12	12	12	7	Yanis Mbombo Lokwa	08/04/1994 00:00	512726	177.80	172

2 rows × 44 columns

Our data shows skill ratings across a number of attributes for lots and lots of players. In this article, we want to try and ascertain some relationships between these attributes.

Seaborn has a few ways to show scatter plots, and we'll focus on 'regplot()'. Let's start with a plot that should show a strong positive correlation - height and weight.

In [2]:

sns.regplot(x="height",y="weight",data=df)

Out[2]:

<matplotlib.axes._subplots.AxesSubplot at 0x1c879492828>

So we can indeed see that there is a relationship between height and weight: as you’d expect, the taller a player is, the heavier we can expect them to be. The line is a linear regression fit, Seaborn’s best estimate of the weight you would expect for a given height, and the shaded band around it is a confidence interval for that estimate.

The huge outlier in the top right is identified at the end of the article!
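
If you want to see what sits behind the fitted line, here is a minimal sketch that fits a straight line by least squares and computes the correlation directly. The data is made up (the numbers and the 'players' dataframe are stand-ins, not the FIFA file), so treat it as an illustration of the idea rather than what Seaborn runs internally.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the FIFA data: 200 players whose weight rises with height.
rng = np.random.default_rng(0)
height = rng.normal(181, 7, size=200)                      # cm
weight = 2.0 * height - 195 + rng.normal(0, 8, size=200)   # invented relationship
players = pd.DataFrame({"height": height, "weight": weight})

# Degree-1 least-squares fit: weight is approximated by slope * height + intercept
slope, intercept = np.polyfit(players["height"], players["weight"], deg=1)

# Pearson correlation summarises how tightly the points follow a straight line
r = players["height"].corr(players["weight"])

print(f"slope={slope:.2f}, intercept={intercept:.1f}, r={r:.2f}")
```

A positive slope and a correlation close to 1 are the numeric version of what the plot shows at a glance.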

Our plot is created pretty easily. ‘.regplot()’ needed just 3 arguments here:

x – The column to plot along the x axis
y – The column to plot along the y axis
data – The dataframe we are reading from
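
As a small sketch of those three arguments in action: the column names are passed as strings, and the call hands back a Matplotlib axes object that you can keep working with. The five-row 'players' dataframe and the title are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import pandas as pd
import seaborn as sns

# Tiny invented dataframe with the same column names as the article's data
players = pd.DataFrame({
    "height": [170.2, 177.8, 182.9, 188.0, 193.0],
    "weight": [150.0, 163.0, 172.0, 181.0, 198.0],
})

# x and y name columns inside the dataframe given as data
ax = sns.regplot(x="height", y="weight", data=players)

# The returned axes takes the usual Matplotlib tweaks, such as a title
ax.set_title("Height vs weight")
```

Seaborn labels the axes from the column names, so there is no need to set them by hand.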

As with all Seaborn plots, there are some pretty cool customisation options. Let’s take a look at some examples:

In [3]:

sns.regplot(x="finishing",y="gk_handling",data=df,
           color="green")

Out[3]:

<matplotlib.axes._subplots.AxesSubplot at 0x1c879492748>

So this is a really odd one! But we can see that there is a big difference between two groups – it is probably fair to assume that the two groups are goalkeepers and outfield players.

Although we have a surprise elite finisher, with some goalkeeping ability…

SuarezvGhana.gif

Anyway, you can see that we can change the colours with the ‘color’ argument! Let’s change the ‘alpha’ next – this makes the dots see-through and shows how many values are on top of each other.

In [4]:

sns.regplot(x="long_passing",y="short_passing",data=df,
           scatter_kws={'alpha':0.07})

Out[4]:

<matplotlib.axes._subplots.AxesSubplot at 0x1c8793b4588>

The ‘alpha’ setting goes inside a dictionary passed as the ‘scatter_kws’ argument. This dictionary holds settings for the plot points specifically, rather than the chart as a whole.
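
The fitted line has a matching dictionary, ‘line_kws’. Here is a hedged sketch that styles the points and the line separately; the passing data is randomly generated for the example, and the marker size and line colour are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in for the two passing attributes
rng = np.random.default_rng(1)
long_passing = rng.uniform(20, 90, size=300)
short_passing = 0.8 * long_passing + 15 + rng.normal(0, 6, size=300)
players = pd.DataFrame({"long_passing": long_passing,
                        "short_passing": short_passing})

ax = sns.regplot(
    x="long_passing", y="short_passing", data=players,
    scatter_kws={"alpha": 0.07, "s": 15},  # settings for the dots only
    line_kws={"color": "red"},             # settings for the fitted line only
)
```

Keeping the two dictionaries apart means you can fade the dots right down without also fading the line.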

Multiple scatter plots & sizing

If you have a variable that you want to further split your data by, rather than create new visualisations entirely, you may want to create a grid of scatter plots.

Seaborn allows you to do this with ‘lmplot()’, which draws the same kind of plot as ‘regplot()’ but also takes ‘col’ and ‘row’ arguments for the splits you want to see.

In [5]:

sns.lmplot(x="crossing",y="finishing",data=df,
           scatter_kws={'alpha':0.1},
           col="preferred_foot")

Out[5]:

<seaborn.axisgrid.FacetGrid at 0x1c8799b8630>
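
The FacetGrid that comes back holds one set of axes per split, which you can check for yourself. A minimal sketch with randomly generated data (the 'players' dataframe is a stand-in, not the real file):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in data with a two-valued splitting column
rng = np.random.default_rng(2)
n = 120
players = pd.DataFrame({
    "crossing": rng.uniform(20, 90, size=n),
    "preferred_foot": rng.choice(["left", "right"], size=n),
})
players["finishing"] = 0.6 * players["crossing"] + rng.normal(0, 10, size=n)

grid = sns.lmplot(x="crossing", y="finishing", data=players,
                  scatter_kws={"alpha": 0.3},
                  col="preferred_foot")

# One row of plots, one column per value of preferred_foot
print(grid.axes.shape)
```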

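As you add more plots, the overall footprint of your chart is likely to get unmanageable. You can rein it in by setting the height of each plot and its ‘aspect’ (the ratio of width to height), as in the next example.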

In [6]:

sns.lmplot(x="crossing",y="finishing",data=df,
           scatter_kws={'alpha':0.1},
           col="preferred_foot",
           row="attacking_work_rate",
           aspect=2, height=2  # 'size' was renamed to 'height' in Seaborn 0.9
           )

Out[6]:

<seaborn.axisgrid.FacetGrid at 0x1c879e634e0>
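
To get a feel for how these settings add up, here is a rough sketch of the arithmetic. As I understand it, each small plot is as tall as the facet height in inches (the argument is named ‘height’ in Seaborn 0.9 and later, ‘size’ in older releases) and ‘aspect’ times that wide. The helper function is hypothetical, and the count of three work rates is an assumption for illustration.

```python
def grid_footprint(n_cols, n_rows, height, aspect):
    """Approximate overall figure size in inches for a grid of facets."""
    width = n_cols * height * aspect   # each facet is height * aspect wide
    total_height = n_rows * height     # each facet is height tall
    return width, total_height

# Two feet across, an assumed three work rates down
print(grid_footprint(2, 3, height=2, aspect=2))   # (8, 6)

# The same grid with larger, square facets (height=5, aspect=1)
print(grid_footprint(2, 3, height=5, aspect=1))   # (10, 15)
```

Short, wide plots keep a grid with several rows on one screen.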

Summary

We have seen how easily Seaborn makes good-looking plots with minimal effort. ‘.regplot()’ takes just a few arguments to plot data along the x and y axes, which we can then customise with further information.

Develop your abilities on scatter plots with a look at further customisation options & other plot types.

Lots of the plots in this piece are also created for the sake of creating them – make sure that your charts carry more insight than mine!

And our really, really tall player from early in the article is, of course, Kristof van Hout!
