
Making Better Python Visualisations

FC Python recently received a tweet from @fitbawnumbers applying our lollipop chart code to Pep’s win percentage. It was great to see this application of the chart, and especially interesting because Philip then followed up with another chart showing the same data from Excel. To be blunt, the Excel chart was much cleaner/better than our lollipop charts – Philip had done a great job with it.

This has inspired us to put together a post exploring some of matplotlib’s customisation options and principles that underpin them. Hopefully this will give us a better looking and more engaging chart!

As a reminder, this is the chart that we are looking to improve, and you can find the tutorial for lollipop charts here.

Step One – Remove everything that adds nothing

There is clearly lots that we can improve on here. Let’s start with the basics – if you can remove something without damaging your message, remove it. We have lots of ugly lines here, so let’s remove the box drawn needlessly around our data, along with those ticks. The same goes for the axis labels: we know that the y axis shows teams, so let’s bin that too. We’ll do this with the following code:

In [ ]:
#For every side of the box, set to invisible

for side in ['right','left','top','bottom']:
    ax.spines[side].set_visible(False)
    
#Remove the tick marks on the x and y axes, keeping the tick labels

ax.tick_params(axis='both', which='both', length=0)
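
As a self-contained sketch of this step, the snippet below uses a handful of points values instead of the full league table. It hides the box one spine at a time and hides the tick marks with `ax.tick_params(length=0)`, which keeps the tick labels in place. The short `points` list and the `spines_visible` check are assumptions added for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# A few points totals for illustration only, not the full league table
points = [93, 86, 78, 76, 75]

fig, ax = plt.subplots()
ax.hlines(y=np.arange(1, len(points) + 1), xmin=0, xmax=points)

# Hide the box around the data, one side at a time
for side in ['right', 'left', 'top', 'bottom']:
    ax.spines[side].set_visible(False)

# Hide the tick marks on both axes but keep the tick labels
ax.tick_params(axis='both', which='both', length=0)

# Collect the visibility of every spine so we can confirm none are left
spines_visible = [ax.spines[side].get_visible() for side in ax.spines]
```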

Step Two – Where appropriate, change the defaults

Philip’s Excel chart looked great because it didn’t look like an Excel chart. He had changed all of the defaults: the colours, the font, the label location. Consequently, it doesn’t look like the charts that have bored us to death in presentations for decades. So let’s change our title location and font to make it look like we’ve put some effort in beyond the defaults. Code below:

In [ ]:
#Change font
plt.rcParams["font.family"] = "DejaVu Sans"

#Instead of using plt.title, we'll use plt.text to fully customise it
#First two arguments are x/y location
plt.text(55, 19,"Premier League 16/17", size=18, fontweight="normal")
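
Note that the x/y location is in data coordinates by default, so 55 and 19 only make sense for this particular table. One possible variation, sketched below on an empty set of axes, is to pass `transform=ax.transAxes` so that the position is a fraction of the plot area instead. The 0.6 and 0.9 values are arbitrary choices for illustration, and in a real chart you would keep only one of the two titles.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 100)
ax.set_ylim(0, 21)

# Data coordinates (the default): x is 55 points, y is row 19
data_title = ax.text(55, 19, "Premier League 16/17", size=18, fontweight="normal")

# Axes coordinates: 60% across and 90% up the plot area, whatever the data range
axes_title = ax.text(0.6, 0.9, "Premier League 16/17", size=18,
                     fontweight="normal", transform=ax.transAxes)
```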

Step Three – Add labels if they are clean and give detail

While the lollipop chart makes it easy to understand the differences between teams, our original chart requires users to look all the way down to the x axis if they want the value. Even then, the audience has to make a rough estimation. Why not add values to make everything a bit cleaner?

We can easily iterate through our values in the dataframe and plot them alongside the charts. The code below uses ‘enumerate()’ to count through each of the values in the points column of our table. For each value, it writes text at location v,i (nudged a bit with the sums below). Take a look at the for loop:

In [ ]:
for i, v in enumerate(table['Pts']):
    ax.text(v+2, i+0.8, str(v), color=teamColours[i], size = 13)
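
To see what the loop is doing, here is a sketch on three values. `enumerate()` counts from 0 while the lines sit at y = 1, 2, 3, and matplotlib anchors text by its baseline by default, which is why a hand-tuned `i+0.8` looks roughly centred. As an alternative (an assumption, not the approach used above), the sketch places each label at the line's own y value and lets `va='center'` do the centring.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First three points totals and colours, for illustration only
points = [93, 86, 78]
colours = ['#034694', '#001C58', '#5CBFEB']

fig, ax = plt.subplots()
ax.hlines(y=np.arange(1, 4), xmin=0, xmax=points, colors=colours, linewidths=10)

labels = []
for i, v in enumerate(points):
    # i counts from 0 and the lines sit at 1, 2, 3, so i + 1 is the line's y value;
    # va='center' centres the text vertically on that value
    labels.append(ax.text(v + 2, i + 1, str(v), color=colours[i], size=13, va='center'))

label_positions = [label.get_position() for label in labels]
```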

Step Four – Improve aesthetics with strong colour against off-white background

Our lollipop sticks are very, very thin. We can improve the look of these by giving them a decent thickness and a block of bold colour. Underneath this colour, we should add an off-white colour. This differentiates the plot from the rest of the page, and makes it look a lot more professional. Next time you see a great plot, take note of the base colour and try to understand the effect that this has on the plot and article as a whole!

Our code for doing these two things is below:

In [ ]:
#Set a linewidth in our hlines argument
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours,linewidths=10)

#Set a background colour to the data area background and the plot as a whole
ax.set_facecolor('#f7f4f4')
fig.patch.set_facecolor('#f7f4f4')
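
One thing to watch if you export the chart: depending on your matplotlib version and settings, `savefig` may not use the figure's face colour by default. A cautious sketch is to pass it explicitly. The in-memory `BytesIO` buffer here stands in for a file name and is an assumption for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

background = '#f7f4f4'

fig, ax = plt.subplots()
ax.set_facecolor(background)         # data area
fig.patch.set_facecolor(background)  # figure as a whole

# Pass the face colour explicitly so the saved image keeps the off-white background
buffer = io.BytesIO()
fig.savefig(buffer, format='png', facecolor=fig.get_facecolor())
png_bytes = buffer.getvalue()
```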

Fitting it all together

Putting all of these steps together, we get something like the following. Follow along with the comments and see what fits in where:

In [1]:
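#Load our modules and the league table, as in the lollipop tutorial
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

table = pd.read_csv("../../data/1617table.csv")

#Set our plot and desired size
fig = plt.figure(figsize=(10,7))
ax = plt.subplot()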

#Change our font
plt.rcParams["font.family"] = "DejaVu Sans"

#Each value is the hex code for the team's colours, in order of our chart
teamColours = ['#034694','#001C58','#5CBFEB','#D00027',
              '#EF0107','#DA020E','#274488','#ED1A3B',
               '#000000','#091453','#60223B','#0053A0',
               '#E03A3E','#1B458F','#000000','#53162f',
               '#FBEE23','#EF6610','#C92520','#BA1F1A']

#Plot our thicker lines and team names
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours,linewidths=10)
plt.yticks(np.arange(1,21), table['Team'])

#Label our axes as needed and title the plot
plt.xlabel("Points")
plt.text(55, 19,"Premier League 16/17", size=18, fontweight="normal")

#Add the background colour
ax.set_facecolor('#f7f4f4')
fig.patch.set_facecolor('#f7f4f4')

#Remove the box around the data
for side in ['right','left','top','bottom']:
    ax.spines[side].set_visible(False)

#Remove the tick marks on both axes, keeping the tick labels
ax.tick_params(axis='both', which='both', length=0)

#Write each team's points total at the end of its line
for i, v in enumerate(table['Pts']):
    ax.text(v+2, i+0.8, str(v), color=teamColours[i], size = 13)

plt.show()

Without doubt, this is a much better looking chart than the lollipop. Not only does it look better, but it gives us more information and communicates better than our former effort. Thank you Philip for the inspiration!

Summary

This article has looked at a few ways to tidy our charts. The rules that we introduced throughout should be applied to any visualisation that you’re looking to communicate with. Ensure that your charts are as clean as possible, are labelled, and steer away from the defaults. Follow these, and you’ll be well on your way to creating great plots!

Why not apply these rules to some of the other basic examples in our visualisation series and let us know how you improve on our articles!

Posted by FCPythonADMIN in Blog, Visualisation

Lollipop Charts in Matplotlib

Matplotlib’s chart functions are quite simple and allow us to create graphics to our exact specification.

The example below will plot the Premier League table from the 16/17 season, taking you through the basics of creating a lollipop chart and customising some of its features. First of all, let’s get our modules loaded and data in place.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#This next line makes our charts show up in the notebook
%matplotlib inline

table = pd.read_csv("../../data/1617table.csv")
table.head()
Out[1]:
Pos Team Pld W D L GF GA GD Pts
0 1 Chelsea 38 30 3 5 85 33 52 93
1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86
2 3 Manchester City 38 23 9 6 80 39 41 78
3 4 Liverpool 38 22 10 6 78 42 36 76
4 5 Arsenal 38 23 6 9 77 44 33 75

The .hlines() function plots our data as horizontal lines. At its simplest, it needs three arguments: y, xmin and xmax.

  • Y – The y coordinate for each line. We will most often want evenly spaced lines, so we provide a sequence from 1-20 for a 20 line chart. ‘np.arange’ provides this sequence easily.
  • Xmin – Where does each line start? We start every line at 0.
  • Xmax – Where does each line end? Or, what is the value of each line? In this example, we will provide the points column of the table.
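
As a small sketch on three values, `hlines` takes a `y` position for each line plus an `xmin` and `xmax` for where each line starts and ends. The short `points` list and the `segments` check are assumptions added for illustration; the check simply reads back the start and end point of every line that was drawn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First three points totals, for illustration only
points = [93, 86, 78]
rows = np.arange(1, len(points) + 1)  # y position of each line: 1, 2, 3

fig, ax = plt.subplots()
lines = ax.hlines(y=rows, xmin=0, xmax=points)

# Each segment runs from (xmin, y) to (xmax, y)
segments = [segment.tolist() for segment in lines.get_segments()]
```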
In [2]:
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'])
Out[2]:
<matplotlib.collections.LineCollection at 0x19771e0b048>

Top work, you’ve drawn the lines of a lollipop chart! It shows team points as evenly spaced lines.

To make it readable, however, we need to add a few more things. Notably, a dot on the end of each line and the team names as labels. You can see which commands do this for us in the code below:

In [3]:
#Draw the lines in a single colour
plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color="skyblue")
#Add a dot at the end of each line
plt.plot(table['Pts'], np.arange(1,21), "o")
#Label each line with its team name
plt.yticks(np.arange(1,21), table['Team'])
plt.show()

That is so much better! Now we can pass this chart to anyone and they can understand it.

It is still a bit… blue, though. Let’s give the lines their team’s colour. First of all, we will need to create an array of team colours using hex codes. We will then pass this array to the color argument, and add axis labels and a title while we are at it. Take a look how below:

In [4]:
#Create an array of equal length to our lines
#Each value is the hex code for the team's colours, in order of our chart
teamColours = ['#034694','#001C58','#5CBFEB','#D00027',
              '#EF0107','#DA020E','#274488','#ED1A3B',
               '#000000','#091453','#60223B','#0053A0',
               '#E03A3E','#1B458F','#000000','#53162f',
               '#FBEE23','#EF6610','#C92520','#BA1F1A']

plt.hlines(y=np.arange(1,21),xmin=0,xmax=table['Pts'],color=teamColours)
plt.plot(table['Pts'], np.arange(1,21), "o")
plt.yticks(np.arange(1,21), table['Team'])

plt.ylabel("Team")
plt.xlabel("Points")

plt.title("Premier League 16/17")

plt.show()

And now we have a beautiful, in colour, chart. Exceptional work!

Summary

Lollipop charts are essentially horizontal bar charts, with a circle dotted on the end. As we can see above, fully labelled and properly drawn up charts can make for a nice-looking change from a typical bar chart.

Next up, take a look through some other visualisation types – like radar charts!

Posted by FCPythonADMIN in Visualisation

Simple Bar Charts in Matplotlib

Matplotlib’s chart functions are quite simple and allow us to create graphics to our exact specification.

The example below will plot the Premier League table from the 16/17 season, taking you through the basics of creating a bar chart and customising some of its features. First of all, let’s get our modules loaded and data in place.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#This next line makes our charts show up in the notebook
%matplotlib inline

table = pd.read_csv("../../data/1617table.csv")
table.head()
Out[1]:
Pos Team Pld W D L GF GA GD Pts
0 1 Chelsea 38 30 3 5 85 33 52 93
1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86
2 3 Manchester City 38 23 9 6 80 39 41 78
3 4 Liverpool 38 22 10 6 78 42 36 76
4 5 Arsenal 38 23 6 9 77 44 33 75

The .bar() function plots our data. At its simplest, it needs two arguments, x and height.

  • X – The x coordinate for each bar. For a bar chart, we will most often want evenly spaced bars, so we provide a sequence from 1-20 for a 20 bar chart. ‘np.arange’ provides this sequence easily.
  • Height – How high will each bar go? Or, what is the value of each bar? In this example, we will provide the points column of the table.
In [2]:
plt.bar(x=np.arange(1,21),height=table['Pts'])
Out[2]:
<Container object of 20 artists>

Top work, you’ve created a bar chart! It shows team points evenly spaced and looks great.
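
The container that `plt.bar` returns holds one rectangle per bar, which is handy if you want to check what was drawn. A minimal sketch on three values follows; the short `points` list and the `bar_heights` and `bar_centres` names are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First three points totals, for illustration only
points = [93, 86, 78]
positions = np.arange(1, len(points) + 1)  # x position of each bar: 1, 2, 3

fig, ax = plt.subplots()
bars = ax.bar(x=positions, height=points)

# The container holds one rectangle per bar; by default each is centred on its x value
bar_heights = [bar.get_height() for bar in bars]
bar_centres = [bar.get_x() + bar.get_width() / 2 for bar in bars]
```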

To make it readable, however, we need to add a few more things. Notably, axis labels, a title and bar labels. You can see which commands do this for us in the code below:

In [3]:
#Create our bar chart as before
plt.bar(x=np.arange(1,21),height=table['Pts'])

#Give it a title
plt.title("Premier League 16/17")

#Give the x axis some labels across the tick marks.
#Argument one is the position for each label
#Argument two is the label values and the final one is to rotate our labels
plt.xticks(np.arange(1,21), table['Team'], rotation=90)

#Give the x and y axes a title
plt.xlabel("Team")
plt.ylabel("Points")

#Finally, show me our new plot
plt.show()

That is so much better! Now we can pass this chart to anyone and they can understand it.

It is still a bit… blue, though. Let’s give the bars their team’s colour. First of all, we will need to create an array of team colours using hex codes. We will then map this array to each team. Take a look how below:

In [4]:
#Create an array of equal length to our bars
#Each value is the hex code for the team's colours, in order of our chart
teamColours = ['#034694','#001C58','#5CBFEB','#D00027',
              '#EF0107','#DA020E','#274488','#ED1A3B',
               '#000000','#091453','#60223B','#0053A0',
               '#E03A3E','#1B458F','#000000','#53162f',
               '#FBEE23','#EF6610','#C92520','#BA1F1A']

#Add a new argument, color, to our 'plt.bar()' method
#This argument passes our teamColours array
plt.bar(x=np.arange(1,21),height=table['Pts'],color = teamColours)

#Label bars, axes and the chart as before
plt.title("Premier League 16/17")
plt.xticks(np.arange(1,21), table['Team'], rotation=90)
plt.xlabel("Team")
plt.ylabel("Points")
plt.show()

And now we have a beautiful, in colour, chart. Exceptional work!
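
The colour list relies on being in exactly the same order as the table. If the table is ever re-sorted, one option is to key the colours by team name and build the list from the team column. This is sketched here as an assumption rather than the approach used in this post. The three-team dictionary and the `colour_by_team` name are illustrative only; the hex codes and points are the first three entries from above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

# Illustrative subset: hex codes for the first three teams in the table
colour_by_team = {
    'Chelsea': '#034694',
    'Tottenham Hotspur': '#001C58',
    'Manchester City': '#5CBFEB',
}

# Deliberately shuffled order to show the lookup does not depend on table order
teams = ['Manchester City', 'Chelsea', 'Tottenham Hotspur']
points = [78, 93, 86]

# Look up each team's colour by name
bar_colours = [colour_by_team[team] for team in teams]

fig, ax = plt.subplots()
positions = np.arange(1, len(teams) + 1)
bars = ax.bar(x=positions, height=points, color=bar_colours)
ax.set_xticks(positions)
ax.set_xticklabels(teams, rotation=90)
```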

Summary

While the table is the customary way to display team performance over a season, it hides a lot of information that we struggle to visualise as numbers. When we plot points onto a chart we can see differences between teams much more easily.

We used matplotlib’s ‘.bar()’ tool to create a simple bar chart, then added a title, axis labels and even colour to make something that we can present easily.

Next up, take a look at another way to present this data with a lollipop chart.

Posted by FCPythonADMIN in Visualisation