NumPy is a fundamental package for data analysis in Python, as the majority of other packages in the Python data ecosystem build on it. Consequently, it makes sense for us to understand what NumPy can help us with and its general principles.

In the following article, we’ll take a look at arrays in Python, which essentially take the ‘list’ data type to a new level. We’ll get powerful new methods, random number generation and a way of storing data in grid-like structures, not just the flat lists we have seen so far.

Let’s get things started and import the numpy library. Take a read here if you need to install it!

In [1]:
import numpy as np

Creating a NumPy array

Firstly, we need to create our array. We have a number of different ways to do this.

One way is to convert a pre-existing list into an array. Below, we do this to create a 1d array (one line) and a 2d array (a grid, or matrix).

In [2]:
#Three lists, one for GK heights, one for GK weights, one for names

GKNames = ["Kaller","Fradeel","Hayward","Honeyman"]
GKHeights = [184,188,191,193]
GKWeights = [81,85,103,99]

#Create an array of names

print(np.array(GKNames))

#Create a matrix of all three lists, start with a list of lists

GKMatrix = [GKNames,GKHeights,GKWeights]
print(np.array(GKMatrix))
['Kaller' 'Fradeel' 'Hayward' 'Honeyman']
[['Kaller' 'Fradeel' 'Hayward' 'Honeyman']
 ['184' '188' '191' '193']
 ['81' '85' '103' '99']]

There we have two examples of creating arrays from a list. Our second one is particularly cool: it is just like a spreadsheet and will make our data much easier to deal with.
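
One thing worth noticing in the printed grid: the heights and weights appear in quotes. A NumPy array stores a single data type, so mixing names with numbers turns every value into text. Below is a minimal sketch of one way around this, assuming we keep the names in their own array and put only the numbers in the grid. The variable names 'mixed', 'names' and 'stats' are just for illustration.

```python
import numpy as np

GKNames = ["Kaller", "Fradeel", "Hayward", "Honeyman"]
GKHeights = [184, 188, 191, 193]
GKWeights = [81, 85, 103, 99]

# Mixing text and numbers: every value is stored as a string
mixed = np.array([GKNames, GKHeights, GKWeights])
print(mixed.shape)       # (3, 4): three rows, four columns
print(mixed.dtype.kind)  # 'U' means unicode text

# Numbers only: the grid stays numeric, so we can do maths with it
names = np.array(GKNames)
stats = np.array([GKHeights, GKWeights])
print(stats.shape)       # (2, 4)
print(stats[0].mean())   # average height: 189.0
```

Keeping the numbers in their own array means methods such as mean and max work directly on them.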

Aside from creating our own arrays from lists we already have, numpy can create them with its own methods:

In [3]:
#With 'arange', we can create arrays just like we created lists with 'range'
#This gives us an array running from the first argument up to, but not including, the second

np.arange(0,12)
Out[3]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
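
Like 'range', 'arange' also accepts an optional step as a third argument, and counts from 0 if you give it only one number. A short sketch, with variable names chosen only for illustration:

```python
import numpy as np

# Start at 0, stop before 12, count in steps of 3
every_third = np.arange(0, 12, 3)
print(every_third)  # [0 3 6 9]

# With a single argument, counting starts from 0
first_five = np.arange(5)
print(first_five)   # [0 1 2 3 4]
```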
In [4]:
#Want a blank array? Create it full of zeros with 'zeros'
#The argument sets the shape of the array: here, 3 rows and 11 columns

np.zeros((3,11))
Out[4]:
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])
In [5]:
#Hate zeros? Why not use 'ones'?!

np.ones((3,11))
Out[5]:
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.]])
In [6]:
#Creating dummy data or need a random number?
#randint and randn are useful here

#Draws random numbers from the standard normal distribution (centred on 0)
#The arguments give us the array's shape
print(np.random.randn(3,3))

#Creates random whole numbers from the first number up to, but not including, the second
#The third argument gives us the shape of the array
print(np.random.randint(1,100,(3,3)))
[[ 1.1403024  -1.76082025 -0.71738168]
 [-0.44740344 -0.16392845  1.04022957]
 [ 1.97068835  0.50075891 -0.33750378]]
[[70 28 67]
 [19 54 11]
 [ 9 34 67]]
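
Because these numbers are random, your output will differ from the one shown here each time you run the cell. If you need the same dummy data every time, one option is to set a seed first. A minimal sketch, where the seed value 42 is an arbitrary choice:

```python
import numpy as np

# Setting the same seed before each draw gives the same numbers
np.random.seed(42)
first = np.random.randint(1, 100, (3, 3))

np.random.seed(42)
second = np.random.randint(1, 100, (3, 3))

print((first == second).all())  # True

# randint includes the lower bound but stops before the upper bound
print(first.min() >= 1, first.max() < 100)  # True True
```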

Looking for more ways to create arrays? Take a look in the documentation for ‘rand’, ‘linspace’, ‘eye’ and others!
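
As a quick taster of those three, here is a short sketch. The shapes and ranges are arbitrary choices for illustration.

```python
import numpy as np

# linspace: evenly spaced numbers, including both end points
spaced = np.linspace(0, 1, 5)
print(spaced)  # [0.   0.25 0.5  0.75 1.  ]

# eye: a square grid with ones on the diagonal and zeros elsewhere
identity = np.eye(3)
print(identity)

# rand: random numbers spread evenly between 0 and 1 (1 excluded)
sample = np.random.rand(2, 3)
print(sample.shape)  # (2, 3)
```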

Array Methods

Not only does NumPy give us a good way to store our data, it also gives us some great tools to simplify working with it.

Let’s find the tallest goalkeeper from our earlier examples with array methods.

In [7]:
#Three lists, one for GK heights, one for GK weights, one for names
#Convert the number lists to arrays so that we can use array methods on them

GKNames = ["Kaller","Fradeel","Hayward","Honeyman"]
GKHeights = [184,188,191,193]
GKWeights = [81,85,103,99]

GKHeights = np.array(GKHeights)
GKWeights = np.array(GKWeights)

#What is the largest height, .max()?

GKHeights.max()
Out[7]:
193
In [8]:
#What location is the max, .argmax()?

GKHeights.argmax()
Out[8]:
3
In [9]:
#Can I use this method to locate the player's name?
#Instead of typing a number in the square brackets, I can put the method call there

GKNames[GKHeights.argmax()]
Out[9]:
'Honeyman'

With only four players this is a bit long-winded, but I’m sure that you can see the benefit if we have a whole academy of players and need to find the tallest from hundreds. Swap max for min (and argmax for argmin) to find the smallest value in an array.
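
As a minimal sketch of swapping max for min, here is the same lookup for the shortest goalkeeper, plus the heaviest for good measure. It reuses the lists from earlier; the names 'shortest' and 'heaviest' are just for illustration.

```python
import numpy as np

GKNames = ["Kaller", "Fradeel", "Hayward", "Honeyman"]
GKHeights = np.array([184, 188, 191, 193])
GKWeights = np.array([81, 85, 103, 99])

# Smallest height and where it sits in the array
print(GKHeights.min())     # 184
print(GKHeights.argmin())  # 0

# Use the position to look up the matching name
shortest = GKNames[GKHeights.argmin()]
heaviest = GKNames[GKWeights.argmax()]
print(shortest)  # Kaller
print(heaviest)  # Hayward
```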

Summary

You are likely to use NumPy with all sorts of packages as you develop your Python skills. Having a healthy appreciation of how it works, especially with arrays, will save you lots of headaches down the line.

On this page, we saw how we can create arrays from scratch, or convert lists into them. We created flat, 1-d arrays and 2-d grids. We then applied methods to find the highest datapoint and even used its position to look up the matching player’s name. Great work! Take a look at our extension on NumPy arrays here to learn more.