Understanding NumPy Arrays in Python

Mariam Yusuff

Imagine you need to cook jollof rice for a family party. You could use a small pot and cook in batches, but that would take forever. Instead, you could use a large pot and enough utensils: tools that can handle the scale.

Python lists are like that small pot: useful for small tasks. But when you have large amounts of data or need to perform complex calculations, you need something bigger and stronger: NumPy arrays.

NumPy arrays can hold large collections of numbers and let us perform arithmetic on every element at once, which plain Python lists do not support directly.

With NumPy Arrays, you can seamlessly perform mathematical operations on entire arrays without needing to write complex loops.
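
To make the difference concrete, here is a small sketch comparing a list with an array. The names prices and prices_array are made up for this illustration.

```python
import numpy as np

prices = [100, 250, 400]          # a plain Python list
prices_array = np.array(prices)   # the same values as a NumPy array

# Multiplying a list repeats it; it does not multiply each item.
doubled_list = prices * 2

# Multiplying an array multiplies every element.
doubled_array = prices_array * 2

# With a list, element-wise math needs a loop or a comprehension.
doubled_with_loop = [p * 2 for p in prices]

print(doubled_list)       # [100, 250, 400, 100, 250, 400]
print(doubled_array)      # [200 500 800]
print(doubled_with_loop)  # [200, 500, 800]
```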

By the end of this article, you’ll understand:

  • What NumPy is

  • How NumPy arrays work

  • Some useful NumPy methods

💡
Before you dive in, it’s best to have a foundational understanding of Python.

If you’re ready, let’s explore why NumPy Arrays are the power tools that can take your Python skills to the next level.


What Is NumPy?

NumPy, short for Numerical Python, is an open-source Python library used for numerical computing. It is useful for analyzing datasets and comes in handy in several fields.

For example, it helps financial analysts identify market trends, engineers perform numerical simulations, and healthcare professionals analyze medical data. In short, wherever there’s data, NumPy is a game changer.

And analyzing data is important as it helps professionals make informed predictions based on previous events.

NumPy is especially important for professionals like machine learning engineers and data analysts, who often manipulate large datasets. NumPy is used to work on ndarrays (N-dimensional arrays) or simply arrays.

What Are NumPy Arrays?

Now let's talk about arrays. A NumPy array is a data structure that stores values of a single data type (it is homogeneous). Arrays can be 1-dimensional (vectors), 2-dimensional (matrices), or have 3 or more dimensions (multi-dimensional).

Think of arrays as the faster, smarter cousin of Python lists, built for speed and efficiency when working with numbers.

Dimensions of a NumPy Array

  1. 1-dimensional (1D) arrays: These are linear, on one line. They are vectors.

     [ 4 6 8 10 30 32 34]
    
  2. 2-dimensional (2D) arrays: These are usually in matrix format; they have rows and columns. Rows are horizontal sets of elements stacked on one another, while columns are vertical sets of elements placed side by side.

    For example, the array below has 3 rows and 5 columns.

     [[ 9 10 8 24 15] 
      [21 13 3 19 6] 
      [ 4 14 16 13 11]]
    
  3. 3-dimensional (3D) arrays: These are made of 2-dimensional arrays stacked in layers, so each value is located by its layer, row and column.
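
As a quick illustration of the three cases above, the sketch below builds one array of each kind and asks NumPy for its number of dimensions (ndim) and its shape. The variable names vector, matrix and cube are only for this example.

```python
import numpy as np

vector = np.array([4, 6, 8, 10])                # 1D: a single line of values
matrix = np.array([[9, 10, 8], [21, 13, 3]])    # 2D: 2 rows and 3 columns
cube = np.array([[[1, 2], [3, 4]],
                 [[5, 6], [7, 8]]])             # 3D: 2 layers of 2x2 matrices

for arr in (vector, matrix, cube):
    print(arr.ndim, arr.shape)
# 1 (4,)
# 2 (2, 3)
# 3 (2, 2, 2)
```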

Why are Arrays More Efficient?

Here are a few reasons arrays are more efficient than Python Lists:

  1. Arrays are homogeneous and help memory efficiency: Items in a Python list can have various data types, such as strings and numbers. Because each item can be a different type, Python has to keep extra information for every item, which decreases memory efficiency.

    On the other hand, an array helps memory efficiency as it can only store one type of data at a time, such as floats or integers.

  2. Arrays are good for numerical operations: Since arrays store one kind of data type, it becomes easy to perform numerical operations on arrays, such as broadcasting and universal array operations, which we'll discuss later in this article.

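  3. Arrays can have multiple dimensions: A Python list is a flat sequence by default, and nested lists have no built-in idea of rows, columns or shape. Arrays can have 1, 2, 3 or more dimensions (for example length, width and layer or height), and NumPy keeps track of the shape for you.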
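
Here is a small sketch of the "one data type" idea. When a list with mixed numbers is converted, NumPy picks a single type that can hold all of them. The names mixed_list and as_array are made up for this example.

```python
import numpy as np

mixed_list = [1, 2.5, 3]          # a list keeps each item's own type
as_array = np.array(mixed_list)   # NumPy converts everything to one type

print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'int']
print(as_array.dtype)                          # float64
print(as_array.itemsize, as_array.nbytes)      # bytes per item, total bytes
```

Because every item has the same size, NumPy can store the values side by side in memory, which is a big part of why arrays are compact and fast.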

[Figure: Array rows and columns in NumPy arrays]


How to Install NumPy

NumPy can be installed on any computer that has a Python distribution. The Anaconda Distribution is a good way to get started, as it includes both Python and NumPy.

Installing NumPy can be done in two ways. But first, open your command line.

  • For Windows, do this by going to “Start” and searching for “cmd” or “command line”.

  • For macOS, go to

Use either of these commands to install NumPy:

  1. The conda command (if you have the Anaconda Distribution):

  • In your command line, type conda install numpy

  • You're going to have a prompt like C:\Users\Your Name> already, so just place your cursor after it and type.

  • Once installed, start Python and type import numpy as np

  2. The pip command:

  • In your command line, type pip install numpy

  • Once installed, start Python and type import numpy as np
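
If you'd like to confirm that the installation worked, one simple check (run inside Python, not in the command line itself) is to import NumPy and print its version. The version you see will depend on what was installed on your machine.

```python
import numpy as np

print(np.__version__)        # the installed NumPy version
print(np.array([1, 2, 3]))   # [1 2 3]
```

If the import fails with a ModuleNotFoundError, NumPy is not installed in the Python environment you are running.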

How to Create Arrays From Scratch

The following examples can be run in Jupyter Notebook or in IDEs such as VS Code.

Jupyter Notebook was used for all code samples in this write-up, which is why the print function isn’t used: a notebook automatically displays the value of the last expression in a cell. In VS Code and other IDEs, you will likely need to wrap expressions in print() to see the output.

  1. Create arrays from lists or tuples

We can create arrays by converting lists or tuples to arrays.

# For lists
import numpy as np
my_list = [2, 4, 6, 8, 7]

np.array(my_list)

# For tuples
my_tuple = (2, 4, 6, 8, 7)

np.array(my_tuple)

Output:

array([2, 4, 6, 8, 7])
💡
Importing NumPy as np means importing numpy under the np alias. It’s a convention that is widely adopted in the Python programming community.

  2. Create arrays using the arange function:

    The arange function works like Python's built-in range: it starts at the first value and stops just before the second.

import numpy as np
np.arange(2,16)

Output:

array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])
💡
You do not have to “import numpy” before every code sample. Importing it once at the top of your script or notebook session is enough; after that, you can use np directly.
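
Like range, arange also accepts an optional third argument for the step size, and a single argument means "start from zero". A short sketch (the names every_third and from_zero are just for illustration):

```python
import numpy as np

every_third = np.arange(2, 16, 3)   # start at 2, stop before 16, step of 3
from_zero = np.arange(5)            # one argument: count from 0 up to 4

print(every_third.tolist())   # [2, 5, 8, 11, 14]
print(from_zero.tolist())     # [0, 1, 2, 3, 4]
```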

How to Create Arrays Using The Random Module

Since the random module produces random numbers, your output when trying out these code samples will differ from the ones displayed here. The seed function can be used to make the output reproducible, but we will not discuss that in this article.

Here are some common methods under the random module:

  1. rand: The rand method generates random floats that are uniformly distributed between 0 (inclusive) and 1 (exclusive).
np.random.rand(5)
# This will generate 5 random floats

Output:

array([0.03538478, 0.01721163, 0.06060251, 0.44540662, 0.59947798])
  2. randn: This generates random floats from a standard normal distribution (mean 0, standard deviation 1), so values can be negative or greater than 1.
np.random.randn(3, 5)

# Here's an alternative method:
from numpy.random import randn
randn(3, 5)

Output:

array([[-0.68817214, -0.75778515, -0.1957129 , -0.78179314, -1.0388488 ],
       [ 2.3902402 , -1.06550236, -0.87783586, -1.29944137, -0.83802239],
       [-0.54092031,  0.8832236 , -0.18598915, -0.66646982,  0.02545654]])

For rand and randn, here's the trick:

  • One number in the parentheses represents the number of values you want.

  • Two numbers in the parentheses represent the matrix properties (the number of rows and columns, respectively).
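
Since the values themselves are random, the easiest thing to check is the shape of the result. A minimal sketch (one_dim and two_dim are made-up names):

```python
import numpy as np

one_dim = np.random.rand(5)       # one number: 5 values in a 1D array
two_dim = np.random.randn(3, 5)   # two numbers: 3 rows and 5 columns

print(one_dim.shape)   # (5,)
print(two_dim.shape)   # (3, 5)
```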

  3. randint: This method returns random integers from a low value (inclusive) up to a high value (exclusive). It can take a third argument for the number of values that you want.
np.random.randint(20, 40, 5)

# Here's an alternative method:
from numpy.random import randint
randint(20, 40, 5)

Output:

array([31, 25, 36, 38, 27])
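
One detail worth checking for yourself is where the range starts and ends: the low value can be drawn, but the high value is never included. The sketch below draws many values and checks the limits; the third argument can also be a tuple if you want a 2D result. The names samples and grid are only for this example.

```python
import numpy as np

samples = np.random.randint(20, 40, 1000)   # 1000 integers from 20 up to 39

print(samples.min() >= 20)   # True: nothing falls below the low value
print(samples.max() < 40)    # True: the high value itself never appears

grid = np.random.randint(0, 10, (2, 3))     # a tuple asks for rows and columns
print(grid.shape)            # (2, 3)
```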

Other Useful Array Methods

  1. Ones: This generates an array that contains ones.

     np.ones((3, 2))
    

Output:

 array([[1., 1.],
       [1., 1.],
       [1., 1.]])
  2. Zeros: This generates an array that contains zeros.
np.zeros((3, 5))

Output:

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
  3. Linspace: This allows you to get evenly spaced values over a specified interval.

    Unlike arange, its third argument is the number of values that you want (not the step size), and it includes the end value by default.

np.linspace(0, 30, 5)

Output:

array([ 0. , 7.5, 15. , 22.5, 30. ])
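
To see how linspace and arange differ, here is a side-by-side sketch over the same interval. The names by_count and by_step are made up for this comparison.

```python
import numpy as np

by_count = np.linspace(0, 30, 5)   # 5 values, and 30 is included
by_step = np.arange(0, 30, 7.5)    # steps of 7.5, and 30 is excluded

print(by_count.tolist())   # [0.0, 7.5, 15.0, 22.5, 30.0]
print(by_step.tolist())    # [0.0, 7.5, 15.0, 22.5]
```

A rough rule of thumb: reach for linspace when you know how many values you want, and arange when you know the gap between them.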
  4. Identity Matrix: This matrix has ones on its main diagonal and zeros everywhere else. An identity matrix is square (equal number of rows and columns), so np.eye needs only one argument.

     np.eye(7)
    

    Output:

array([[1., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 1.]])
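
If you're wondering why the identity matrix matters, it behaves like the number 1 in matrix multiplication: multiplying a matrix by it gives the same matrix back. A small sketch using the @ operator for matrix multiplication (the 2x2 values here are arbitrary):

```python
import numpy as np

matrix = np.array([[2, 4], [6, 8]])
identity = np.eye(2)

product = matrix @ identity   # matrix multiplication with the identity

print(product.tolist())                  # [[2.0, 4.0], [6.0, 8.0]]
print(np.array_equal(product, matrix))   # True
```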
  5. Max and Min: These are used to get the maximum/minimum value in your array.
my_array = np.random.randint(0, 30, 10)
my_array

#To get the minimum value:
my_array.min()

#To get the maximum value:
my_array.max()

Output:

#The created array (my_array)
array([ 7,  8,  3, 17,  1,  5, 29, 14, 14, 22])

#The minimum value
1

#The Maximum value
29
  6. argmax/argmin: These are used to get the index of the max/min value in your array.
my_array = np.random.randint(0, 30, 10)
my_array

#To get the index of the minimum value:
my_array.argmin()

#To get the index of the maximum value:
my_array.argmax()

Output:

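array([23,  0,  2, 18, 28, 29, 17,  0, 21, 20])

1 #The index of the minimum value (0 first appears at index 1)

5 #The index of the maximum value (29 is at index 5)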

💡 Remember that when counting indexes, we start with zero not one.
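
Because random arrays change on every run, it can help to try argmin and argmax on fixed values. This sketch reuses the ten numbers from the output above; the name scores is made up.

```python
import numpy as np

scores = np.array([23, 0, 2, 18, 28, 29, 17, 0, 21, 20])

lowest_at = scores.argmin()    # 0 appears twice; the first position is returned
highest_at = scores.argmax()

print(lowest_at, highest_at)                    # 1 5
print(scores[lowest_at], scores[highest_at])    # 0 29
```

Using the index to look the value up again, as in the last line, is a handy way to confirm that argmin and argmax agree with min and max.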

  7. Reshape: This method is used to change the shape of the array.

💡Note: When reshaping an array, the number of items in the array must be equal to the product of the dimensions of the matrix you’re reshaping it into.

In the code block below, note that the number of values is 15. The dimensions of the reshaped matrix are 3 and 5.

3 * 5 = 15.

my_array = np.random.randint(0, 25, 15)
my_array

my_array = my_array.reshape(3, 5)
my_array

Output:

array([ 9, 10,  8, 24, 15, 21, 13,  3, 19,  6,  4, 14, 16, 13, 11]) #The array

array([[ 9, 10,  8, 24, 15],
       [21, 13,  3, 19,  6],
       [ 4, 14, 16, 13, 11]]) #The reshaped array
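
Here is a short sketch of what happens when the numbers do and don't add up. It uses np.arange(15) instead of random values so the result is predictable. Passing -1 for one dimension asks NumPy to work that dimension out for you.

```python
import numpy as np

values = np.arange(15)          # 15 values: 0 to 14

auto = values.reshape(3, -1)    # -1 lets NumPy work out the 5 columns
print(auto.shape)               # (3, 5)

try:
    values.reshape(4, 4)        # 4 * 4 = 16, but there are only 15 values
except ValueError as error:
    print("Cannot reshape:", error)
```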
  8. Shape: This is used to get the shape of the array.

     my_array.shape

    Output:

(3, 5) #3 rows and 5 columns
  9. Dtype: This is used to check the data type of the array.

     my_array.dtype

    Output:

     dtype('int32') #You may see dtype('int64') instead, depending on your system and NumPy version
    
💡
dtype and shape do not take parentheses because they are attributes. Attributes are accessed using “.attribute_name”.
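
If you want a particular data type rather than whatever your system picks by default, you can ask for it when creating the array, or convert later with astype. A minimal sketch (whole and as_float are made-up names):

```python
import numpy as np

whole = np.array([1, 2, 3], dtype=np.int32)   # ask for 32-bit integers
as_float = whole.astype(np.float64)           # returns a converted copy

print(whole.dtype)      # int32
print(as_float.dtype)   # float64
print(whole.shape)      # (3,)
```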

Array Arithmetic Operations

Array Arithmetic operations are mathematical operations that can be carried out on arrays, from basic ones like addition to complex ones such as matrix manipulations.

  1. Array-Scalar Operations: You can perform array-scalar operations, such as multiplying an array by a scalar value (e.g. 3). It is a form of broadcasting.

     first_array = np.arange(4,36,2)
     first_array
     first_array*3
    

    Output:

array([ 4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]) #The first_array

array([ 12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72,  78,  84,  90,  96, 102]) #Note that each element has now been multiplied by 3
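
Multiplication is only one option. The same idea works with other operators, including comparisons. A short sketch using the same first_array (the other names are made up for illustration):

```python
import numpy as np

first_array = np.arange(4, 36, 2)

plus_ten = first_array + 10    # adds 10 to every element
halved = first_array / 2       # division gives floats
above_20 = first_array > 20    # comparison gives an array of True/False

print(plus_ten[:3].tolist())   # [14, 16, 18]
print(halved[:3].tolist())     # [2.0, 3.0, 4.0]
print(int(above_20.sum()))     # 7 (seven values are greater than 20)
```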
  2. Array-Array Operations: These are array-on-array operations such as adding, subtracting, multiplying and dividing arrays.
import numpy as np
first_array = np.arange(4,36,2)
first_array #This command prints out the first array

second_array = np.arange(1,17)
second_array #This command prints out the second array


second_array-first_array #This command subtracts the first array from the second array

second_array/first_array #This command divides the second array by the first array

Output:

array([ 4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]) #The first_array

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16]) #The second_array

array([ -3,  -4,  -5,  -6,  -7,  -8,  -9, -10, -11, -12, -13, -14, -15, -16, -17, -18]) #second_array-first_array

array([0.25      , 0.33333333, 0.375     , 0.4       , 0.41666667, 0.42857143, 0.4375    , 0.44444444, 0.45      , 0.45454545, 0.45833333, 0.46153846, 0.46428571, 0.46666667, 0.46875   , 0.47058824]) #second_array/first_array

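To perform array-array operations, both arrays generally need to have the same shape, which also means an equal number of values. The exception is when NumPy can broadcast one shape to match the other, as it does with a scalar.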
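
Here is a sketch of what happens when shapes don't line up, followed by a case NumPy does accept through broadcasting: a row of 3 values combined with a table that has 3 columns. The names three, four and table are made up for this example.

```python
import numpy as np

three = np.arange(3)   # shape (3,)
four = np.arange(4)    # shape (4,)

try:
    three + four       # shapes (3,) and (4,) do not match
except ValueError as error:
    print("Shapes do not line up:", error)

table = np.ones((2, 3))      # shape (2, 3)
combined = table + three     # the 3 values are added to each row
print(combined.tolist())     # [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
```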

  3. Universal Array Operations: You can apply universal operations to arrays, such as mean, log, sine and cosine. Most basic mathematical operations are already built into NumPy.
first_array = np.arange(4, 36, 2)
first_array
np.mean(first_array) #This command finds the average of all the values in the first_array

Output:

array([ 4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]) #The first_array

19.0 #The average of all the values in the first_array
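
A few more built-in functions, as a quick sketch. np.sqrt and np.sin work element by element, while np.sum collapses the whole array into a single number. The input values are arbitrary.

```python
import numpy as np

squares = np.array([1, 4, 9, 16])

roots = np.sqrt(squares)                     # square root of every element
total = np.sum(squares)                      # 1 + 4 + 9 + 16
sines = np.sin(np.array([0.0, np.pi / 2]))   # sine of 0 and of 90 degrees

print(roots.tolist())   # [1.0, 2.0, 3.0, 4.0]
print(int(total))       # 30
print(sines)            # approximately 0 and 1
```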

Conclusion

NumPy is a useful Python library that is used to manipulate data and perform mathematical operations.

As you continue exploring, you’ll discover more advanced features that make NumPy a useful part of the Python ecosystem. Depending on your area of specialization, you may need to learn other Python libraries such as Pandas and Matplotlib. Keep practicing and keep growing.


Thank you for reading. If you find it useful, kindly share and do send feedback.
