Beginner's Guide to NumPy: From Zero to Hero

Harsh Gohil

📌 Introduction

What is NumPy?
NumPy (Numerical Python) is a powerful Python library used for working with arrays, mathematical functions, and numerical data.

Why use NumPy?
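It's faster and more memory-efficient than regular Python lists for numerical work, because an array stores values of a single type in one contiguous block and its operations run in compiled code. It's also the backbone of many data science and machine learning libraries like Pandas, TensorFlow, and Scikit-learn.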

What will you learn?
In this post, you'll learn:

  • How to install and import NumPy

  • How to create and use NumPy arrays

  • Basic operations like indexing, reshaping, and math with arrays
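Since the rest of the post jumps straight into arrays, here is a minimal sketch of the install-and-import step. It assumes NumPy is installed with pip (the command in the comment); other package managers such as conda work as well.

```python
# Install once from a terminal (assumes pip is available):
#     pip install numpy

import numpy as np  # "np" is the conventional alias used throughout this post

# Confirm the import worked by printing the installed version.
print(np.__version__)
```

If the import fails with ModuleNotFoundError, the package was most likely installed into a different Python environment than the one running the script.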

🧮 NumPy Array Creation: A Beginner-Friendly Guide

NumPy is a core Python library used in data science and machine learning. If you're working with numbers, arrays, or matrices, NumPy is your go-to.

In this section, we'll learn how to create different types of arrays in NumPy: from lists, with default values, as sequences, and as identity matrices.


🔹 1. Creating Arrays from Python Lists

Use np.array() to convert a Python list into a NumPy array.

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr)

#output
[1 2 3 4]

🔹 2. Creating Arrays with Default Values

NumPy offers shortcuts to create arrays filled with default values like zeros, ones, or any number you choose.

✅ np.zeros(shape)

Creates an array filled with 0s.

zeroes_array = np.zeros(3)
print(zeroes_array)

#output
[0. 0. 0.]

✅ np.ones(shape)

Creates an array filled with 1s.

ones_array = np.ones((2, 3))
print(ones_array)

#output
[[1. 1. 1.]
 [1. 1. 1.]]

✅ np.full(shape, value)

Creates an array filled with the value you choose.

filled_array = np.full((2, 2), 7)
print(filled_array)

#output
[[7 7]
 [7 7]]

🔹 3. Creating Sequences with np.arange()

np.arange(start, stop, step) generates evenly spaced values from start up to, but not including, stop.

arr = np.arange(1, 10, 2)
arr2 = np.arange(2, 10, 2)

print(arr)   # [1 3 5 7 9]
print(arr2)  # [2 4 6 8]
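A few more cases make the arguments clearer. This is a small sketch with illustrative variable names, showing that stop is excluded, that start and step have defaults, and that a negative step counts down.

```python
import numpy as np

# stop is exclusive: 10 itself is never included.
evens = np.arange(0, 10, 2)
print(evens)        # [0 2 4 6 8]

# With a single argument, start defaults to 0 and step to 1.
first_five = np.arange(5)
print(first_five)   # [0 1 2 3 4]

# A negative step counts down from start toward stop.
countdown = np.arange(5, 0, -1)
print(countdown)    # [5 4 3 2 1]
```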

🔹 4. Creating Identity Matrices with np.eye()

An identity matrix is a square matrix with 1s on the diagonal and 0s elsewhere. Use np.eye() to create one.

identity_matrix = np.eye(3)
print(identity_matrix)

#output
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
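What makes the identity matrix useful is that matrix-multiplying by it changes nothing. The sketch below checks that with the @ operator (matrix multiplication, which this post does not otherwise cover); the 2x2 values are arbitrary.

```python
import numpy as np

matrix = np.array([[2, 3],
                   [4, 5]])
identity = np.eye(2)

# Matrix-multiplying by the identity leaves the values unchanged.
product = matrix @ identity
print(product)
print(np.array_equal(product, matrix))  # True
```

Note that product is a float array, because np.eye() produces floats by default.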

๐Ÿ“ Understanding NumPy Array Properties and Operations

After creating arrays, it's important to explore what they contain and how they behave. Here are the most commonly used array attributes and operations in NumPy.


🔹 1. .shape – Array Shape (Rows, Columns)

import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(arr_2d.shape)

# 📘 Output: (2, 3)
🧠 Tells you the array's length along each axis; for a 2D array that is (rows, columns).

🔹 2. .size – Total Number of Elements

arr = np.array([[10, 20, 30], [40, 50, 60]])
print(arr.size)

# 📘 Output: 6
🧠 Tells you how many elements are in the entire array.

🔹 3. .ndim – Number of Dimensions

arr_1d = np.array([1, 2, 3])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_3d = np.array([[[1, 2], [3, 4], [5, 6]]])

print(arr_1d.ndim, arr_2d.ndim, arr_3d.ndim)

# 📘 Output: 1 2 3
🧠 Shows how many dimensions (axes) the array has:

1D → Vector
2D → Matrix
3D → Tensor

🔹 4. .dtype – Data Type of Elements

arr = np.array([10, 20, 30.5, 40])
print(arr.dtype)

# 📘 Output: float64
🧠 Every element of a NumPy array shares one data type, and .dtype tells you which one (int, float, etc.). Here the single float 30.5 makes NumPy store the whole array as float64.

🔹 5. Type Conversion (astype())

Convert an array's data type using .astype().

arr = np.array([1.2, 2.5, 3.8])
print(arr.dtype)

int_arr = arr.astype(int)
print(int_arr)
print(int_arr.dtype)

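# 📘 Output:
float64
[1 2 3]
int64

🧠 Converts each element to an integer by dropping the fractional part (no rounding). The default integer type is int64 on most platforms; some setups, such as older NumPy versions on Windows, show int32 instead.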
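Truncation and rounding give different answers, especially for negative numbers. Here is a small sketch comparing the two; the sample values are made up for illustration.

```python
import numpy as np

values = np.array([1.7, -1.7, 2.5])

# astype(int) drops the fractional part (truncates toward zero).
truncated = values.astype(int)

# np.round rounds to the nearest integer first (halves go to the even
# neighbor, so 2.5 becomes 2.0), then astype converts the dtype.
rounded = np.round(values).astype(int)

print(truncated)  # [ 1 -1  2]
print(rounded)    # [ 2 -2  2]
```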

🔧 NumPy Arithmetic Operations

NumPy supports element-wise math operations directly on arrays, with no loops needed!

arr = np.array([10, 20, 30])

print(arr + 2)   # Add 2 to every element
print(arr - 2)   # Subtract 2
print(arr * 2)   # Multiply by 2
print(arr ** 2)  # Square each element

# 📘 Output:
[12 22 32]
[ 8 18 28]
[20 40 60]
[100 400 900]

📊 NumPy Statistical Functions

Useful for data analysis and scientific work.

arr = np.array([10, 20, 30, 40, 50])

print(np.sum(arr))   # Total
print(np.mean(arr))  # Average
print(np.min(arr))   # Minimum
print(np.max(arr))   # Maximum
print(np.std(arr))   # Standard Deviation
print(np.var(arr))   # Variance

# 📘 Output:
150
30.0
10
50
14.14...
200.0
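On 2D data these functions also accept an axis argument, which is often what you want for per-column or per-row statistics. A minimal sketch, with a made-up scores array:

```python
import numpy as np

scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

total = np.sum(scores)              # all elements
col_totals = np.sum(scores, axis=0)  # collapse the rows: one total per column
row_totals = np.sum(scores, axis=1)  # collapse the columns: one total per row
row_means = np.mean(scores, axis=1)  # one average per row

print(total)       # 210
print(col_totals)  # [50 70 90]
print(row_totals)  # [ 60 150]
print(row_means)   # [20. 50.]
```

A handy way to remember it: the axis you pass is the one that disappears from the result.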

๐Ÿ” Indexing and Slicing in NumPy

Accessing and manipulating elements in an array is super easy with NumPy. Here's how you can index, slice, and filter arrays effectively.


🔹 1. Basic Indexing

Access individual elements using square brackets.

🧠 Note: Indices start at 0, and negative indices count from the end of the array.

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr[0])   # First element → 10
print(arr[2])   # Third element → 30
print(arr[-1])  # Last element → 50

🔹 2. Indexing in 2D Arrays

Use [row, column] format.

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(arr_2d[0, 1])  # First row, second column → 2
print(arr_2d[1, 2])  # Second row, third column → 6

🔹 3. Slicing Arrays

Use [start:stop:step] to get a range of values.

arr = np.array([10, 20, 30, 40, 50, 60])

print(arr[1:5])    # Elements from index 1 to 4 → [20 30 40 50]
print(arr[:4])     # First 4 elements → [10 20 30 40]
print(arr[::2])    # Every second element → [10 30 50]
print(arr[::-1])   # Reverse the array → [60 50 40 30 20 10]

🧠 Remember:

  • start:stop goes from start to stop - 1

  • step=-1 reverses the array
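Two things about slices tend to surprise beginners: they work per axis on 2D arrays, and they return views rather than copies. The sketch below shows both, using illustrative variable names.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# One slice per axis: every row, columns 1 onward.
right_cols = arr_2d[:, 1:]
print(right_cols)
# [[2 3]
#  [5 6]]

# A slice is a view: writing through it changes the original array.
arr = np.array([10, 20, 30, 40, 50])
middle = arr[1:4]
middle[0] = 99
print(arr)  # [10 99 30 40 50]

# Call .copy() when an independent array is needed.
safe = arr[1:4].copy()
safe[0] = 0
print(arr)  # [10 99 30 40 50]  (unchanged this time)
```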

🔹 4. Fancy Indexing

You can pass a list of indices to extract multiple elements.

arr = np.array([10, 20, 30, 40, 50, 60])

print(arr[[0, 2, 4]])  # → [10 30 50]

🔹 5. Boolean Indexing (Filtering)

Select elements based on a condition:

arr = np.array([10, 20, 30, 40, 50])

print(arr[arr > 25])  # → [30 40 50]

🧠 This is super useful in data cleaning, filtering, and masking operations.
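It helps to see that the condition is itself an array of True/False values. Here is a short sketch that prints that mask, combines two conditions, and uses a mask to overwrite values; the thresholds are arbitrary.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# The condition produces a boolean array (a "mask").
mask = arr > 25
print(mask)  # [False False  True  True  True]

# Combine conditions with & (and) and | (or); the parentheses are required.
in_range = arr[(arr > 15) & (arr < 45)]
print(in_range)  # [20 30 40]

# A mask can also pick the elements to overwrite.
capped = arr.copy()
capped[capped > 30] = 30
print(capped)  # [10 20 30 30 30]
```

Python's plain `and` / `or` do not work element-wise on arrays, which is why the symbols & and | are used here.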

📌 Summary

  • Use arr[index] for basic access

  • Use arr[start:stop:step] for slicing

  • Use arr[[i1, i2]] for fancy indexing

  • Use arr[condition] for filtering

These operations help you quickly select and manipulate array data without any loops!

🔄 Reshaping and Flattening Arrays in NumPy

Sometimes you need to change the shape of your array, from 1D to 2D or vice versa. NumPy makes this easy using reshape(), flatten(), and ravel().


🔹 1. Reshaping Arrays with .reshape()

You can reshape a 1D array into a 2D matrix (or any other shape) if the total number of elements remains the same.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)

# 📘 Output:
[[1 2 3]
 [4 5 6]]

🧠 .reshape(rows, columns) returns the same elements arranged in the new shape; the values and their order do not change.
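You don't have to work out every dimension yourself. As a small sketch: passing -1 lets NumPy infer one dimension, and a shape that doesn't fit the element count raises an error.

```python
import numpy as np

arr = np.arange(1, 13)  # 12 elements: 1 through 12

# -1 asks NumPy to work out that dimension from the element count.
three_rows = arr.reshape(3, -1)
print(three_rows.shape)  # (3, 4)

# A shape whose sizes do not multiply to 12 raises ValueError.
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```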

🔹 2. Flattening Arrays

NumPy offers two main ways to flatten a multi-dimensional array into 1D:

✅ flatten() – Returns a copy

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d.flatten())  # [1 2 3 4 5 6]

✅ ravel() – Returns a view when possible

print(arr_2d.ravel())    # [1 2 3 4 5 6]

📌 Key Difference:

  • flatten() always returns a new array (a copy of the data)

  • ravel() returns a view whenever it can (modifying it may affect the original) and falls back to a copy otherwise
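The copy-versus-view difference is easy to check for yourself. This sketch uses np.shares_memory to test whether two arrays use the same underlying data, then modifies each flattened result.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr_2d.flatten()
flat_view = arr_2d.ravel()

print(np.shares_memory(arr_2d, flat_copy))  # False
print(np.shares_memory(arr_2d, flat_view))  # True

flat_copy[0] = 100   # the original is untouched
flat_view[1] = 200   # the original changes too
print(arr_2d)
# [[  1 200   3]
#  [  4   5   6]]
```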

🎯 Summary

Function  | What It Does           | Notes
reshape() | Changes shape of array | Must match element count
flatten() | Flattens to 1D (copy)  | Safe to modify
ravel()   | Flattens to 1D (view)  | Faster, but changes may affect the original

🔧 Modifying NumPy Arrays

NumPy provides several functions to insert, append, delete, stack, and split arrays. These are useful for data preprocessing, matrix transformation, and more.


🔹 np.insert() – Insert Elements

arr = np.array([10, 20, 30, 40])
new_arr = np.insert(arr, 2, 100)
print(new_arr)  # [ 10  20 100  30  40]

For 2D arrays, use the axis parameter:

arr_2d = np.array([[1, 2], [3, 4]])
print(np.insert(arr_2d, 1, [5, 6], axis=0))  # Insert row before index 1: rows [1 2], [5 6], [3 4]
print(np.insert(arr_2d, 1, [5, 6], axis=1))  # Insert column before index 1: rows [1 5 2], [3 6 4]

📌 axis=0 → inserts a row, axis=1 → inserts a column; with the default axis=None the array is flattened first. np.insert() returns a new array and leaves the original unchanged.

🔹 np.append() – Append Elements

arr = np.array([10, 20, 30])
new_arr = np.append(arr, [40, 50])
print(new_arr)  # [10 20 30 40 50]

🔹 np.concatenate() – Join Arrays

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(np.concatenate((arr1, arr2)))  # [1 2 3 4 5 6]

You can also join 2D arrays along rows or columns using axis=0 or axis=1.
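For 2D inputs, the axis decides whether the second array lands below or beside the first. A minimal sketch with two made-up 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

rows_joined = np.concatenate((a, b), axis=0)  # b goes below a
cols_joined = np.concatenate((a, b), axis=1)  # b goes to the right of a

print(rows_joined.shape)  # (4, 2)
print(cols_joined)
# [[1 2 5 6]
#  [3 4 7 8]]
```

The arrays must agree in size along every axis except the one being joined.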

🔹 np.delete() – Remove Elements

arr = np.array([10, 20, 30, 40])
new_arr = np.delete(arr, 0)
print(new_arr)  # [20 30 40]

2D Example:

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(np.delete(arr_2d, 0, axis=0))  # Delete first row

🔹 Stacking Arrays

  • np.vstack() – Vertical (row-wise) stack

  • np.hstack() – Horizontal (column-wise) stack

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(np.vstack((arr1, arr2)))  # 2D result with rows [1 2 3] and [4 5 6]
print(np.hstack((arr1, arr2)))  # [1 2 3 4 5 6]

🔹 Splitting Arrays

Split one array into multiple sub-arrays:

arr = np.array([10, 20, 30, 40, 50, 60])
print(np.split(arr, 2))  # Split into 2 equal parts: [10 20 30] and [40 50 60]

Also works with 2D arrays:

  • np.hsplit() → split horizontally (columns)

  • np.vsplit() → split vertically (rows)
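Here is a quick sketch of both 2D helpers on a made-up 2x6 grid. It also shows np.array_split, a related function that accepts splits that don't divide evenly (np.split raises an error in that case).

```python
import numpy as np

grid = np.arange(1, 13).reshape(2, 6)
# [[ 1  2  3  4  5  6]
#  [ 7  8  9 10 11 12]]

left, right = np.hsplit(grid, 2)   # two blocks of 3 columns each
top, bottom = np.vsplit(grid, 2)   # two blocks of 1 row each
print(left.shape, top.shape)       # (2, 3) (1, 6)

# np.split needs an even division; np.array_split allows uneven parts.
parts = np.array_split(np.arange(7), 3)
print([p.tolist() for p in parts])  # [[0, 1, 2], [3, 4], [5, 6]]
```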

✅ Summary Table

Function         | Purpose
np.insert()      | Insert elements into array
np.append()      | Append elements
np.concatenate() | Combine arrays
np.delete()      | Delete elements
np.vstack()      | Stack arrays vertically
np.hstack()      | Stack arrays horizontally
np.split()       | Split array equally

🚀 NumPy Broadcasting – Fast Array Operations

❌ The Problem with Loops

In vanilla Python:

prices = [100, 200, 300]
discount = 10

final_prices = []
for price in prices:
    final_price = price - (price * discount / 100)
    final_prices.append(final_price)

print(final_prices)  # [90.0, 180.0, 270.0]

✅ It works, but it's slow: a Python loop handles one element at a time in the interpreter, which becomes a bottleneck for large data.

✅ NumPy Solution: Broadcasting

import numpy as np

prices = np.array([100, 200, 300])
discount = 10

final_prices = prices - (prices * discount / 100)
print(final_prices)  # [ 90. 180. 270.]

📌 Broadcasting allows NumPy to apply operations between arrays of different but compatible shapes (here, an array and a single number) without writing loops.

🔹 More Broadcasting Examples

Multiply all elements:

arr = np.array([100, 200, 300])
print(arr * 2)  # [200 400 600]

Add vector to matrix (row-wise broadcasting):

matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
result = matrix + vector
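print(result)
# [[11 22 33]
#  [14 25 36]]

➡ NumPy automatically stretches the vector to match the matrix shape.

⚠️ Example: Incompatible Shapes

arr1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
arr2 = np.array([1, 2])  # shape (2,)
result = arr1 + arr2  # ❌ ValueError: operands could not be broadcast together with shapes (2,3) (2,)

To fix it, make sure the shapes are broadcast-compatible: compared from the last axis backward, each pair of dimensions must be equal or one of them must be 1. Here you could reshape arr2 to (2, 1) to add one value per row, or use a length-3 vector to add one value per column, depending on intent.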
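As a sketch of one possible fix for the mismatch above, assuming the intent is to add one value per row: turn arr2 into a column. The last lines use np.broadcast_shapes (available in NumPy 1.20 and later) to check compatibility without doing any arithmetic.

```python
import numpy as np

arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])   # shape (2, 3)
arr2 = np.array([1, 2])        # shape (2,)

# Turn arr2 into a column so each value lines up with one row.
column = arr2.reshape(2, 1)    # shape (2, 1)
result = arr1 + column
print(result)
# [[2 3 4]
#  [6 7 8]]

# np.broadcast_shapes reports the resulting shape without doing the math.
combined_shape = np.broadcast_shapes((2, 3), (2, 1))
print(combined_shape)  # (2, 3)
```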

๐Ÿ” Element-wise Operations โ€“ Python Lists vs NumPy Arrays

๐Ÿข Native Python (with zip + list comprehension)

list1 = [1, 2, 3]
list2 = [4, 5, 6]

result = [x + y for x, y in zip(list1, list2)]
print(result)  # [5, 7, 9]

✅ It works, but it's less readable and doesn't scale well for large data.

⚡ NumPy Makes It Easier

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

result = arr1 + arr2
print(result)  # [5 7 9]

  • No zip

  • No loops

  • Clean, efficient, and fast

🧮 Scalar Operations

arr = np.array([10, 20, 30])
print(arr * 3)  # [30 60 90]

➡ NumPy applies the operation to each element automatically, thanks to broadcasting.

🧼 Cleaning Data with NaN and Inf in NumPy

📌 Detecting NaN (Not a Number)

import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])
print(np.isnan(arr))  # [False False  True False  True False]

np.isnan() returns a boolean array marking NaN values.

โ— Note:

print(np.nan == np.nan)  # False

You can't compare NaN with ==. Always use np.isnan().
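Because np.isnan() returns a boolean mask, it plugs straight into the filtering techniques from earlier. A small sketch: count the NaN values, drop them, and compare np.mean with np.nanmean (which skips NaN).

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])

nan_mask = np.isnan(arr)
nan_count = nan_mask.sum()   # True counts as 1
print(nan_count)             # 2

# Keep only the valid entries by inverting the mask with ~.
valid = arr[~nan_mask]
print(valid)                 # [1. 2. 4. 6.]

# A single NaN makes the plain mean NaN; np.nanmean skips NaN values.
plain_mean = np.mean(arr)
clean_mean = np.nanmean(arr)
print(plain_mean)            # nan
print(clean_mean)            # 3.25
```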

🔄 Replacing NaN with a Value

cleaned_arr1 = np.nan_to_num(arr)  # By default, replaces NaN with 0
cleaned_arr2 = np.nan_to_num(arr, nan=100)
print(cleaned_arr1)  # [1. 2. 0. 4. 0. 6.]
print(cleaned_arr2)  # [  1.   2. 100.   4. 100.   6.]

⚠️ Detecting and Replacing Infinite Values

arr = np.array([1, 2, np.inf, 4, -np.inf, 6])
print(np.isinf(arr))
# [False False  True False  True False]

You can also replace them using np.nan_to_num:

cleaned_arr = np.nan_to_num(arr, posinf=1000, neginf=-1000)
print(cleaned_arr)  # [    1.     2.  1000.     4. -1000.     6.]

✅ Summary

  • np.isnan() → Check for NaN

  • np.isinf() → Check for inf / -inf

  • np.nan_to_num() → Replace all problematic values in one go

This is critical for data cleaning in machine learning pipelines or large-scale numerical computation.
