📌 Introduction

What is NumPy?
NumPy (Numerical Python) is a powerful Python library used for working with arrays, mathematical functions, and numerical data.

Why use NumPy?
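NumPy arrays store elements of a single type in contiguous memory, so math on them is typically much faster and more memory-efficient than the same work on regular Python lists. It's also the backbone of many data science and machine learning libraries like Pandas, TensorFlow, and Scikit-learn.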

What will you learn?
In this post, you’ll learn:

How to install and import NumPy
How to create and use NumPy arrays
Basic operations like indexing, reshaping, and math with arrays

🧮 NumPy Array Creation — A Beginner-Friendly Guide

NumPy is a core Python library used in data science and machine learning. If you're working with numbers, arrays, or matrices — NumPy is your go-to.

In this post, we'll learn how to create different types of arrays in NumPy: from lists, with default values, sequences, and identity matrices.

🔹 1. Creating Arrays from Python Lists

Use np.array() to convert a Python list into a NumPy array.

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr)

#output
[1 2 3 4]

🔹 2. Creating Arrays with Default Values

NumPy offers shortcuts to create arrays filled with default values like zeros, ones, or any number you choose.

✅ `np.zeros(shape)`

Creates an array filled with 0s.

zeroes_array = np.zeros(3)
print(zeroes_array)

#output
[0. 0. 0.]

✅ `np.ones(shape)`

Creates an array filled with 1s.

ones_array = np.ones((2, 3))
print(ones_array)

#output
[[1. 1. 1.]
 [1. 1. 1.]]

✅ `np.full(shape, value)`

Creates an array of the given shape filled with the value you choose.

filled_array = np.full((2, 2), 7)
print(filled_array)

#output
[[7 7]
 [7 7]]

🔹 3. Creating Sequences with `np.arange()`

np.arange(start, stop, step) generates evenly spaced values from start up to, but not including, stop.

arr = np.arange(1, 10, 2)
arr2 = np.arange(2, 10, 2)

print(arr)   # [1 3 5 7 9]
print(arr2)  # [2 4 6 8]
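
The exclusive `stop` is easy to miss, so here is a small sketch of it. The `np.linspace` call is not part of the original walkthrough; it is included only as an assumption about what you might reach for when you want the endpoint included.

```python
import numpy as np

# stop is exclusive: 10 is not part of the result
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# With a single argument, arange counts from 0 up to (not including) stop
first_five = np.arange(5)
print(first_five)  # [0 1 2 3 4]

# np.linspace takes a number of points instead of a step and includes the endpoint
points = np.linspace(0, 1, 5)
print(points)  # [0.   0.25 0.5  0.75 1.  ]
```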

🔹 4. Creating Identity Matrices with `np.eye()`

An identity matrix is a square matrix with 1s on the diagonal and 0s elsewhere. Use np.eye() to create one.

identity_matrix = np.eye(3)
print(identity_matrix)

#output
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

📐 Understanding NumPy Array Properties and Operations

After creating arrays, it's important to explore what they contain and how they behave. Here are the most commonly used array attributes and operations in NumPy.

🔹 1. `.shape` – Array Shape (Rows, Columns)

import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(arr_2d.shape)

#📘  Output: (2, 3)
🧠 Tells you the number of rows and columns in the array.

🔹 2. `.size` – Total Number of Elements

arr = np.array([[10, 20, 30], [40, 50, 60]])
print(arr.size)

#📘 Output: 6
🧠 Tells you how many elements are in the entire array.

🔹 3. `.ndim` – Number of Dimensions

arr_1d = np.array([1, 2, 3])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_3d = np.array([[[1, 2], [3, 4], [5, 6]]])

print(arr_1d.ndim, arr_2d.ndim, arr_3d.ndim)

#📘 Output: 1 2 3
🧠 Shows how many dimensions (axes) the array has:

1D → Vector
2D → Matrix
3D → Tensor

🔹 4. `.dtype` – Data Type of Elements

arr = np.array([10, 20, 30.5, 40])
print(arr.dtype)

#📘 Output: float64
🧠 NumPy uses specific data types. This tells you whether the array stores int, float, etc.
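
You can also pick the data type yourself when creating an array. The sketch below uses the `dtype` argument of `np.array()`; the variable names are just for illustration.

```python
import numpy as np

# Ask for a specific type with the dtype argument
floats = np.array([1, 2, 3], dtype=np.float64)
print(floats)  # [1. 2. 3.]

# int8 stores each element in a single byte
small_ints = np.array([1, 2, 3], dtype=np.int8)
print(small_ints.dtype, small_ints.itemsize)  # int8 1

# Mixing ints and floats upcasts everything to float
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```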

🔹 5. Type Conversion (`astype()`)

Convert an array's data type using .astype().

arr = np.array([1.2, 2.5, 3.8])
print(arr.dtype)

int_arr = arr.astype(int)
print(int_arr)
print(int_arr.dtype)

#📘 Output:
float64
[1 2 3]
int64

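🧠 Converts each element to an integer by dropping the fractional part (truncation, not rounding). The exact integer type, int64 here, can differ by platform and NumPy version.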
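
Truncation can surprise you with negative numbers or when you expected rounding. A minimal sketch of the difference, using `np.round()` (not shown in the original text) as one way to round before converting:

```python
import numpy as np

values = np.array([1.7, -1.7, 2.2])

# astype(int) drops the fractional part (truncates toward zero)
truncated = values.astype(int)
print(truncated)  # [ 1 -1  2]

# To round instead, call np.round before converting
rounded = np.round(values).astype(int)
print(rounded)  # [ 2 -2  2]
```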

🔧 NumPy Arithmetic Operations

NumPy supports element-wise math operations directly on arrays — no loops needed!

arr = np.array([10, 20, 30])

print(arr + 2)   # Add 2 to every element
print(arr - 2)   # Subtract 2
print(arr * 2)   # Multiply by 2
print(arr ** 2)  # Square each element

#📘 Output:
[12 22 32]
[ 8 18 28]
[20 40 60]
[100 400 900]

📊 NumPy Statistical Functions

Useful for data analysis and scientific work.

arr = np.array([10, 20, 30, 40, 50])

print(np.sum(arr))   # Total
print(np.mean(arr))  # Average
print(np.min(arr))   # Minimum
print(np.max(arr))   # Maximum
print(np.std(arr))   # Standard Deviation
print(np.var(arr))   # Variance

#📘 Output:
150
30.0
10
50
14.14...
200.0
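
On 2D arrays, the same functions accept an `axis` argument so you can summarize per column or per row. A short sketch, with a made-up `scores` array for illustration:

```python
import numpy as np

scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

total = np.sum(scores)
print(total)  # 210 (all elements)

col_totals = np.sum(scores, axis=0)
print(col_totals)  # [50 70 90] (one total per column)

row_means = np.mean(scores, axis=1)
print(row_means)  # [20. 50.] (one mean per row)
```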

🔍 Indexing and Slicing in NumPy

Accessing and manipulating elements in an array is super easy with NumPy. Here's how you can index, slice, and filter arrays effectively.

🔹 1. Basic Indexing

Access individual elements using square brackets.

🧠 Note: Negative indices count from the end of the array

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr[0])   # First element → 10
print(arr[2])   # Third element → 30
print(arr[-1])  # Last element → 50

🔹 2. Indexing in 2D Arrays

Use [row, column] format.

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(arr_2d[0, 1])  # First row, second column → 2
print(arr_2d[1, 2])  # Second row, third column → 6

🔹 3. Slicing Arrays

Use [start:stop:step] to get a range of values.

arr = np.array([10, 20, 30, 40, 50, 60])

print(arr[1:5])    # Elements from index 1 to 4 → [20 30 40 50]
print(arr[:4])     # First 4 elements → [10 20 30 40]
print(arr[::2])    # Every second element → [10 30 50]
print(arr[::-1])   # Reverse the array → [60 50 40 30 20 10]

🧠 Remember:

start:stop goes from start to stop - 1 (stop itself is excluded)
step=-1 reverses the array
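
The same slice syntax works on 2D arrays, with one slice per axis separated by a comma. A small sketch, using a made-up `grid` array:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# First two rows, all columns
first_two_rows = grid[:2]
print(first_two_rows)
# [[1 2 3 4]
#  [5 6 7 8]]

# All rows, second column only
second_column = grid[:, 1]
print(second_column)  # [ 2  6 10]

# Rows from index 1 onward, columns from index 2 onward
block = grid[1:, 2:]
print(block)
# [[ 7  8]
#  [11 12]]
```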

🔹 4. Fancy Indexing

You can pass a list of indices to extract multiple elements.

arr = np.array([10, 20, 30, 40, 50, 60])

print(arr[[0, 2, 4]])  # → [10 30 50]

🔹 5. Boolean Indexing (Filtering)

Select elements based on a condition:

arr = np.array([10, 20, 30, 40, 50])

print(arr[arr > 25])  # → [30 40 50]

🧠 This is super useful in data cleaning, filtering, and masking operations.
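
Conditions can be combined, too. The sketch below uses `&` (and), `|` (or), and `~` (not) on boolean masks; Python's `and`/`or` keywords do not work element-wise on arrays.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Parentheses are required: & and | bind more tightly than comparisons
in_range = arr[(arr > 15) & (arr < 45)]
print(in_range)  # [20 30 40]

extremes = arr[(arr < 20) | (arr > 40)]
print(extremes)  # [10 50]

# ~ negates a mask
not_large = arr[~(arr > 25)]
print(not_large)  # [10 20]
```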

📌 Summary

Use arr[index] for basic access
Use arr[start:stop:step] for slicing
Use arr[[i1, i2]] for fancy indexing
Use arr[condition] for filtering

These operations help you quickly select and manipulate array data — without any loops!

🔄 Reshaping and Flattening Arrays in NumPy

Sometimes you need to change the shape of your array — from 1D to 2D or vice versa. NumPy makes this easy using reshape(), flatten(), and ravel().

🔹 1. Reshaping Arrays with `.reshape()`

You can reshape a 1D array into a 2D matrix (or any other shape) if the total number of elements remains the same.

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)

#📘 Output:
[[1 2 3]
 [4 5 6]]

🧠 .reshape(rows, columns) rearranges the elements without changing the data.
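
Two things are worth trying here: letting NumPy infer one dimension with `-1`, and seeing what happens when the element count doesn't match. A minimal sketch (the `reshape_failed` flag is only there to make the outcome visible):

```python
import numpy as np

arr = np.arange(1, 13)  # 12 elements

# -1 asks NumPy to work out that dimension from the element count
three_rows = arr.reshape(3, -1)
print(three_rows.shape)  # (3, 4)

# A shape that does not account for all 12 elements raises ValueError
try:
    arr.reshape(5, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```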

🔹 2. Flattening Arrays

NumPy offers two main ways to flatten a multi-dimensional array into 1D:

✅ `flatten()` – Returns a copy

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d.flatten())  # [1 2 3 4 5 6]

✅ `ravel()` – Returns a view when possible

print(arr_2d.ravel())    # [1 2 3 4 5 6]

📌 Key Difference:

flatten() returns a new array (copy of data)
ravel() returns a view when possible (modifying it may affect the original) and falls back to a copy when the data isn't laid out contiguously
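
The copy-versus-view difference is easiest to see by modifying the results. A small sketch, using `np.shares_memory()` (not mentioned in the original text) to confirm what is going on:

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])

# flatten() gives an independent copy
flat_copy = original.flatten()
flat_copy[0] = 99
print(original[0, 0])  # 1 (the original is untouched)

# ravel() on a contiguous array gives a view
flat_view = original.ravel()
flat_view[0] = 99
print(original[0, 0])  # 99 (the view shares memory with the original)

print(np.shares_memory(original, flat_view))  # True
print(np.shares_memory(original, flat_copy))  # False
```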

🎯 Summary

Function	What It Does	Notes
`reshape()`	Changes shape of array	Must match element count
`flatten()`	Flattens to 1D (copy)	Safe to modify
`ravel()`	Flattens to 1D (view when possible)	Avoids a copy, but changes may affect the original

🔧 Modifying NumPy Arrays

NumPy provides several functions to insert, append, delete, stack, and split arrays. These are useful for data preprocessing, matrix transformation, and more.

🔹 `np.insert()` – Insert Elements

arr = np.array([10, 20, 30, 40])
new_arr = np.insert(arr, 2, 100)
print(new_arr)  # [ 10  20 100  30  40]

For 2D arrays, use the axis parameter:

arr_2d = np.array([[1, 2], [3, 4]])
print(np.insert(arr_2d, 1, [5, 6], axis=0))  # Insert row
print(np.insert(arr_2d, 1, [5, 6], axis=1))  # Insert column

📌 axis=0 → row-wise, axis=1 → column-wise, axis=None flattens array first.
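
Here is roughly what each `axis` choice produces for the 2D example above. The variable names are only for illustration; note that `np.insert()` returns a new array and leaves the input unchanged.

```python
import numpy as np

arr_2d = np.array([[1, 2], [3, 4]])

# axis=0: insert [5, 6] as a new row before row index 1
with_row = np.insert(arr_2d, 1, [5, 6], axis=0)
print(with_row)
# [[1 2]
#  [5 6]
#  [3 4]]

# axis=1: insert [5, 6] as a new column before column index 1
with_col = np.insert(arr_2d, 1, [5, 6], axis=1)
print(with_col)
# [[1 5 2]
#  [3 6 4]]

# No axis: the array is flattened first, then 9 goes in at index 1
flattened = np.insert(arr_2d, 1, 9)
print(flattened)  # [1 9 2 3 4]
```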

🔹 `np.append()` – Append Elements

arr = np.array([10, 20, 30])
new_arr = np.append(arr, [40, 50])
print(new_arr)  # [10 20 30 40 50]

🔹 `np.concatenate()` – Join Arrays

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(np.concatenate((arr1, arr2)))  # [1 2 3 4 5 6]

You can also stack in 2D using axis=0 or axis=1.
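
A quick sketch of the 2D case, with two made-up 2x2 arrays `a` and `b`:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# axis=0 places b below a (more rows)
stacked_rows = np.concatenate((a, b), axis=0)
print(stacked_rows.shape)  # (4, 2)

# axis=1 places b to the right of a (more columns)
side_by_side = np.concatenate((a, b), axis=1)
print(side_by_side)
# [[1 2 5 6]
#  [3 4 7 8]]
```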

🔹 `np.delete()` – Remove Elements

arr = np.array([10, 20, 30, 40])
new_arr = np.delete(arr, 0)
print(new_arr)  # [20 30 40]

2D Example:

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(np.delete(arr_2d, 0, axis=0))  # Delete first row

🔹 Stacking Arrays

np.vstack() – Vertical (row-wise) stack
np.hstack() – Horizontal (column-wise) stack

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print(np.vstack((arr1, arr2)))
print(np.hstack((arr1, arr2)))
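
For reference, this is what the two calls give for 1D inputs. The sketch repeats the arrays above and adds the shapes so the difference is easy to spot:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# vstack turns each 1D array into a row of a 2D result
vertical = np.vstack((arr1, arr2))
print(vertical)
# [[1 2 3]
#  [4 5 6]]
print(vertical.shape)  # (2, 3)

# hstack joins 1D arrays end to end
horizontal = np.hstack((arr1, arr2))
print(horizontal)  # [1 2 3 4 5 6]
print(horizontal.shape)  # (6,)
```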

🔹 Splitting Arrays

Split one array into multiple sub-arrays:

arr = np.array([10, 20, 30, 40, 50, 60])
print(np.split(arr, 2))  # Split into 2 equal parts

Also works with 2D arrays:

np.hsplit() → split horizontally (columns)
np.vsplit() → split vertically (rows)
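
A small sketch of what the split functions return. Passing a list of indices to `np.split()` is an extra option not covered above; the `matrix` array is made up for the 2D part.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])

# An integer splits into that many equal parts (returns a list of arrays)
halves = np.split(arr, 2)
print(halves)  # [array([10, 20, 30]), array([40, 50, 60])]

# A list of indices splits at those positions instead
parts = np.split(arr, [2, 5])
print([p.tolist() for p in parts])  # [[10, 20], [30, 40, 50], [60]]

matrix = np.arange(1, 9).reshape(2, 4)
left, right = np.hsplit(matrix, 2)  # split the columns in half
top, bottom = np.vsplit(matrix, 2)  # split the rows in half
print(left.shape, top.shape)  # (2, 2) (1, 4)
```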

✅ Summary Table

Function	Purpose
`np.insert()`	Insert elements into array
`np.append()`	Append elements
`np.concatenate()`	Combine arrays
`np.delete()`	Delete elements
`np.vstack()`	Stack arrays vertically
`np.hstack()`	Stack arrays horizontally
`np.split()`	Split array equally

🚀 NumPy Broadcasting – Fast Array Operations

❌ The Problem with Loops

In vanilla Python:

prices = [100, 200, 300]
discount = 10

final_prices = []
for price in prices:
    final_price = price - (price * discount / 100)
    final_prices.append(final_price)

print(final_prices)

✅ It works, but every iteration runs in the Python interpreter, so it gets slow on large data.

✅ NumPy Solution: Broadcasting

import numpy as np

prices = np.array([100, 200, 300])
discount = 10

final_prices = prices - (prices * discount / 100)
print(final_prices)  # [ 90. 180. 270.]

📌 Broadcasting allows NumPy to apply operations between arrays of different shapes without writing loops.

🔹 More Broadcasting Examples

Multiply all elements:

arr = np.array([100, 200, 300])
print(arr * 2)  # [200 400 600]

Add vector to matrix (row-wise broadcasting):

matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])
result = matrix + vector
print(result)
# [[11 22 33]
#  [14 25 36]]

➡ NumPy automatically stretches the vector to match matrix shape.

⚠️ Example: Incompatible Shapes

arr1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
arr2 = np.array([1, 2])  # shape (2,)
result = arr1 + arr2  # ❌ ValueError: operands could not be broadcast together with shapes (2,3) (2,)

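To fix it, make sure the shapes are broadcast-compatible. Here you could reshape arr2 to (2, 1) to add one value per row, or use a 3-element array instead (shape (3,) or (1, 3)) to add one value per column, depending on intent.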
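
One possible fix, sketched under the assumption that you want one value added per row:

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
arr2 = np.array([1, 2])                  # shape (2,)

# Turn arr2 into a column of shape (2, 1) so it lines up with the rows
column = arr2.reshape(2, 1)
result = arr1 + column
print(result)
# [[2 3 4]
#  [6 7 8]]

# Adding the original (2,) array still raises ValueError
try:
    arr1 + arr2
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```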

🔁 Element-wise Operations – Python Lists vs NumPy Arrays

🐢 Native Python (with `zip` + list comprehension)

list1 = [1, 2, 3]
list2 = [4, 5, 6]

result = [x + y for x, y in zip(list1, list2)]
print(result)  # [5, 7, 9]

✅ It works, but it's not as readable and doesn't scale well for large data.

⚡ NumPy Makes It Easier

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

result = arr1 + arr2
print(result)  # [5 7 9]

No zip
No loops
Clean, efficient, and fast

🧮 Scalar Operations

arr = np.array([10, 20, 30])
print(arr * 3)  # [30 60 90]

➡ NumPy applies the operation to each element automatically — thanks to broadcasting.

🧼 Cleaning Data with NaN and Inf in NumPy

📌 Detecting NaN (Not a Number)

import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])
print(np.isnan(arr))  # [False False  True False  True False]

np.isnan() returns a boolean array marking NaN values.

❗ Note:

print(np.nan == np.nan)  # False

You can't compare NaN with ==. Always use np.isnan().
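
Since `np.isnan()` returns a boolean mask, you can count or drop NaN values with it. A small sketch; `np.nanmean()` is not covered in the original text and is shown only as one NaN-aware alternative to `np.mean()`.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan, 6])

# True counts as 1, so summing the mask counts the NaN values
nan_count = int(np.isnan(arr).sum())
print(nan_count)  # 2

# ~ flips the mask, keeping only the non-NaN values
valid = arr[~np.isnan(arr)]
print(valid)  # [1. 2. 4. 6.]

# A single NaN makes the regular mean NaN; nanmean skips NaN values
print(np.mean(arr))     # nan
nan_safe_mean = np.nanmean(arr)
print(nan_safe_mean)    # 3.25
```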

🔄 Replacing NaN with a Value

cleaned_arr1 = np.nan_to_num(arr)  # Default replaces NaN with 0
cleaned_arr2 = np.nan_to_num(arr, nan=100)
print(cleaned_arr1)  # [1. 2. 0. 4. 0. 6.]
print(cleaned_arr2)  # [  1.   2. 100.   4. 100.   6.]

⚠️ Detecting and Replacing Infinite Values

arr = np.array([1, 2, np.inf, 4, -np.inf, 6])
print(np.isinf(arr))  
# [False False  True False  True False]

You can also replace them using np.nan_to_num:

cleaned_arr = np.nan_to_num(arr, posinf=1000, neginf=-1000)
print(cleaned_arr)  # [    1.     2.  1000.     4. -1000.     6.]
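
If you would rather drop the bad values than replace them, `np.isfinite()` (not mentioned in the original text) flags everything that is neither NaN nor infinite. A minimal sketch:

```python
import numpy as np

arr = np.array([1, np.nan, np.inf, 4, -np.inf, 6])

# True only where the value is a regular finite number
finite_mask = np.isfinite(arr)
print(finite_mask)  # [ True False False  True False  True]

# Keep just the finite values
finite_values = arr[finite_mask]
print(finite_values)  # [1. 4. 6.]
```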

✅ Summary

np.isnan() → Check for NaN
np.isinf() → Check for inf / -inf
np.nan_to_num() → Replace all problematic values in one go

This is critical for data cleaning in machine learning pipelines or large-scale numerical computation.

Beginner's Guide to NumPy: From Zero to Hero