Mastering NumPy for Data Science: Matrix Operations, Statistics, and Masking

Irfan Codes

✍️ Intro:

Today I explored some of the most powerful features of NumPy, Python’s go-to library for numerical computing. From creating random matrices to performing matrix multiplications, calculating statistical measures, and masking data — NumPy offers a toolkit that every data scientist needs to master.

In this post, I’ll walk through key operations and share code snippets that can help streamline your data science workflows.


✅ Creating matrices

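import numpy as np

arr = np.random.randint(1, 21, size=(5, 5))  # 5x5 matrix of random integers from 1 to 20 (upper bound 21 is exclusive)
grid = np.arange(1, 17).reshape(4, 4)        # 4x4 matrix holding 1 through 16, filled row by row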
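
Random matrices change on every run, which makes results hard to compare. Here is a minimal sketch of a reproducible version, assuming NumPy's newer `np.random.default_rng` generator. The seed 42 and the variable names are my own choices for illustration:

```python
import numpy as np

# Seeded generator: the same seed gives the same matrix on every run.
rng = np.random.default_rng(42)  # 42 is an arbitrary seed

# Like np.random.randint, the upper bound (21) is exclusive by default.
random_matrix = rng.integers(1, 21, size=(5, 5))

# arange(1, 17) yields 1..16; reshape lays them out row by row.
sequence_matrix = np.arange(1, 17).reshape(4, 4)

print(random_matrix.shape)            # (5, 5)
print(sequence_matrix[0].tolist())    # [1, 2, 3, 4]
print(sequence_matrix[-1].tolist())   # [13, 14, 15, 16]
```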

✅ Modifying matrices

np.fill_diagonal(arr, 0)               # Set the diagonal to 0 in place (returns None)
flat = arr.flatten()                   # Flattened 1-D copy; arr itself is unchanged
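
One thing that can trip you up here: `np.fill_diagonal` changes the array in place and returns `None`, while `flatten()` hands back a copy. A small sketch on a 4x4 example matrix (the data and variable names are assumptions for illustration):

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)

# fill_diagonal mutates arr directly, so there is nothing useful to assign.
result = np.fill_diagonal(arr, 0)

# flatten() always returns a copy, so editing it leaves arr alone.
flat = arr.flatten()
flat[1] = 99

print(result)              # None
print(arr[0].tolist())     # [0, 2, 3, 4]
print(flat[:4].tolist())   # [0, 99, 3, 4]
```

If a copy is not needed, `arr.ravel()` is worth a look, since it returns a view when it can.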

✅ Stats & normalization

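mean = np.mean(array)
median = np.median(array)
std_dev = np.std(array)                # Population standard deviation (ddof=0 by default)
variance = np.var(array)               # Population variance (ddof=0 by default)
normalized = (array - mean) / std_dev  # Z-score: result has mean 0 and std 1 (needs std_dev != 0)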
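
To see the numbers behind these calls, here is a small worked sketch on a hand-picked array (the values are my own example, chosen so the results come out round). It also shows `ddof=1`, which switches from the default population formula to the sample formula:

```python
import numpy as np

array = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mean = np.mean(array)                # 5.0
median = np.median(array)            # 4.5
std_dev = np.std(array)              # 2.0, population std (ddof=0)
variance = np.var(array)             # 4.0, population variance (ddof=0)
sample_std = np.std(array, ddof=1)   # sample std divides by n - 1, so it is larger

# Z-score normalization: subtract the mean, divide by the std.
normalized = (array - mean) / std_dev

print(normalized.tolist())  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After normalization the array has mean 0 and standard deviation 1, which is what many machine learning models expect as input.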

✅ Matrix algebra

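np.dot(array1, array2)                 # Matrix multiplication for 2-D arrays (same as array1 @ array2)
np.linalg.eigvals(matrix)              # Eigenvalues of a square matrix
np.linalg.inv(matrix)                  # Inverse; raises LinAlgError if the matrix is singular
np.linalg.det(matrix)                  # Determinant of a square matrix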
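
A quick way to sanity-check these functions is to run them on a matrix small enough to verify by hand. The sketch below uses a 2x2 symmetric matrix of my own choosing, whose eigenvalues are 1 and 3 and whose determinant is 3:

```python
import numpy as np

matrix = np.array([[2.0, 1.0],
                   [1.0, 2.0]])

# For 2-D arrays, the @ operator and np.dot give the same product.
product = matrix @ matrix
same_as_dot = np.array_equal(product, np.dot(matrix, matrix))

eigenvalues = np.linalg.eigvals(matrix)   # approximately 3 and 1, order not guaranteed
determinant = np.linalg.det(matrix)       # approximately 3
inverse = np.linalg.inv(matrix)

# A matrix times its inverse should give the identity, up to rounding.
is_identity = np.allclose(matrix @ inverse, np.eye(2))

print(product.tolist())   # [[5.0, 4.0], [4.0, 5.0]]
print(same_as_dot)        # True
print(is_identity)        # True
```

Results from `np.linalg` are floating point, so comparing with `np.allclose` or `np.isclose` is safer than `==`.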

✅ Masking

np.ma.masked_greater(array, 10)        # Mask values > 10
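
Masking hides values without deleting them, so the array keeps its shape while calculations skip the masked entries. A minimal sketch on a made-up array (the data is an assumption for illustration), with plain boolean indexing shown alongside for comparison:

```python
import numpy as np

array = np.array([3, 12, 7, 25, 10])

# Values strictly greater than 10 are masked; 10 itself stays visible.
masked = np.ma.masked_greater(array, 10)

print(masked.mask.tolist())        # [False, True, False, True, False]
print(masked.count())              # 3 unmasked values
print(masked.mean())               # mean of 3, 7 and 10 only
print(masked.filled(0).tolist())   # [3, 0, 7, 0, 10]

# Boolean indexing drops the values instead of hiding them.
kept = array[array <= 10]
print(kept.tolist())               # [3, 7, 10]
```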

👉 Conclusion:
NumPy unlocks powerful ways to manipulate and analyze data efficiently. These operations are foundational for deeper machine learning and data science tasks. Stay tuned for my next post as I dive into Pandas and data manipulation!
