Mastering NumPy for Data Science: Matrix Operations, Statistics, and Masking

✍️ Intro:
Today I explored some of the most powerful features of NumPy, Python’s go-to library for numerical computing. From creating random matrices to performing matrix multiplication, calculating statistical measures, and masking data, NumPy offers a toolkit that every data scientist needs to master.
In this post, I’ll walk through key operations and share code snippets that can help streamline your data science workflows.
✅ Creating matrices
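import numpy as np
matrix = np.random.randint(1, 21, size=(5, 5)) # 5x5 matrix of random integers from 1 to 20 (the upper bound 21 is exclusive)
arr = np.arange(1, 17).reshape(4, 4) # 4x4 matrix holding 1 through 16, filled row by row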
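Here is a small runnable sketch of the two creation calls above. The variable names and the seed value are my own choices for illustration. It also shows `np.random.default_rng`, the newer Generator interface, as one way to make the random matrix reproducible.

```python
import numpy as np

# Legacy-style call from the post: values fall in 1..20 because 21 is exclusive.
random_matrix = np.random.randint(1, 21, size=(5, 5))

# Generator-based equivalent; the seed (0) is arbitrary and only there
# so that repeated runs produce the same matrix.
rng = np.random.default_rng(seed=0)
seeded_matrix = rng.integers(1, 21, size=(5, 5))

# arange(1, 17) yields 1..16; reshape lays the values out row by row.
grid = np.arange(1, 17).reshape(4, 4)

print(random_matrix.shape)  # (5, 5)
print(grid[0])              # [1 2 3 4]
print(grid[-1])             # [13 14 15 16]
```

Seeding is optional, but it helps when you want to share results or debug something that depends on the random values.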
✅ Modifying matrices
np.fill_diagonal(arr, 0) # Set the main diagonal to 0 in place (returns None)
flat = arr.flatten() # Flatten into a new 1-D array (a copy, not a view)
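One detail that is easy to miss: `np.fill_diagonal` changes the array in place and returns `None`, while `flatten()` hands back a copy. A minimal sketch, using a 4x4 example array of my own:

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)

# fill_diagonal mutates arr and returns None, so don't assign its result.
result = np.fill_diagonal(arr, 0)
print(result)          # None
print(arr.diagonal())  # [0 0 0 0]

# flatten() returns an independent 1-D copy.
flat = arr.flatten()
flat[1] = 99           # changing the copy...
print(arr[0, 1])       # 2  (...leaves the original untouched)
```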
✅ Stats & normalization
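mean = np.mean(array)
median = np.median(array)
std_dev = np.std(array) # Population standard deviation (ddof=0 by default)
variance = np.var(array) # Population variance (ddof=0 by default)
normalized = (array - mean) / std_dev # Z-score: the result has mean 0 and standard deviation 1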
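To check the numbers by hand, here is the same sequence on a tiny sample array (the values are made up for illustration). It also shows the `ddof=1` option, which gives the sample rather than the population standard deviation.

```python
import numpy as np

array = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mean = np.mean(array)        # 5.0
median = np.median(array)    # 4.5
std_dev = np.std(array)      # 2.0 (population, ddof=0)
variance = np.var(array)     # 4.0 (population, ddof=0)

# Sample standard deviation divides by n - 1 instead of n.
sample_std = np.std(array, ddof=1)

# Z-score normalization: subtract the mean, divide by the standard deviation.
normalized = (array - mean) / std_dev

print(normalized)  # [-1.5 -0.5 -0.5 -0.5  0.   0.   1.   2. ]
```

After normalization the array has mean 0 and standard deviation 1, which is the reason this step shows up so often before feeding data to a model.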
✅ Matrix algebra
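np.dot(array1, array2) # Matrix multiplication for 2-D arrays (same as array1 @ array2)
np.linalg.eigvals(matrix) # Eigenvalues (matrix must be square)
np.linalg.inv(matrix) # Inverse (raises LinAlgError if the matrix is singular)
np.linalg.det(matrix) # Determinant (zero for a singular matrix)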
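Here is a quick sketch of the four calls on small 2x2 matrices, chosen by me so the results are easy to verify by hand. For 2-D arrays, the `@` operator gives the same product as `np.dot`.

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

product = np.dot(array1, array2)
print(product)  # [[19 22]
                #  [43 50]]
print(np.array_equal(product, array1 @ array2))  # True

# A symmetric, invertible matrix with eigenvalues 1 and 3.
matrix = np.array([[2.0, 1.0], [1.0, 2.0]])

eigenvalues = np.sort(np.linalg.eigvals(matrix))  # sorted, since order is not guaranteed
inverse = np.linalg.inv(matrix)
determinant = np.linalg.det(matrix)

# Multiplying a matrix by its inverse should give the identity
# (up to floating-point rounding, hence allclose).
print(np.allclose(matrix @ inverse, np.eye(2)))  # True
```

Results from `np.linalg` are floating-point, so compare them with `np.allclose` or `np.isclose` rather than `==`.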
✅ Masking
masked = np.ma.masked_greater(array, 10) # Mask values strictly greater than 10
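A small sketch of what masking actually does, on sample values I picked for illustration. Masked entries stay in the array but are skipped by calculations such as `mean()`.

```python
import numpy as np

array = np.array([3, 12, 7, 15, 10])

# Values strictly greater than 10 are masked; 10 itself is kept.
masked = np.ma.masked_greater(array, 10)

print(masked.mask)          # [False  True False  True False]
print(masked.compressed())  # [ 3  7 10]  (only the unmasked values)

# Statistics ignore the masked entries: (3 + 7 + 10) / 3
masked_mean = masked.mean()

# A plain boolean filter keeps the same values but drops the positions.
filtered = array[array <= 10]
```

The masked array keeps its original shape, which is handy when the positions of the values matter. The boolean filter is simpler when you only need the surviving values.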
👉 Conclusion:
NumPy unlocks powerful ways to manipulate and analyze data efficiently. These operations are foundational for deeper machine learning and data science tasks. Stay tuned for my next post as I dive into Pandas and data manipulation!
