📜 K-Means Clustering in Python: A Step-by-Step Guide for Beginners

🚀 Introduction
Hey there! 👋
If you’ve ever heard the term “Clustering” and wondered what it really means, you’re in the right place. Today, I’m going to walk you through a hands-on project where we’ll implement the K-Means Clustering algorithm from scratch using Python. And don’t worry, I’ll guide you step by step, just like I would if we were sitting together with laptops open! 💻☕
We’ll be using the famous Iris dataset and will write every line of code ourselves to truly understand how clustering works. Let's dive in!
🔧 Step 1: Install Python and pip
Let’s make sure Python and pip are installed. The commands below use apt, so they assume a Debian- or Ubuntu-based system.
Command:
sudo apt update
sudo apt install python3
sudo apt install python3-pip
Now check the versions to confirm that both were installed.
Command:
python3 --version
pip3 --version
🔧 Step 2: Create a New Python File
In the terminal, go to the folder where you want to save the file and create the new file.
Command: nano kmeans.py
This opens a blank file in the nano editor. Keep it open, because we’ll paste code into it soon! (Run the commands in the next step from a second terminal window.)
🔧 Step 3: Install Required Libraries
Before writing the code, let’s install the Python libraries we’ll need.
Command: sudo apt install python3-numpy python3-pandas python3-matplotlib python3-sklearn
These libraries will help us with:
NumPy – for math operations
Pandas – for handling datasets
Matplotlib – for plotting
Scikit-learn – for loading the Iris dataset
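To check that everything installed correctly, here is a small sketch that loads the Iris dataset with scikit-learn and puts it into a Pandas table. The variable names (iris, df) are just my choices for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the Iris dataset that ships with scikit-learn.
iris = load_iris()

# Put the four measurements into a DataFrame with readable column names.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

print(df.shape)    # (150, 4): 150 flowers, 4 measurements each
print(df.head())   # first five rows
```

If this runs without an ImportError, the libraries are ready to use.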
🔧 Step 4: Paste the Code (In Your File)
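To keep things clean and beginner-friendly, I’ve uploaded the complete Python code to my GitHub repository. You can download or clone it directly, then copy the code into your open kmeans.py file. In nano, save with Ctrl+O and Enter, then exit with Ctrl+X.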
📂 GitHub Repo → 👉 https://github.com/unaizanouman/K_means-Clustering
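If you’d like a feel for what a from-scratch K-Means can look like before opening the repo, here is a minimal sketch. It is not necessarily the same code as in the repository: the function name kmeans, its parameters (k, max_iters, seed), and the random choice of starting centroids are all my assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris


def kmeans(X, k, max_iters=100, seed=0):
    """Basic K-Means sketch. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Euclidean distance from every point to every centroid: shape (n, k).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Assign each point to its nearest centroid.
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its points
        # (an empty cluster keeps its old centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


X = load_iris().data                 # 150 flowers, 4 measurements each
labels, centroids = kmeans(X, k=3)
print(centroids.shape)               # (3, 4)
print(np.bincount(labels, minlength=3))   # how many flowers landed in each cluster
```

The loop repeats two steps, assign and update, until the centroids stop moving. A different seed can give slightly different clusters, because K-Means depends on where the centroids start.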
🔧 Step 5: Run Your Python File
Now run the file using:
Command: python3 kmeans.py
🎉 Boom! A colorful scatter plot will pop up showing three clusters and their red centroids!
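A plot like that can be drawn with a couple of Matplotlib scatter calls. The sketch below is my own illustration, not the repo’s plotting code: plot_clusters is a hypothetical helper, and the six data points are made up so the example runs on its own.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_clusters(X, labels, centroids):
    """Scatter the first two features, colored by cluster, with red centroids."""
    fig, ax = plt.subplots()
    # One color per cluster label.
    ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=30)
    # Centroids drawn as large red X markers.
    ax.scatter(centroids[:, 0], centroids[:, 1], c="red", marker="X", s=200,
               label="Centroids")
    ax.set_xlabel("Feature 1")
    ax.set_ylabel("Feature 2")
    ax.set_title("K-Means clusters")
    ax.legend()
    return fig, ax


# Tiny made-up data; swap in your own points, labels, and centroids.
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9], [9.0, 1.0], [8.8, 1.2]])
labels = np.array([0, 0, 1, 1, 2, 2])
centroids = np.array([X[labels == j].mean(axis=0) for j in range(3)])

fig, ax = plot_clusters(X, labels, centroids)
# plt.show()   # uncomment to open the plot window
```

Only the first two columns are plotted, since a flat scatter plot can show just two measurements at a time.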
🧠 What You Learned
You just:
✅ Loaded and explored the Iris dataset
✅ Implemented K-Means Clustering from scratch
✅ Understood how Euclidean Distance works
✅ Visualized clusters using Matplotlib
✅ Ran Python code confidently from the terminal
And the best part? The script doesn’t use any ready-made clustering model. Every step of the algorithm is written out in plain Python. That’s the real power of learning! 💪
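As a quick refresher on the Euclidean Distance mentioned in the list above, here is a tiny worked example with made-up points. The names a, b, and centroids are only for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# sqrt((4-1)^2 + (6-2)^2) = sqrt(9 + 16) = 5
distance = np.sqrt(np.sum((a - b) ** 2))
print(distance)    # 5.0

# K-Means assigns a point to whichever centroid is nearest.
centroids = np.array([[0.0, 0.0], [4.0, 5.0]])
nearest = np.linalg.norm(centroids - b, axis=1).argmin()
print(nearest)     # 1, because (4, 5) is much closer to b than (0, 0)
```

The same straight-line distance works for the four Iris measurements; there are just four differences to square and add up instead of two.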
I hope this project made you feel more confident working with Python and algorithms.
📌 Stay tuned for my upcoming posts, where I’ll explore real-world datasets, use scikit-learn models, and show how clustering is used in practical applications.
📬 Questions? Suggestions? Just Wanna Say Hi?
Feel free to reach out!
📧 unaizaray@gmail.com
I'd love to hear your feedback or help if you get stuck!
Written by

Unaiza Nouman
👩💻 Unaiza Nouman 🎓 CS Student @ COMSATS | 💡 Data Science Enthusiast | 🛠️ Software Developer Curious mind with a passion for building smart, scalable solutions. Exploring the world of: 🐍 Python (Pandas, NumPy) | 📊 Power BI | 🧠 Machine Learning 🧮 SQL Server | ☕ Java | 💻 C++ | 📞 VoIP (Asterisk) 🧵 DSA | 🐧 Linux | 💭 Problem Solving I write to learn, build to grow, and share to inspire. Let’s turn lines of code into something meaningful 🚀