🚀 Introduction

Hey there! 👋
If you’ve ever heard the term “clustering” and wondered what it really means, you’re in the right place. Today, I’m going to walk you through a hands-on project where we’ll implement the K-Means Clustering algorithm from scratch in Python. And don’t worry: I’ll guide you step by step, just like I would if we were sitting together with laptops open! 💻☕

We’ll be using the famous Iris dataset and will write every line of code ourselves to truly understand how clustering works. Let's dive in!

🔧 Step 1: Install Python and pip

Let’s make sure Python and pip are installed. The commands in this guide use apt, so they assume a Debian-based Linux system such as Ubuntu.

Command:

sudo apt update

sudo apt install python3

sudo apt install python3-pip

Now, check versions to confirm installation.

Command:

python3 --version

pip3 --version

🔧 Step 2: Create a New Python File

In the terminal, go to the folder where you want to save the file and create the new file.

Command: nano kmeans.py

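This opens a blank file in the nano editor. Since nano takes over the terminal, either open a second terminal window for the next step or exit with Ctrl+X for now and reopen the file later with the same command. We’ll paste code into it soon!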

🔧 Step 3: Install Required Libraries

Before writing the code, let’s install the Python libraries we’ll need.

Command: sudo apt install python3-numpy python3-pandas python3-matplotlib python3-sklearn

These libraries will help us with:

NumPy – for math operations
Pandas – for handling datasets
Matplotlib – for plotting
Scikit-learn – for loading the Iris dataset
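
To see how these libraries fit together, here’s a minimal sketch that loads the Iris dataset with scikit-learn and wraps it in a Pandas DataFrame. The variable names (iris, df, X) are my own picks for illustration and may not match the code in the repository.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the dataset bundled with scikit-learn (no download needed).
iris = load_iris()

# Wrap the measurements in a DataFrame so the columns have readable names.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

print(df.shape)   # (150, 4): 150 flowers, 4 measurements each
print(df.head())  # first five rows

# Plain NumPy array of the features, handy for the math later on.
X = df.to_numpy()
```

Clustering is unsupervised, so the species labels in iris.target are not needed to run K-Means. They are only useful if you want to compare the clusters with the real species afterwards.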

🔧 Step 4: Paste the Code (In Your File)

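To keep things clean and beginner-friendly, I’ve uploaded the complete Python code to my GitHub repository. You can download or clone it directly, then copy the code into kmeans.py. In nano, save with Ctrl+O followed by Enter, and exit with Ctrl+X.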

📂 GitHub Repo → 👉 https://github.com/unaizanouman/K_means-Clustering
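
If you’d like a feel for what a from-scratch implementation can look like before opening the repo, here’s a minimal sketch using only NumPy. This is not the repository’s code: the function names kmeans and assign_labels, the seed and max_iters parameters, and the choice to start from randomly picked data points are all assumptions made for illustration.

```python
import numpy as np


def assign_labels(X, centroids):
    """Return, for each row of X, the index of its nearest centroid."""
    # Euclidean distance from every point to every centroid: shape (n_points, k).
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(X, k, max_iters=100, seed=0):
    """Cluster the rows of the NumPy array X into k groups.

    Returns (centroids, labels).
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct data points picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    for _ in range(max_iters):
        labels = assign_labels(X, centroids)

        # Move each centroid to the mean of the points assigned to it.
        # A cluster that ends up empty keeps its previous centroid.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, assign_labels(X, centroids)
```

With the Iris features in a NumPy array X, a call such as kmeans(X, k=3) would return three centroids and one cluster label per flower. Because the starting points are random, different seeds can give different clusters.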

🔧 Step 5: Run Your Python File

Now run the file using:

Command: python3 kmeans.py

🎉 Boom! A colorful scatter plot will pop up showing three clusters and their red centroids!
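
If you’re curious how a plot like this can be drawn, here’s a minimal Matplotlib sketch. The function name plot_clusters, the color map, and the choice to plot only the first two feature columns are assumptions for illustration, so the repository’s plot may look different.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_clusters(X, labels, centroids):
    """Scatter the first two features, colored by cluster, with red centroids."""
    fig, ax = plt.subplots()
    # One dot per data point; the color comes from its cluster label.
    ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=30)
    # Centroids drawn on top as large red X markers.
    ax.scatter(centroids[:, 0], centroids[:, 1],
               c="red", marker="X", s=200, label="Centroids")
    ax.set_xlabel("Feature 1")
    ax.set_ylabel("Feature 2")
    ax.legend()
    return fig, ax
```

Call plot_clusters with your data, labels, and centroids, then call plt.show() to open the window.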

🧠 What You Learned

You just:

✅ Loaded and explored the Iris dataset

✅ Implemented K-Means Clustering from scratch

✅ Understood how Euclidean Distance works

✅ Visualized clusters using Matplotlib

✅ Ran Python code confidently from the terminal
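
As a quick refresher on Euclidean distance: it is the straight-line distance between two points, computed as the square root of the sum of squared differences between their coordinates. Here’s a small sketch. The helper name euclidean_distance is my own and is not taken from the repository.

```python
import numpy as np


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of coordinates."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sqrt(np.sum((a - b) ** 2))


# Classic 3-4-5 right triangle: sqrt(3**2 + 4**2) = 5
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

K-Means uses this distance to decide which centroid each data point is closest to.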

And the best part? You didn’t use any ready-made clustering model — you coded it yourself. That’s the real power of learning! 💪

I hope this project made you feel more confident working with Python and algorithms.

📌 Stay tuned for my upcoming posts, where I’ll explore real-world datasets, use scikit-learn models, and show how clustering is used in practical applications.

📬 Questions? Suggestions? Just Wanna Say Hi?

Feel free to reach out!

📧 unaizaray@gmail.com

I'd love to hear your feedback or help if you get stuck!

📜 K-Means Clustering in Python: A Step-by-Step Guide for Beginners