🔍 Visualizing Sorting Algorithms with AI: Using Vision Transformers and GANs to Teach Logic

Khushal Jhaveri

So this project started as something I thought would be cool — I wanted to take boring sorting algorithms like Bubble Sort or Merge Sort and actually visualize them in a way that’s clear and makes people go, “Ohhh, now I get it.”

But instead of just coding animations, I thought — what if I used deep learning to generate those animations? Could a model learn what sorting “looks like”? Could it visualize steps and structure based on data?

That’s where the idea came from: combine Vision Transformers and Conditional GANs to generate and visualize sorting behavior as image sequences.


🧩 The Problem I Wanted to Explore

Sorting algorithms are a classic part of CS, but most people struggle to intuitively understand how they work. Even when there are visualizations, they’re often hard-coded and rigid.

So I wanted to see if I could train a model to:

  1. Understand the step-by-step logic of sorting

  2. Generate images that represent each stage of the algorithm

  3. Use attention (ViTs) and generation (GANs) to explain the algorithm visually


🔧 What I Used to Build It

  • Python

  • PyTorch for model building

  • Vision Transformers (ViT) – to encode position/state of elements

  • Conditional GANs – to generate sorting sequences

  • NumPy / OpenCV / Matplotlib – to render and save images

  • 1000+ generated images across different sorts (QuickSort, BubbleSort, HeapSort, MergeSort)


🛠️ How I Built It

1️⃣ Dataset Generation

  • Simulated thousands of sorting steps across different algorithms

  • For each step, created a grayscale image where bar height = value, and x-position = index

  • Each algorithm had different “flow,” and I labeled images with algorithm type + step order
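The rendering idea above can be sketched in a few lines. This is a minimal NumPy sketch, not the project's actual code — the helper names (`render_step`, `bubble_sort_steps`) are mine, and I'm using Bubble Sort as the example algorithm:

```python
import numpy as np

def render_step(arr, height=64, bar_width=4):
    """Render one sorting step as a grayscale image:
    bar height encodes the value, x-position encodes the index."""
    img = np.zeros((height, len(arr) * bar_width), dtype=np.uint8)
    max_val = max(arr)
    for i, v in enumerate(arr):
        h = int(round(v / max_val * (height - 1))) + 1
        img[height - h:, i * bar_width:(i + 1) * bar_width] = 255
    return img

def bubble_sort_steps(arr):
    """Yield a snapshot of the array after every swap."""
    a = list(arr)
    yield list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                yield list(a)

# Build a labeled dataset: (image, algorithm type, step order)
rng = np.random.default_rng(0)
values = rng.permutation(np.arange(1, 17)).tolist()
dataset = [(render_step(snap), "BubbleSort", t)
           for t, snap in enumerate(bubble_sort_steps(values))]
```

Repeating this across algorithms (with each one's own swap pattern) is what gives each sort its distinct visual "flow."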

2️⃣ Training the Vision Transformer

  • Used ViTs to learn the visual structure and attention flow of sorting

  • Trained it to distinguish and classify which sorting algorithm a given image came from

  • This helped me understand how the ViT attends to swapped or sorted elements
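A tiny ViT-style classifier for this task looks roughly like the sketch below. This is an illustrative minimal architecture (patch embedding → transformer encoder → CLS head), not the exact model from the project — the sizes and the `TinyViT` name are my assumptions:

```python
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Minimal ViT: patch embed -> transformer encoder -> CLS-token head.
    Classifies which sorting algorithm a frame came from (4 classes)."""
    def __init__(self, img_size=64, patch=8, dim=64, depth=2, heads=4, n_classes=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # A strided conv is the standard trick for non-overlapping patch embedding
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=dim * 2,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                      # x: (B, 1, 64, 64)
        x = self.patch_embed(x)                # (B, dim, 8, 8)
        x = x.flatten(2).transpose(1, 2)       # (B, 64 patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])              # classify from the CLS token
```

The attention weights inside `self.encoder` are what you'd visualize as heatmaps over the bar-chart patches.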

3️⃣ Training the Conditional GAN

  • Generator: took algorithm + step number as condition, generated an image

  • Discriminator: checked if the generated image matched real visual behavior

  • Used image reconstruction loss + adversarial loss to improve sequence consistency
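The conditional setup can be sketched like this — a deliberately simplified MLP version (the real networks would likely be convolutional), with embedding sizes and constants chosen by me for illustration:

```python
import torch
import torch.nn as nn

N_ALGOS, MAX_STEPS, Z_DIM, IMG = 4, 128, 64, 64

class Generator(nn.Module):
    """Maps (noise, algorithm id, step index) -> a 64x64 grayscale frame."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEPS, 16)
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + 32, 256), nn.ReLU(),
            nn.Linear(256, IMG * IMG), nn.Tanh())

    def forward(self, z, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([z, cond], dim=1)).view(-1, 1, IMG, IMG)

class Discriminator(nn.Module):
    """Scores a frame as real/fake given the same (algorithm, step) condition."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEPS, 16)
        self.net = nn.Sequential(
            nn.Linear(IMG * IMG + 32, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1))

    def forward(self, img, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([img.flatten(1), cond], dim=1))
```

Training would combine the usual adversarial loss on the discriminator's logits with an L1 reconstruction loss against the simulator's real frame for that same (algorithm, step) condition.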

4️⃣ Putting It Together

  • For each algorithm, I fed in a condition like “step 5 of QuickSort”

  • The model would generate an image showing how the array looks at that step

  • When stitched together → full animation of the sorting process
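The stitching step is simple once you have a frame generator to query. A self-contained sketch, with a dummy stand-in for the trained model (both `make_filmstrip` and `dummy_frame` are hypothetical names I'm introducing here):

```python
import numpy as np

def make_filmstrip(frame_fn, algo="QuickSort", n_steps=8):
    """Query a frame generator for steps 0..n_steps-1 of one algorithm
    and stitch the frames side by side into a single filmstrip image.
    `frame_fn(algo, step)` stands in for the trained conditional generator."""
    frames = [frame_fn(algo, t) for t in range(n_steps)]
    return np.concatenate(frames, axis=1)   # (H, n_steps * W)

def dummy_frame(algo, step, size=32):
    """Illustrative placeholder: a 'progress bar' that fills left to right."""
    img = np.zeros((size, size), dtype=np.uint8)
    img[:, :(step + 1) * size // 8] = 255
    return img

strip = make_filmstrip(dummy_frame)
```

Swapping `dummy_frame` for the generator (plus OpenCV's `VideoWriter` or a GIF writer) turns the same loop into a full animation of the sorting process.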


📈 Results

  • ~40% higher clarity ratings from peer testers, who found the generated visuals easier to interpret than the baseline animations

  • Generated sequences looked surprisingly natural — with clear progression of sorting stages

  • Attention heatmaps from ViT showed the model actually focused on unsorted regions first


💡 What I Learned

  • Sorting is deeply visual if you model it the right way — bar graphs + time = behavior

  • ViTs were great for understanding structure, while GANs were great at generating continuity

  • Combining generative AI with algorithmic logic opens up really cool use cases in ed-tech and learning tools


🧠 Why This Matters

This isn’t just about sorting. It’s about teaching machines how logic flows visually. If a model can learn how a process looks over time, it can apply the same idea to other domains like:

  • Behavioral modeling

  • Procedural decision trees

  • Even email thread evolution, in tools like the ones Abnormal builds


✉️ Let’s Build Smarter EdTech or Vis-ML

I’d love to collaborate with folks working in AI for education, generative visual reasoning, or explainable AI.

📩 LinkedIn | 🔗 GitHub
