🔍 Visualizing Sorting Algorithms with AI: Using Vision Transformers and GANs to Teach Logic

Khushal Jhaveri

So this project started as something I thought would be cool — I wanted to take boring sorting algorithms like Bubble Sort or Merge Sort and actually visualize them in a way that’s clear and makes people go, “Ohhh, now I get it.”

But instead of just coding animations, I thought — what if I used deep learning to generate those animations? Could a model learn what sorting “looks like”? Could it visualize steps and structure based on data?

That’s where the idea came from: combine Vision Transformers and Conditional GANs to generate and visualize sorting behavior as image sequences.


🧩 The Problem I Wanted to Explore

Sorting algorithms are a classic part of CS, but most people struggle to intuitively understand how they work. Even when there are visualizations, they’re often hard-coded and rigid.

So I wanted to see if I could train a model to:

  1. Understand the step-by-step logic of sorting

  2. Generate images that represent each stage of the algorithm

  3. Use attention (ViTs) and generation (GANs) to explain the algorithm visually


🔧 What I Used to Build It

  • Python

  • PyTorch for model building

  • Vision Transformers (ViT) – to encode position/state of elements

  • Conditional GANs – to generate sorting sequences

  • NumPy / OpenCV / Matplotlib – to render and save images

  • 1000+ generated images across different sorts (QuickSort, BubbleSort, HeapSort, MergeSort)


🛠️ How I Built It

1️⃣ Dataset Generation

  • Simulated thousands of sorting steps across different algorithms

  • For each step, created a grayscale image where bar height = value, and x-position = index

  • Each algorithm had different “flow,” and I labeled images with algorithm type + step order
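The rendering idea above can be sketched in a few lines. This is a minimal NumPy sketch, not the project's actual code — the helper names (`render_step`, `bubble_sort_steps`) are mine, and I'm using Bubble Sort as the example algorithm:

```python
import numpy as np

def render_step(arr, height=64, bar_width=4):
    """Render one sorting step as a grayscale image:
    bar height encodes the value, x-position encodes the index."""
    img = np.zeros((height, len(arr) * bar_width), dtype=np.uint8)
    max_val = max(arr)
    for i, v in enumerate(arr):
        h = int(round(v / max_val * (height - 1))) + 1
        img[height - h:, i * bar_width:(i + 1) * bar_width] = 255
    return img

def bubble_sort_steps(arr):
    """Yield a snapshot of the array after every swap."""
    a = list(arr)
    yield list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                yield list(a)

# Build a labeled dataset: (image, algorithm type, step order)
rng = np.random.default_rng(0)
values = rng.permutation(np.arange(1, 17)).tolist()
dataset = [(render_step(snap), "BubbleSort", t)
           for t, snap in enumerate(bubble_sort_steps(values))]
```

Repeating this across algorithms (with each one's own swap pattern) is what gives each sort its distinct visual "flow."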

2️⃣ Training the Vision Transformer

  • Used ViTs to learn the visual structure and attention flow of sorting

  • Trained it to distinguish and classify which sorting algorithm a given image came from

  • This helped me understand how the ViT attends to swapped or sorted elements
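A tiny ViT-style classifier for this task looks roughly like the sketch below. This is an illustrative minimal architecture (patch embedding → transformer encoder → CLS head), not the exact model from the project — the sizes and the `TinyViT` name are my assumptions:

```python
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Minimal ViT: patch embed -> transformer encoder -> CLS-token head.
    Classifies which sorting algorithm a frame came from (4 classes)."""
    def __init__(self, img_size=64, patch=8, dim=64, depth=2, heads=4, n_classes=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # A strided conv is the standard trick for non-overlapping patch embedding
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=dim * 2,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                      # x: (B, 1, 64, 64)
        x = self.patch_embed(x)                # (B, dim, 8, 8)
        x = x.flatten(2).transpose(1, 2)       # (B, 64 patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])              # classify from the CLS token
```

The attention weights inside `self.encoder` are what you'd visualize as heatmaps over the bar-chart patches.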

3️⃣ Training the Conditional GAN

  • Generator: took algorithm + step number as condition, generated an image

  • Discriminator: checked if the generated image matched real visual behavior

  • Used image reconstruction loss + adversarial loss to improve sequence consistency
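The conditional setup can be sketched like this — a deliberately simplified MLP version (the real networks would likely be convolutional), with embedding sizes and constants chosen by me for illustration:

```python
import torch
import torch.nn as nn

N_ALGOS, MAX_STEPS, Z_DIM, IMG = 4, 128, 64, 64

class Generator(nn.Module):
    """Maps (noise, algorithm id, step index) -> a 64x64 grayscale frame."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEPS, 16)
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + 32, 256), nn.ReLU(),
            nn.Linear(256, IMG * IMG), nn.Tanh())

    def forward(self, z, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([z, cond], dim=1)).view(-1, 1, IMG, IMG)

class Discriminator(nn.Module):
    """Scores a frame as real/fake given the same (algorithm, step) condition."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEPS, 16)
        self.net = nn.Sequential(
            nn.Linear(IMG * IMG + 32, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1))

    def forward(self, img, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([img.flatten(1), cond], dim=1))
```

Training would combine the usual adversarial loss on the discriminator's logits with an L1 reconstruction loss against the simulator's real frame for that same (algorithm, step) condition.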

4️⃣ Putting It Together

  • For each algorithm, I fed in a condition like “step 5 of QuickSort”

  • The model would generate an image showing how the array looks at that step

  • When stitched together → full animation of the sorting process
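The stitching step is simple once you have a frame generator to query. A self-contained sketch, with a dummy stand-in for the trained model (both `make_filmstrip` and `dummy_frame` are hypothetical names I'm introducing here):

```python
import numpy as np

def make_filmstrip(frame_fn, algo="QuickSort", n_steps=8):
    """Query a frame generator for steps 0..n_steps-1 of one algorithm
    and stitch the frames side by side into a single filmstrip image.
    `frame_fn(algo, step)` stands in for the trained conditional generator."""
    frames = [frame_fn(algo, t) for t in range(n_steps)]
    return np.concatenate(frames, axis=1)   # (H, n_steps * W)

def dummy_frame(algo, step, size=32):
    """Illustrative placeholder: a 'progress bar' that fills left to right."""
    img = np.zeros((size, size), dtype=np.uint8)
    img[:, :(step + 1) * size // 8] = 255
    return img

strip = make_filmstrip(dummy_frame)
```

Swapping `dummy_frame` for the generator (plus OpenCV's `VideoWriter` or a GIF writer) turns the same loop into a full animation of the sorting process.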


📈 Results

  • ~40% higher clarity ratings from peer testers, who found the generated visuals easier to interpret than the baseline animations

  • Generated sequences looked surprisingly natural — with clear progression of sorting stages

  • Attention heatmaps from ViT showed the model actually focused on unsorted regions first


💡 What I Learned

  • Sorting is deeply visual if you model it the right way — bar graphs + time = behavior

  • ViTs were great for understanding structure, while GANs were great at generating continuity

  • Combining generative AI with algorithmic logic opens up really cool use cases in ed-tech and learning tools


🧠 Why This Matters

This isn’t just about sorting. It’s about teaching machines how logic flows visually. If a model can learn how a process looks over time, it can apply the same idea to other domains like:

  • Behavioral modeling

  • Procedural decision trees

  • Even email thread evolution, in tools like the ones Abnormal builds


✉️ Let’s Build Smarter EdTech or Vis-ML

I’d love to collaborate with folks working in AI for education, generative visual reasoning, or explainable AI.

📩 LinkedIn | 🔗 GitHub
