🔍 Visualizing Sorting Algorithms with AI: Using Vision Transformers and GANs to Teach Logic

This project started as something I thought would be cool: take “boring” sorting algorithms like Bubble Sort or Merge Sort and visualize them clearly enough that people go, “Ohhh, now I get it.”
But instead of just hand-coding animations, I wondered: what if I used deep learning to generate those animations? Could a model learn what sorting “looks like”? Could it visualize the steps and structure directly from data?
That’s where the idea came from: combine Vision Transformers and Conditional GANs to generate and visualize sorting behavior as image sequences.
🧩 The Problem I Wanted to Explore
Sorting algorithms are a classic part of CS, but most people struggle to intuitively understand how they work. Even when there are visualizations, they’re often hard-coded and rigid.
So I wanted to see if I could train a model to:
Understand the step-by-step logic of sorting
Generate images that represent each stage of the algorithm
Use attention (ViTs) and generation (GANs) to explain the algorithm visually
🔧 What I Used to Build It
Python
PyTorch for model building
Vision Transformers (ViT) – to encode position/state of elements
Conditional GANs – to generate sorting sequences
NumPy / OpenCV / Matplotlib – to render and save images
1000+ generated images across different sorts (QuickSort, BubbleSort, HeapSort, MergeSort)
🛠️ How I Built It
1️⃣ Dataset Generation
Simulated thousands of sorting steps across different algorithms
For each step, created a grayscale image where bar height = value, and x-position = index
Each algorithm had different “flow,” and I labeled images with algorithm type + step order
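To make the rendering concrete, here's a minimal sketch of how a sort's intermediate states can be turned into grayscale frames with NumPy. The function names (`render_state`, `bubble_sort_frames`) and the 64-pixel height are illustrative choices, not the exact dataset code:

```python
import numpy as np

def render_state(arr, height=64):
    """Render an array state as a grayscale image:
    bar height = value, x-position = index.
    Assumes values lie in [1, height]."""
    img = np.zeros((height, len(arr)), dtype=np.uint8)
    for i, v in enumerate(arr):
        img[height - v:, i] = 255  # white bar growing up from the bottom
    return img

def bubble_sort_frames(arr):
    """Run bubble sort and capture one frame per swap (plus the start)."""
    a = list(arr)
    frames = [render_state(a)]
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                frames.append(render_state(a))
    return frames

frames = bubble_sort_frames([5, 3, 8, 1, 9, 2, 7, 4, 6, 10])
```

Each frame can then be saved to disk and labeled with its algorithm type and step index.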
2️⃣ Training the Vision Transformer
Used ViTs to learn the visual structure and attention flow of sorting
Trained it to classify which sorting algorithm a given image came from
This helped me see how the ViT attends to swapped or still-unsorted elements
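A toy version of that classifier looks something like this. It's a sketch of the ViT idea (patchify with a strided conv, prepend a CLS token, run a Transformer encoder, classify the CLS output); the sizes, depth, and class count here are illustrative, not the trained model's real config:

```python
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Minimal ViT-style classifier over 64x64 grayscale sorting frames."""
    def __init__(self, img_size=64, patch=8, dim=32, n_classes=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Patch embedding: each 8x8 patch becomes one token
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)  # one logit per algorithm

    def forward(self, x):                                     # x: (B, 1, 64, 64)
        t = self.patch_embed(x).flatten(2).transpose(1, 2)    # (B, 64, dim)
        t = torch.cat([self.cls.expand(len(x), -1, -1), t], dim=1) + self.pos
        return self.head(self.encoder(t)[:, 0])               # classify CLS token

model = TinyViT()
logits = model(torch.randn(2, 1, 64, 64))  # batch of 2 frames -> (2, 4) logits
```

Training this with cross-entropy on the labeled frames is what lets you later inspect which patches the attention weights concentrate on.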
3️⃣ Training the Conditional GAN
Generator: took algorithm + step number as condition, generated an image
Discriminator: checked if the generated image matched real visual behavior
Used image reconstruction loss + adversarial loss to improve sequence consistency
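Here's a compact sketch of that conditional setup. The embedding sizes, MLP shapes, and loss weighting are stand-ins to show the structure (condition on algorithm + step, combine adversarial and L1 reconstruction loss), not the actual architecture:

```python
import torch
import torch.nn as nn

Z, N_ALGOS, MAX_STEP, IMG = 64, 4, 100, 64  # illustrative sizes

class CondGenerator(nn.Module):
    """Decode (noise, algorithm id, step number) into a 64x64 frame."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEP, 16)
        self.net = nn.Sequential(
            nn.Linear(Z + 32, 256), nn.ReLU(),
            nn.Linear(256, IMG * IMG), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, z, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([z, cond], dim=1)).view(-1, 1, IMG, IMG)

class CondDiscriminator(nn.Module):
    """Score whether a frame is plausible for that algorithm + step."""
    def __init__(self):
        super().__init__()
        self.algo_emb = nn.Embedding(N_ALGOS, 16)
        self.step_emb = nn.Embedding(MAX_STEP, 16)
        self.net = nn.Sequential(
            nn.Linear(IMG * IMG + 32, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, img, algo, step):
        cond = torch.cat([self.algo_emb(algo), self.step_emb(step)], dim=1)
        return self.net(torch.cat([img.flatten(1), cond], dim=1))

G, D = CondGenerator(), CondDiscriminator()
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

z = torch.randn(8, Z)
algo = torch.randint(0, N_ALGOS, (8,))
step = torch.randint(0, MAX_STEP, (8,))
real = torch.rand(8, 1, IMG, IMG) * 2 - 1  # stand-in for real dataset frames

fake = G(z, algo, step)
# Generator objective: fool D *and* stay close to the real frame,
# which is what keeps consecutive steps visually consistent.
g_loss = bce(D(fake, algo, step), torch.ones(8, 1)) + l1(fake, real)
```

The L1 term is what nudges "step 5" to actually resemble the true step-5 frame rather than just any plausible bar chart.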
4️⃣ Putting It Together
For each algorithm, I fed in a condition like “step 5 of QuickSort”
The model would generate an image showing how the array looks at that step
When stitched together → full animation of the sorting process
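The stitching itself is straightforward: collect one generated frame per conditioned step, stack them along a time axis, and hand the result to any video writer. A minimal sketch (the `stitch` helper and the hold-on-last-frame choice are illustrative):

```python
import numpy as np

def stitch(frames, fps=10):
    """Stack per-step frames into one (T, H, W) clip; repeat the final
    frame for ~1 second so the fully sorted state lingers on screen."""
    hold = [frames[-1]] * fps
    return np.stack(frames + hold)

# frames would come from the generator, one image per "step N of <algorithm>";
# here three blank 64x64 grayscale frames stand in for them.
frames = [np.zeros((64, 64), dtype=np.uint8) for _ in range(3)]
clip = stitch(frames)
# Export with whatever you have handy, e.g. cv2.VideoWriter or
# imageio.mimsave("quicksort.gif", clip)
```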
📈 Results
Roughly 40% higher clarity ratings from peer testers, who found the generated visuals easier to interpret
Generated sequences looked surprisingly natural — with clear progression of sorting stages
Attention heatmaps from ViT showed the model actually focused on unsorted regions first
💡 What I Learned
Sorting is deeply visual if you model it the right way — bar graphs + time = behavior
ViTs were great for understanding structure, while GANs were great at generating continuity
Combining generative AI with algorithmic logic opens up really cool use cases in ed-tech and learning tools
🧠 Why This Matters
This isn’t just about sorting. It’s about teaching machines how logic flows visually. If a model can learn how a process looks over time, it can apply the same idea to other domains like:
Behavioral modeling
Procedural decision trees
Even the evolution of email threads over time, in tools like the ones Abnormal builds
✉️ Let’s Build Smarter EdTech or Vis-ML
I’d love to collaborate with folks working in AI for education, generative visual reasoning, or explainable AI.