Unit Testing, Then and Now: Algorithms, Mayhem, and LLMs

Welcome to another episode of “Why is this so complicated and why do I kind of love it?” Today’s main character: unit testing — a.k.a. the hero of clean code and mental breakdowns.
So… what even is Unit Test Generation?
Imagine your code is a moody teenager, and the unit tests are the overly prepared parents double-checking that she won’t set the house on fire when left alone.
In layman’s terms:
Unit testing is the process of checking whether small parts of your code (called “units”) behave the way they should. You give some input, check that the output is what you expect, or just jump out of the window if it isn’t.
Now, unit test generation is what happens when you’re too tired (or smart) to write all those test cases manually. You let tools or algorithms generate those tests for you, ideally before things go into production.
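To make that concrete, here’s what a single hand-written unit test might look like in Python (the `add` function and the test name are purely illustrative):

```python
# A hypothetical function under test: one small "unit" of behavior.
def add(a: int, b: int) -> int:
    return a + b

# A hand-written unit test: fixed input, expected output.
def test_add_handles_negatives():
    assert add(2, -3) == -1  # if this fails, the unit misbehaves
```

Run it with a test runner like pytest and you get a pass/fail verdict. Automated generation just means a tool writes hundreds of these for you.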
Historically, this journey looked like:
Manual Testing: Devs wrote every single test. Great for control, horrible for sanity.
Search-Based Testing (e.g., Evolutionary Algorithms): Inspired by evolution — your code basically gets tested by natural selection.
LLMs + AI Agents: Now, we’re giving your code a study buddy with GPT-level powers that can write tests, explain them, and maybe question your coding life choices too.
Part 1: Of Mutants, Objectives, and Multi-Things — The Algorithm Era
Before AI started moonlighting as a test case author, we had something cooler (and scarier): search-based algorithms that basically treated your code like a battlefield, sending in an army of test cases to survive, mutate, and conquer.
Today, we’re diving into three iconic names in the evolutionary test generation world: MOSA, DynaMOSA, and MOI — the alphabet soup that powered smarter unit tests before the LLM hype train.
MOSA: Where it all started.
What is it?
MOSA stands for Many-Objective Sorting Algorithm — and it brought in a revolution to test generation by saying:
“Hey, instead of optimizing one thing, why not start with all the things at once?”
In traditional unit test generation, you’d usually optimize a single aggregate goal like overall code coverage. And MOSA said: “I will treat every single code branch as a separate objective.” That turned unit test generation into a many-objective optimization problem.
How it works:
Objectives: Each branch/statement = its own objective. If your code has 50 branches, MOSA is solving a 50-objective optimization problem.
Population: Maintains a set of test cases (a.k.a. individuals in the evolutionary swarm).
Selection & Evolution: Uses concepts from genetic algorithms — selection, crossover, mutation — but instead of just “survival of the fittest,” it’s more like “survival of the most diverse and useful.”
Fitness Evaluation: Each test case is evaluated on how close it is to satisfying each objective (e.g., how close it is to executing a certain branch). There’s a tiny sketch of this idea right below.
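Here’s that idea as a minimal, very hypothetical Python sketch: each test case ends up with one fitness value per branch, where 0.0 means “covered” and larger values mean “further away.” The `branch_distance` heuristic below is a toy stand-in for the runtime instrumentation that real tools like EvoSuite rely on.

```python
def branch_distance(lhs: float, op: str, rhs: float) -> float:
    """Classic branch-distance heuristic: how far is this predicate from being true?"""
    if op == "==":
        return abs(lhs - rhs)
    if op == "<":
        return 0.0 if lhs < rhs else (lhs - rhs) + 1.0
    if op == ">":
        return 0.0 if lhs > rhs else (rhs - lhs) + 1.0
    raise ValueError(f"unsupported operator: {op}")

def fitness_vector(observed_predicates: dict[int, tuple[float, str, float]]) -> dict[int, float]:
    """One objective per branch id: 0.0 means this test case covered it."""
    return {bid: branch_distance(*pred) for bid, pred in observed_predicates.items()}

# A test case whose execution reached branch 6 (covered) and branch 7 (missed by a lot).
print(fitness_vector({6: (5.0, "==", 5.0), 7: (2.0, ">", 10.0)}))  # {6: 0.0, 7: 9.0}
```

With 50 branches you get a 50-entry vector per test case, and the search tries to push every entry toward zero.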
Results:
A test suite that maximizes code coverage.
Scales well for large codebases with tons of branches.
Promotes diversity in test cases — not just “does it work?” but “does it work in all the weird paths?”
Became a baseline for many follow-up algorithms (like… yes, DynaMOSA).
But MOSA wasn’t perfect. It treated all objectives equally — even the ones that didn’t matter anymore (e.g., already covered). Which brings us to its dynamic upgrade…
DynaMOSA: The Chosen One.
What is it?
DynaMOSA (Dynamic Many-Objective Sorting Algorithm) is the cooler, more strategic evolution of MOSA. While MOSA blindly tried to optimize all objectives (i.e., cover all branches) all the time, DynaMOSA said:
“Let’s stop spending effort on what’s already done, and dynamically focus on the stuff that actually matters.”
How it works:
Many-Objective Framework: Like MOSA, DynaMOSA treats each branch or method condition in the code as a separate objective. But the key difference? It doesn’t treat them equally throughout the search.
Dynamic Objective Selection: At every generation (iteration of the evolutionary process), DynaMOSA filters out already-covered objectives. This drastically cuts down unnecessary evaluations.
Example: If objective O_k (say, a branch at line 21) is already fully covered by some test case, then it’s removed from the objective set — meaning future individuals don’t waste effort trying to cover it again.
Dependency-Based Target Expansion: DynaMOSA doesn’t just focus on uncovered branches in isolation. It uses a control dependency graph to identify dependent objectives — essentially, code regions that are only reachable if some prerequisite branches are covered.
So instead of “cover everything,” it becomes “what do I need to cover to reach something new?” (There’s a rough sketch of this dynamic filtering below.)
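A toy Python sketch of that dynamic filtering, with the control dependency graph faked as a plain dict (child branch mapped to its prerequisite branch). This is just the idea, not how EvoSuite actually stores it:

```python
# Toy sketch of DynaMOSA-style dynamic target selection.
# control_deps: branch -> branch it is control-dependent on (None = entry-level).
control_deps = {1: None, 2: None, 3: 1, 4: 3, 5: 2}

def current_targets(covered: set[int]) -> set[int]:
    """Keep only uncovered branches whose prerequisite branch is already covered."""
    targets = set()
    for branch, parent in control_deps.items():
        if branch in covered:
            continue  # already done, stop spending effort on it
        if parent is None or parent in covered:
            targets.add(branch)  # reachable next step, worth optimizing for
    return targets

print(current_targets(covered=set()))      # {1, 2}  -> only entry-level branches
print(current_targets(covered={1, 2}))     # {3, 5}  -> their children get unlocked
print(current_targets(covered={1, 2, 3}))  # {4, 5}
```

Every generation re-runs this filter, so the objective set shrinks as coverage grows and expands only when new regions become reachable.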
Fitness Evaluation: Each test case (individual) is evaluated based on:
Branch distance: How close it is to triggering a condition.
Approach level: How many control-dependent branch points away the execution ended up from the objective (fewer = closer).
Heuristic-based secondary objectives: e.g., input diversity or method coverage for tiebreaking.
These are then combined using a Pareto-dominance-based approach, ensuring that test cases are compared in a way that balances multiple objectives. (A small sketch of the distance heuristics follows.)
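The first two heuristics are commonly folded into one value per objective, classically something like approach level plus a normalized branch distance, so the branch distance (squashed into [0, 1)) never outweighs a whole level of the control flow. A hedged sketch:

```python
def normalize(distance: float) -> float:
    """Squash a raw branch distance into [0, 1)."""
    return distance / (distance + 1.0)

def objective_fitness(approach_level: int, branch_distance: float) -> float:
    """Lower is better; 0.0 means the objective (branch) was actually covered."""
    return approach_level + normalize(branch_distance)

# Two test cases chasing the same branch:
print(objective_fitness(approach_level=2, branch_distance=7.0))  # still far away in the CFG
print(objective_fitness(approach_level=0, branch_distance=0.5))  # right spot, almost true
```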
Selection and Survival (NSGA-II inspired)
DynaMOSA borrows from NSGA-II, a popular multi-objective genetic algorithm, to:
Rank individuals by non-dominance
Apply crowding distance to maintain population diversity
Select top-ranked individuals for the next generation (a stripped-down sketch of this survival step follows)
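Here’s a deliberately stripped-down sketch of that NSGA-II-flavoured survival step: Pareto dominance on a couple of objectives, plus a crude crowding-distance tiebreak. Real implementations are considerably more careful; this only shows the mechanics.

```python
# Minimal NSGA-II-flavoured ranking sketch (not a faithful reimplementation).
# Each individual: (name, fitness vector); lower is better on every objective.
population = [
    ("t1", [0.0, 2.5]),
    ("t2", [1.0, 0.3]),
    ("t3", [1.0, 2.5]),
]

def dominates(a: list[float], b: list[float]) -> bool:
    """a dominates b: no worse on every objective, strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(pop):
    """Repeatedly peel off the individuals nobody else dominates (rank 0, rank 1, ...)."""
    remaining, fronts = list(pop), []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q[1], p[1]) for q in remaining if q is not p)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

def crowding_distance(front):
    """Crude diversity score: individuals at the extremes of any objective are kept."""
    dist = {name: 0.0 for name, _ in front}
    for m in range(len(front[0][1])):
        ordered = sorted(front, key=lambda ind: ind[1][m])
        dist[ordered[0][0]] = dist[ordered[-1][0]] = float("inf")
        for i in range(1, len(ordered) - 1):
            dist[ordered[i][0]] += ordered[i + 1][1][m] - ordered[i - 1][1][m]
    return dist

fronts = non_dominated_fronts(population)
print([[name for name, _ in front] for front in fronts])  # [['t1', 't2'], ['t3']]
print(crowding_distance(fronts[0]))                       # boundary individuals -> inf
```

Individuals from the best fronts survive first, and within a front the more “spread out” ones win, which is how diversity is kept alive.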
Why DynaMOSA Was a Big Deal:
Faster Convergence: By ignoring already-covered paths, it avoids wasted evaluations.
Smarter Exploration: Dependency tracking lets it unlock complex paths that would otherwise be unreachable with brute-force.
Scalable: Handles large codebases better than MOSA, especially in real-world CI pipelines.
Implemented in EvoSuite and widely tested in empirical studies.
So yeah — DynaMOSA didn’t just test smarter, it adapted mid-flight.
Kind of like evolution, if evolution had a project deadline.
Alright. Time for the final boss of the trio: MOI — short for Many-Objective Improvement. It’s the quiet genius in the corner who doesn’t try to cover everything at once — just focuses on getting better with every move.
MOI: Many-Objective Improvement
What It Is: MOI isn’t here to brute-force your code into submission like MOSA or DynaMOSA. Instead, it’s all about focused, incremental improvement. Rather than optimizing all objectives equally or dynamically pruning them, MOI picks one target at a time, and says:
“Let me improve this one thing, and I’ll do it better than anyone else.”
Think: targeted evolution, like sniper mode instead of shotgun blast.
How It Works:
Single Target Selection: MOI chooses a single uncovered test target (e.g., a branch, method, or mutation) and works only on that. This sidesteps the many-objective crowding problem MOSA and DynaMOSA are sometimes prone to. It doesn’t move on until it achieves measurable progress (an improvement in branch distance or approach level).
Improvement-Based Evolutionary Loop: Unlike regular generational algorithms, MOI optimizes its individuals for the chosen target only and ignores the rest until it’s done. Fitness is measured by how much closer an individual gets to the target (typically via branch distance + approach level), and selection prefers the individuals with the best improvement delta, not raw coverage.
Objective Update Strategy: Only once MOI achieves the goal (i.e., covers the current target) does it move on to the next uncovered one. If no progress is made for a fixed time (a.k.a. timeout or stagnation), it proactively switches targets, so no stuck loops.
No Dominance Sorting: Unlike NSGA-II, MOI doesn’t employ Pareto dominance or multi-objective sorting, because it is goal-oriented. That makes it lighter, faster, and easier to manage, especially when there are lots of targets. (A simplified sketch of this improvement loop follows.)
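To make “sniper mode” concrete, here’s a deliberately simplified sketch of an improvement loop for one target: mutate an input, keep the change only if it strictly reduces the distance to the current target, and bail out after too many stagnant rounds. Every name here (`distance_to_target`, the numeric “branches”) is made up for illustration; it’s the shape of the loop that matters.

```python
import random

# Hypothetical distance: how far a test input is from covering a given branch.
# 0.0 means "covered". In a real tool this would come from runtime instrumentation.
def distance_to_target(test_input: int, target_branch: int) -> float:
    wanted = target_branch * 10  # pretend each branch needs an input near branch * 10
    return float(abs(test_input - wanted))

def improve_single_target(target_branch: int, budget: int = 200, patience: int = 30):
    """MOI-style loop: accept a mutation only if it strictly improves the distance."""
    best_input = random.randint(-100, 100)
    best_dist = distance_to_target(best_input, target_branch)
    stagnant = 0
    for _ in range(budget):
        if best_dist == 0.0:
            return best_input, True                     # target covered, move on
        candidate = best_input + random.randint(-5, 5)  # small mutation
        cand_dist = distance_to_target(candidate, target_branch)
        if cand_dist < best_dist:                       # reward improvement, not raw coverage
            best_input, best_dist, stagnant = candidate, cand_dist, 0
        else:
            stagnant += 1
            if stagnant >= patience:                    # stagnation: switch targets instead
                return best_input, False
    return best_input, False

# Chase one target (say, branch #6); the caller would then pick the next uncovered one.
print(improve_single_target(target_branch=6))
```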
Real Example Flow: Let’s say we have a class UserManager with 25 branches to test.
MOSA: Tries to cover all 25 at once.
DynaMOSA: Covers a few, removes them from the list, focuses on remaining — but still juggles multiple.
MOI: Picks branch #6. Generates tests. If a test case gets slightly closer to triggering #6, it survives. If it gets worse, it’s out.
Only after branch #6 is fully covered does MOI move on to #7.
It’s basically greedy, but in a smart, stateful way.
Why It Mattered:
Scales gracefully to large sets of test targets (hundreds to thousands of objectives).
Avoids the crowding effect, where too many objectives dilute the selection pressure.
Excellent for killing mutants or covering hard-to-reach branches, where concentrated effort pays off.
Particularly useful in sparse-reward situations, where global search stalls and improvements are infrequent.
Bonus Insight: MOI shares ideas with reinforcement learning in spirit — test cases are rewarded if they improve, even slightly. It’s not just about who wins, but who’s trying hardest and making progress.
So yeah — if DynaMOSA is your project manager optimizing sprint goals, MOI is the lone engineer who locks in on a bug and won’t sleep till it’s gone.
Conclusion: Goals, Algorithms, and Ordered Chaos
There you have it: MOSA, DynaMOSA, and MOI. Three evolutionary approaches, one common goal: generate unit tests that are smarter, faster, and more meaningful without breaking the bank. Long before LLMs began writing test cases like they wrote sonnets, these algorithms laid the foundation for our understanding of automated testing at scale, from brute-force many-objective coverage to laser-focused incremental improvement.
However, this was only the prelude.
We’ll jump ahead to the present in Part 2: LLMs, code-aware agents, and tools like EvoSuite, Pynguin, NxtUnit, and many more :)
When machines begin testing themselves, things will get strange, powerful, and surprisingly poetic.
Until then, may your branches be covered and your assertions never fail.
— Ava ☕️