04 Introduction of a new optimization approach - Equality Saturation

The Problem with Traditional Optimization Methods
Traditional optimization follows a straightforward but limited approach: take in some data (like code), and apply a sequence of transformations one after another. Each transformation aims to improve the code in some way - perhaps making it faster, smaller, or more efficient.
However, this approach has a fundamental limitation: the results can vary based on the order in which transformation rules are applied. Different sequences of transformations often produce wildly different results, sometimes better, sometimes worse.
The Phase Ordering Problem
This issue is known in compiler theory as the Phase Ordering Problem. Consider the following scenario:
Transformation A followed by Transformation B might yield a 15% performance improvement
But Transformation B followed by Transformation A might yield a 25% improvement
While a completely different sequence might yield a 40% improvement
Compilers have no reliable way to determine the optimal order in advance. This leads to unpredictable optimization outcomes and leaves potential performance improvements undiscovered.
Local Profitability Heuristic
Traditional optimizers use what we call a "local profitability heuristic" - they make decisions about whether to apply a transformation based only on whether it seems beneficial at that immediate point. This approach:
Evaluates each transformation in isolation
Applies it if it appears immediately profitable
Cannot "see" future opportunities that might arise from temporarily taking a less optimal path
The key limitation here is that traditional optimization explores just a single path through the vast space of possible program transformations, missing potentially superior alternatives.
Why We Need a New Approach
What we really need is a method that recognizes that the original program and its transformed versions are equivalent in behavior, just expressed differently. An ideal approach would:
Explore multiple equivalent programs simultaneously
Avoid committing to a specific transformation too early
Consider the global impact of transformations
Find the optimal version from all possible equivalent programs
We need to explore the entire region of the space of equivalent programs rather than a single path through it.
Equality Saturation: Explore First, Decide Later
This is where Equality Saturation comes in. Rather than applying transformations sequentially and immediately committing to each one, equality saturation:
Explores all equivalent programs in the region by applying all possible transformations
Builds an e-graph (equality graph) that compactly represents all these equivalent programs
Uses a global profitability heuristic to select the best program from all discovered equivalents
The approach follows an "explore first, then decide" philosophy, which avoids premature commitment to any particular optimization path.
How Equality Saturation Works
Once we've explored the space of equivalent programs and built our e-graph, we determine which equivalent program is optimal according to our cost model. We then convert our selected program representation back into the desired form (such as a control flow graph).
Key Benefits of Equality Saturation
Our approach offers numerous advantages over traditional optimization:
Mitigates the phase ordering problem by considering all transformation sequences simultaneously
Enables translation validation by proving equivalence between original and optimized programs
Produces more optimal code by finding the globally best solution rather than a locally optimal one
Provides referential transparency through reuse of nodes in the e-graph, making the representation compact and efficient
Implementation of Equality Saturation
Implementing equality saturation involves several key components:
E-Graph Construction Using Axioms
PEG (Program Expression Graph) operator axioms - fundamental mathematical properties
Language-specific axioms - equivalences that hold true in the specific programming language
Domain-specific axioms - equivalences that apply to particular problem domains
Extraction Using a Pseudo-Boolean Solver
To extract the optimal program from the e-graph:
Assign a cost to each operation in the e-graph
Impose constraints ensuring the selected operations form a valid program
Use a solver to minimize the total cost while satisfying all constraints
reference: https://www.youtube.com/watch?v=hL2MARuBCzw
Subscribe to my newsletter
Read articles from Corrine directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
