LLVM Instruction Selection: Legalization Pass Deep Dive

Yuanbo LiYuanbo Li
6 min read

This article is a follow-up to my previous post LLVM Instruction Selection: SelectionDAG Building, where we explored how SelectionDAGs are constructed from LLVM IR. In this post, we explore the next phase: legalization*, which ensures the DAG conforms to target-specific constraints before instruction selection.*


Overview

Purpose

  • What: To transform a target-independent SelectionDAG into a DAG that contains only legal types and operations supported by the hardware.
  • Why: Real hardware doesn’t support exotic types (i65, f128) or operations (frem, ctlz, vselect) directly.

  • How: The legalizer rewrites or expands these into lower-level equivalents via type splitting, promotion, expansion, or target-specific hooks.

High-Level Behavior

The legalization takes input as a SelectionDAG that may contain illegal types or operations, such as i65 add, frem, or vselect. The goal is to transform this into a legal DAG that includes only hardware-supported types and operations, or ones that the target can lower using LowerOperation. This legalization process occurs in two main phases: Type Legalization, which is handled by the DAGTypeLegalizer, and Operation Legalization, which is managed by the SelectionDAGLegalize.

Why Illegal DAG in the First Place?

What Does “Illegal” Mean in SelectionDAG? In LLVM’s SelectionDAG:

  • An illegal type is one that the target does not natively support (e.g., i128, f16, <3 x i32>).

  • An illegal operation is one that lacks direct target instruction support (e.g., frem, vselect, ctlz, udiv).

These nodes cannot be directly selected into machine instructions and must be legalized.

Why Do Illegal DAG Nodes Exist at All?

LLVM intentionally allows illegal DAG nodes in the pre-legalized SelectionDAG for several technical and design reasons:

  • IR and DAG Are Initially Target-Independent

    • LLVM IR and early SelectionDAG stages are designed to be hardware-agnostic, enabling:

      • Global optimizations before committing to hardware details

      • Easier reasoning over high-level semantics (e.g., i128, frem, vselect.

  • Better optimization

    • Think of early DAGs as expressing the ideal computation, not yet bound to real hardware constraints. Thus it is easier to optimize.

Key Phases

LLVM instruction selection performs legalization in three main phases:

1. CodeGenAndEmitDAG()
   ├── 2. LegalizeTypes() → DAGTypeLegalizer
   │    ├── 3. Scalar type checks
   │    └── 4. Type expansion/promotion
   └── 5. LegalizeVectors() → DAGVectorLegalizer
   │    ├── 6. Vector splitting
   │    └── 7. Cross-lane ops
   │── LegalizeTypes()
   ├── 8. Legalize() → SelectionDAGLegalize
   │    ├── 9. Operation lowering
   │    └── 10. Target hooks (e.g. AMDGPU)

1. Type Legalization

Converts all types in the DAG into types natively supported by the target. It is a worklist based algorithm to legalize all types in the SelectionDAG

File: llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp

Below is the summarized behavior of type legalization. It is a worklist based algorithm to process nodes to legalize types

1. DAGTypeLegalizer::run()
   ├── 2. Init Dummy Root Handle
   │    └── Prevent root deletion, track changes
   │
   ├── 3. Classify All Nodes
   │    ├── Leaf → ReadyToProcess + Worklist
   │    └── Others → Unanalyzed
   │
   ├── 4. While Worklist Not Empty:
   │    ├── 5. Pop Node N from Worklist
   │    │
   │    ├── 6. Legalize Result Types (Top-down)
   │    │    ├── PromoteIntegerResult()
   │    │    ├── ExpandIntegerResult()
   │    │    ├── SoftenFloatResult()
   │    │    └── etc.
   │    │
   │    ├── 7. Legalize Operand Types (Bottom-up)
   │    │    ├── PromoteIntegerOperand()
   │    │    ├── ExpandIntegerOperand()
   │    │    ├── ScalarizeVectorOperand()
   │    │    └── etc.
   │    │
   │    ├── 8. If Node Morphs:
   │    │    └── Replace all values via ReplaceValueWith()
   │    │
   │    └── 9. Mark Node Processed + Notify Users
   │         ├── Users now have one fewer unprocessed operand
   │         └── Push to worklist if all inputs ready
   │
   ├── 10. Cleanup
   │     ├── Restore DAG Root
   │     └── RemoveDeadNodes()
   │
   └── 11. Debug Checks (optional)
        ├── All result/operand types legal?
        └── All nodes properly marked Processed?
  • Promote: Widen narrow types (e.g., i1i32)

  • Expand: Break wide types (e.g., i128{i64, i64})

  • Soften: Turn floating-point types into integer-based representations

  • Scalarize: Break illegal vectors into scalars

  • Split: Decompose unsupported vectors (e.g., <3 x i32><2 x i32> + i32)

2. Operation Legalization

Converts operations that the target cannot directly execute into legal equivalents. Ensures that all DAG opcodes are legal (e.g., frem, vselect, bswap, etc.). Without this phase, ISel cannot emit real machine instructions. It handles both normal and vector ops

Operation Legalization
File: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

High-level Behavior Summary

SelectionDAG::Legalize()
   ├── 1. AssignTopologicalOrder()
   │    └── Enables reverse traversal of DAG nodes
   │
   ├── 2. Init LegalizedNodes and DeleteListener
   │    └── Tracks visited nodes and handles deletion safely
   │
   ├── 3. Init SelectionDAGLegalize Legalizer
   │
   ├── 4. While AnyLegalized:
   │    └── For Node N in reverse topological order:
   │         ├── Skip dead nodes (DeleteNode)
   │         ├── If N not yet legalized:
   │         │    └── Legalizer.LegalizeOp(N)
   │         └── If N becomes dead → Delete
   │
   ├── 5. Break if no new nodes were legalized
   │
   └── 6. RemoveDeadNodes()


SelectionDAGLegalize::LegalizeOp(Node)
   ├── 1. Skip TargetConstant / Register nodes
   │    └── These are already hardware-level and need no legalization
   │
   ├── 2. Debug assertion: all types must be legalized already
   │
   ├── 3. Dispatch per opcode to determine LegalizeAction
   │    ├── Opcode-specific logic: use operand/result VT depending on case
   │    ├── Handles many opcodes specially (e.g., SELECT_CC, VAARG, etc.)
   │    └── Sets both `Action` and `SimpleFinishLegalizing` flags
   │
   ├── 4. If SimpleFinishLegalizing:
   │    ├── Special-case shift/rotate adjustment
   │    ├── Possibly update operands (e.g., mask shift amount)
   │    │
   │    └── Switch (Action):
   │         ├── Legal   → return (already OK)
   │         ├── Custom  → call LowerOperation() → ReplaceNode()
   │         ├── Expand  → call ExpandNode()
   │         ├── LibCall → ConvertNodeToLibcall()
   │         └── Promote → PromoteNode()
   │
   └── 5. Otherwise (SimpleFinishLegalizing == false):
        └── Switch (Opcode):
             ├── ISD::LOAD  → LegalizeLoadOps()
             ├── ISD::STORE → LegalizeStoreOps()
             └── Others     → llvm_unreachable("unknown op")

LowerOperation: provide legalization with target's native operations: pow(x, y) → exp2(y * log2(x))

  • ExpandNode(): Provides a target-independent fallback when the operation is neither legal nor custom-lowered: urem(x, y) → x - (x / y) * y

  • promoteNode(): Handles operations on types that are too small or unsupported, by promoting them to a larger legal type.

    • potential example: add_i8(a, b) -> trunc_i8(zext_i32(a) + zext_i32(b))
  • ConvertNodeToLibcall(): Legalizes a SelectionDAG node by replacing it with a call to a standard library function (a "libcall").

    • frem(x, y) -> call double @fmod(double %x, double %y)

3 Vector Legalization

Handles vector operations or types that are illegal after type legalization. Many vector ops like vselect, shuffle, or <3 x i32> still need lowering for targets with limited SIMD support.

1. Check if DAG contains any vector types
   └── If no vectors → skip legalization
2. Assign topological order to DAG
   └── Ensures legalizing operands before users
3. For each node in topological order:
   └── Call `LegalizeOp()` on the result value
4. Remove dead nodes

Mostly still delegate to Legalize Types and Legalize Ops. It is needed as a separate pass because after it runs, some types may become illegal again.


Thanks for reading! Feel free to reach out if you want to discuss the topics written here and want to suggest new interesting topics.

0
Subscribe to my newsletter

Read articles from Yuanbo Li directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Yuanbo Li
Yuanbo Li