Type-Stability with SnoopCompile.jl and Cthulhu.jl for High-Performance Julia


This post was written by Steven Whitaker.
The Julia programming language is a high-level language that boasts the ability to achieve C-like speeds. Julia can run fast, despite being a dynamic language, because it is compiled and has smart type-inference.
Type-inference is the process of the Julia compiler reasoning about the types of objects, enabling compilation to create efficient machine code for the types at hand.
However, Julia code can be written in a way that prevents type-inference from succeeding---specifically, by writing type-unstable functions. (I'll explain type-instability later on.) When type-inference fails, Julia has to compile generic machine code that can handle any type of input, sacrificing the C-like performance and instead running more like an interpreted language such as Python.
Fortunately, there are tools that Julia developers can use to track down what code causes type-inference to fail. Among the most powerful of these tools are SnoopCompile.jl and Cthulhu.jl. Using these tools, developers can fix type-inference failures and restore the C-like performance they were hoping to achieve.
In this post, we will learn about type-stability and how it impacts performance. Then we will see how to use SnoopCompile.jl and Cthulhu.jl to locate and resolve type-instabilities.
Type-Stability
A function is type-stable if the type of the function's output can be concretely determined given the types of the inputs to the function, without any runtime information.
To illustrate, consider the following function methods:
f(x::Int) = "stable"
f(x::Float64) = rand(Bool) ? 1 : 2.0
In this example, if we call f(x) where x is an Int, the compiler can figure out that the output will be a String without knowing the value of x, so f(x::Int) is type-stable. In other words, it doesn't matter whether x is 1, -1, or 176859431; the return value will always be a String if x is an Int.
On the other hand, if we call f(x) where x is a Float64, the compiler doesn't know whether the output will be an Int or a Float64 because that depends on the result of rand(Bool), which is computed at runtime. Therefore, f(x::Float64) is type-unstable.
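One way to check this from the REPL (a quick sanity check, not part of the original example) is Base.return_types, which reports what inference deduces from the argument types alone:

```julia
f(x::Int) = "stable"
f(x::Float64) = rand(Bool) ? 1 : 2.0

# For the Int method, inference deduces a single concrete return type.
Base.return_types(f, (Int,))      # -> Any[String]

# For the Float64 method, inference can only narrow the result to a Union.
Base.return_types(f, (Float64,))  # -> Any[Union{Float64, Int64}]
```

A Union (or, worse, Any) in the inferred return type is a telltale sign of type-instability.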
Here's a more subtle example of type-instability:
function g(x)
    if x < 0
        return 0
    else
        return x
    end
end
In this example, g(x) is type-unstable because the output will be either an Int or whatever the type of x is, and it all depends on the value of x, which isn't known at compile time. (Note, however, that g(x) is type-stable if x is an Int because then both branches of the if statement return the same type of value.)
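A common fix for this pattern is to make both branches return the same type, for example with zero(x), which produces the zero value of x's own type. Here's a hypothetical stable variant (g_stable is a name I've made up for illustration):

```julia
function g(x)
    if x < 0
        return 0
    else
        return x
    end
end

# zero(x) has the same type as x, so both branches
# now return the same type for any numeric input.
g_stable(x) = x < 0 ? zero(x) : x

Base.return_types(g, (Float64,))         # -> Any[Union{Float64, Int64}]
Base.return_types(g_stable, (Float64,))  # -> Any[Float64]
```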
And sometimes a function that might look type-stable can be type-unstable depending on the input. For example:
h(x::Array) = x[1] + 1
In this case, h([1]) is type-stable, but h(Any[1]) is not. Why? Because with h([1]), x is a Vector{Int}, so the compiler knows that the type of x[1] will be Int. On the other hand, with h(Any[1]), x is a Vector{Any}, so the compiler thinks x[1] could be of any type.
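We can confirm this with Base.return_types again (a small check of my own, not from the original post):

```julia
h(x::Array) = x[1] + 1

# With a Vector{Int}, inference knows x[1] is an Int, so h returns an Int.
Base.return_types(h, (Vector{Int},))  # -> Any[Int64]

# With a Vector{Any}, x[1] could be anything, so inference gives up.
Base.return_types(h, (Vector{Any},))  # -> Any[Any]
```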
To reiterate: a function is type-stable if the compiler can figure out the concrete type of the return value given only the types of the inputs, without any runtime information.
When Compilation Occurs
Another aspect of type-inference that is useful to understand is when compilation (including type-inference) occurs.
In a static language like C, an entire program is compiled before any code runs. This is possible because the types of all variables are known in advance, so machine code specific to those types can be generated in advance.
In an interpreted language like Python, code is never compiled to machine code ahead of time because variables are dynamic, meaning their types aren't known until the variables are actually used (i.e., at runtime).
Julia programs can lie pretty much anywhere between the extremes of C and Python, and where on that spectrum a program lies depends on type-stability.
In a just-in-time (JIT) compiled language like Julia, compilation occurs once types are known.
- If a Julia program is completely type-stable, type-inference can figure out the types of all variables in the program before running any code. As a result, the entire program can be compiled as if it were written in a static language. This is what allows Julia to achieve C-like speeds.
- If a Julia program is entirely type-unstable, every function has to be compiled individually. In this case, compilation occurs at the moment the function is called because that's when the runtime information of all the input types is finally known. Furthermore, the machine code for a type-unstable function cannot be efficient because it must be able to handle a wide range of potential types. As a result, despite being compiled, the code runs essentially like an interpreted language.
Running a Julia program with type-instabilities is like driving down the street and hitting all the red lights. Julia will compile all the code for which type-inference succeeds and then start running. But when the program reaches a function call that could not be inferred, that's like a car stopping at a red light; Julia stops running the code to compile the function call now that it knows the runtime types of the inputs. After the function is compiled, the program can continue execution, like how the car can continue driving once the light turns green.
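This stop-and-compile behavior is easy to observe with @time (a small illustration of my own, with a made-up function foo):

```julia
foo(x) = 2x + 1

# The first call stops to run type-inference and compile foo(::Int);
# @time reports that one-time compilation overhead.
@time foo(1)  # first call: time dominated by compilation
@time foo(1)  # second call: already compiled, runs immediately
```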
Type-Stability and Performance
As this analogy implies, and as I've stated before, type-stability has performance implications. Type-instabilities can cause various performance degradations, including:
- Dynamic (aka runtime) dispatch. If the compiler knows the input types to a function, the generated machine code can include a call to the specific method determined by those types. But if the compiler doesn't know those types, the machine code has to include instructions to perform dynamic dispatch. As a result, rather than jumping directly to the correct method, Julia has to spend runtime CPU cycles to look up the correct method to call.
- Increased memory allocations. If the compiler doesn't know what type a variable will have, it's impossible to put it in a register or even allocate stack space for it. As a result, it has to be heap-allocated and managed by the garbage collector.
- Suboptimal compiled code. Imagine summing the contents of an array in a loop. If the compiler knows the array contains just Float64s, it can perform optimizations to compute the sum as efficiently as possible, e.g., by using specialized CPU instructions. Such optimizations cannot occur if the compiler doesn't know what type of data it's working with.
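To make the last point concrete, here is a sketch (my own example, not from the post) of summing the same values stored with a concrete versus an abstract element type:

```julia
# Same values, but one array hides the element type from the compiler.
xs_float = rand(1000)              # Vector{Float64}: concrete element type
xs_any   = Vector{Any}(xs_float)   # Vector{Any}: element type unknown

function sumloop(v)
    s = 0.0
    for x in v
        s += x
    end
    return s
end

sumloop(xs_float)  # compiles to a tight, specialized loop
sumloop(xs_any)    # boxing and dynamic dispatch on every addition
```

Benchmarking these two calls (e.g., with BenchmarkTools.jl) should show the Vector{Any} version allocating on every iteration and running far slower, even though both compute the same result.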
Here's an example (inspired by this Stack Overflow answer) that illustrates the impact type-stability can have on performance:
# Type-unstable because `x` is a non-constant global variable.
x = 0
f() = [i + x for i = 1:10^6]
# Type-stable because `y` is constant and therefore always an `Int`.
const y = 0
g() = [i + y for i = 1:10^6]
using BenchmarkTools
@btime f() # 16.868 ms (1998983 allocations: 38.13 MiB)
@btime g() # 190.755 μs (3 allocations: 7.63 MiB)
Note that the type-unstable version is two orders of magnitude slower! Also note, however, that this is an extreme example where essentially the entire computation is type-unstable. In practice, some type-instabilities will not impact performance very much. Type-stability mainly matters in "hot loops", i.e., in parts of the code that run very frequently and contribute to a significant portion of the program's overall run time.
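Besides declaring the global const, another standard fix is to pass the global in as a function argument, since inside the function the argument is a local with a known concrete type. A hypothetical variant of the example above (f_global and f_arg are names I've made up):

```julia
x = 0
f_global() = [i + x for i = 1:10^6]  # type-unstable: x is a non-constant global

# Passing the global as an argument restores type-stability:
# inside f_arg, x is a local whose concrete type is known.
f_arg(x) = [i + x for i = 1:10^6]

f_global()  # slow path: the type of x is only known at runtime
f_arg(x)    # fast path: compiled specifically for x::Int
```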
Detecting Type-Instabilities with SnoopCompile.jl
Now the question is, how do we know if or where our code is type-unstable? One excellent tool for discovering where type-instabilities occur in code is SnoopCompile.jl. This package provides functionality for reporting how many times a Julia program needs to stop to compile code. (Remember that a perfectly type-stable program can compile everything in one go, so every time execution stops for compilation indicates a type-instability was encountered.)
Let's use an example to illustrate how to use SnoopCompile.jl. First, the code we want to analyze:
module Original
struct Alg1 end
struct Alg2 end
function process(alg::String)
    if alg == "alg1"
        a = Alg1()
    elseif alg == "alg2"
        a = Alg2()
    end
    data = get_data(a)
    result = _process(a, data)
    return result
end
get_data(::Alg1) = (1, 1.0, 0x00, 1.0f0, "hi", [0.0], (1, 2.0))
function _process(::Alg1, data)
    val = data[1]
    if val < 0
        val = -val
    end
    result = map(data) do d
        process_item(d, val)
    end
    return result
end
process_item(d::Int, val) = d + val
process_item(d::AbstractFloat, val) = d * val
process_item(d::Unsigned, val) = d - val
process_item(d::String, val) = d * string(val)
process_item(d::Array, val) = d .+ val
process_item(d::Tuple, val) = d .- val
get_data(::Alg2) = rand(5)
_process(::Alg2, data) = error("not implemented")
end
We'll use the @snoop_inference macro to analyze this code. Note that this macro should be used in a fresh Julia session (after loading the code to be analyzed, but before running anything) to get the most accurate analysis results.
julia> using SnoopCompileCore
julia> tinf = @snoop_inference Original.process("alg1");
julia> using SnoopCompile
julia> tinf
InferenceTimingNode: 0.144601/0.247183 on Core.Compiler.Timings.ROOT() with 8 direct children
You can consult the SnoopCompile.jl docs for more information about what we just did, but for now, notice that displaying tinf revealed 8 direct children. That means compilation occurred 8 times while running Original.process("alg1"). If this function were completely type-stable, @snoop_inference would have reported just 1 direct child, so we know there are type-instabilities somewhere.
Each of the 8 direct children is an inference trigger, i.e., calling the specific method indicated in the inference trigger caused compilation to occur. We can collect the inference triggers:
julia> itrigs = inference_triggers(tinf)
Inference triggered to call process(::String) from eval (./boot.jl:430) inlined into REPL.eval_user_input(::Any, ::REPL.REPLBackend, ::Module) (/cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:261)
Inference triggered to call process_item(::Int64, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Int64)
Inference triggered to call process_item(::Float64, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float64)
Inference triggered to call process_item(::UInt8, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::UInt8)
Inference triggered to call process_item(::Float32, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float32)
Inference triggered to call process_item(::String, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::String)
Inference triggered to call process_item(::Vector{Float64}, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Vector{Float64})
Inference triggered to call process_item(::Tuple{Int64, Float64}, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Tuple{Int64, Float64})
The first inference trigger corresponds to compiling the top-level process function we called (this is the inference trigger we always expect to see). But then it looks like Julia had to stop running to compile several different methods of process_item.
Inference triggers tell us that type-instabilities existed when calling the given functions, but what we really want to know is where these type-instabilities originated. You'll note that each displayed inference trigger above also indicates the calling function by specifying from <calling function>. (Note that the from #1 in the above example indicates process_item was called from an anonymous function.) We can use accumulate_by_source to get an aggregated view of what functions made calls via dynamic dispatch:
julia> mtrigs = accumulate_by_source(Method, itrigs)
2-element Vector{SnoopCompile.TaggedTriggers{Method}}:
eval_user_input(ast, backend::REPL.REPLBackend, mod::Module) @ REPL ~/.julia/juliaup/julia-1.11.5+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/REPL.jl:247 (1 callees from 1 callers)
(::var"#1#2")(d) @ Main REPL[1]:30 (7 callees from 7 callers)
From this, we can see that the example code really has only one problematic function: the anonymous function var"#1#2".
Diving in with Cthulhu.jl
Now that we have a rough idea of where the type-instabilities come from, we can drill down into the code and pinpoint the precise causes with Cthulhu.jl. We can use the ascend function on an inference trigger to start investigating:
julia> using Cthulhu
julia> ascend(itrigs[2]) # Skip `itrigs[1]` because that's the top-level compilation that should always occur.
ascend provides a menu that shows process_item and the anonymous function. Select the anonymous function and press Enter.
Here's a screenshot of the Cthulhu output:
Reading the output of Cthulhu.jl takes some time to get used to (especially when it can't display source code, as in this example), but the main thing to remember is that red is bad. See the Cthulhu.jl README for more information.
In this example, the source of the type-instability was fairly easy to pinpoint. I annotated the screenshot to indicate where the type-instability arose, which is this Core.Box thing. These are always bad; they are essentially containers that can hold values of any type, hence the type-instability that arises when accessing the contents. In this particular case, Core.getfield(#self#, :val) indicates val is a variable that was captured by the anonymous function.
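For a minimal illustration of why val gets boxed (a standalone sketch of my own, mirroring the pattern in _process): when a captured variable is assigned more than once, Julia's lowering conservatively wraps it in a Core.Box, and every read through the box is type-unstable. Rebinding the variable with a let block avoids this.

```julia
# val is assigned in two places and captured, so lowering puts it in a
# Core.Box; @code_warntype boxed(1) should show the box in red.
function boxed(x)
    val = x
    if val < 0
        val = -val
    end
    return map(d -> d + val, (1, 2, 3))
end

# The let block gives the closure its own fixed binding of val,
# so no Core.Box is needed and inference succeeds.
function unboxed(x)
    val = x
    if val < 0
        val = -val
    end
    f = let val = val
        d -> d + val
    end
    return map(f, (1, 2, 3))
end
```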
Once we determine what caused the type-instability, the solution varies on a case-by-case basis. Some potential solutions may include:
- Ensure different branches of an if statement return data of the same type.
- Add a type annotation to help out inference. For example, x = Any[1]; y = do_something(x[1]::Int).
- Make sure a container type has a concrete element type. For example, x = Int[], not x = [].
- Avoid loops over heterogeneous Tuples.
- Use let blocks to define closures. (See this section of the Julia manual for more details.)
We'll use this last solution in our example.
The anonymous function in question
is defined by the do
block in _process
.
So, let's fix the issue of the captured variable val
:
module Corrected
# All other code is the same as in module `Original`.
function _process(::Alg1, data)
    val = data[1]
    if val < 0
        val = -val
    end
    f = let val = val
        d -> process_item(d, val)
    end
    result = map(f, data)
    return result
end
# All other code is the same as in module `Original`.
end
Now let's see what @snoop_inference says:
julia> using SnoopCompileCore
julia> tinf = @snoop_inference Corrected.process("alg1");
julia> using SnoopCompile
julia> tinf
InferenceTimingNode: 0.113669/0.183888 on Core.Compiler.Timings.ROOT() with 1 direct children
There's just one direct child. Hooray, type-stability!
Let's see how performance compares:
julia> using BenchmarkTools
julia> @btime Original.process("alg1");
220.506 ns (16 allocations: 496 bytes)
julia> @btime Corrected.process("alg1");
51.104 ns (8 allocations: 288 bytes)
Awesome, the improved code is ~4 times faster!
Summary
In this post, we learned about type-stability and how type-instabilities affect compilation and runtime performance. We also walked through an example that demonstrated how to use SnoopCompile.jl and Cthulhu.jl to pinpoint the sources of type-instability in a program. Even though the example in this post was a relatively easy fix, the principles discussed apply to more complicated programs as well. And, of course, check out the documentation for SnoopCompile.jl and Cthulhu.jl for further examples to bolster your understanding.
Do you have type-instabilities that plague your Julia code? Contact us, and we can help you out!
Additional Links
- SnoopCompile.jl Docs: Documentation for SnoopCompile.jl.
- Cthulhu.jl Docs: Documentation (the package's README) for Cthulhu.jl.
- Julia Performance Tips: Very good tips for improving the performance of Julia code.
- GLCS Software Development: Connect with us for Julia development help.
- Upcoming JuliaCon Talk Announcement: Check out our JuliaCon 2025 talk announcement!
Written by

Great Lakes Consulting
Modeling & Simulation, HPC, and Enterprise Software all under one roof. Great Lakes Consulting Services, Inc., a premier Information Technology Consulting company, serving others in IT staffing, analytic consulting, business intelligence and application development since 2009. We now specialize in custom Julia software services as the trusted partner to JuliaHub for their Consulting Services. Since 2015, we’ve partnered together to develop high-performance Julia code for low-latency data visualization and analytic solutions, high performance financial modeling, Modeling and Simulation for multiple sciences, personal Julia training, and legacy code migration & evolution.