Bazel - Hermetic Toolchains and Libraries

At this point, we already have a grasp of what a Bazel workspace consists of, how to create packages and targets, how to use the most common commands, and how its out-of-the-box cache mechanism works.
Our next goal is to improve reproducibility, reliability, and cache hits, which also means faster incremental builds. To achieve this, we will work to ensure hermetic builds for our Bazel targets.
GitHub: examples/1_bazel_basics_part_3.
Hermetic build
According to the official Bazel documentation:
"When given the same input source code and product configuration, a hermetic build system always returns the same output by isolating the build from changes to the host system."
While Bazel offers out-of-the-box solutions for self-contained builds, it's up to us as users to set them up. To do this, we must first identify what our code depends on from the host machine. With that done, we can then start to hermetically provide these dependencies to be as insensitive as possible to software installed on the host machine.
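To make the definition a bit more concrete, here is a minimal Python sketch. It is not Bazel code, and the function names and the BUILD_FLAVOR variable are made up for illustration. The first function silently reads a host environment variable, so its output can change from machine to machine; the second receives everything it needs as explicit inputs, so identical inputs always give an identical output.

```python
import hashlib
import os


def non_hermetic_step(source: str) -> str:
    """Output silently depends on the host: it reads an environment variable."""
    flavor = os.environ.get("BUILD_FLAVOR", "default")  # hidden host input (hypothetical name)
    return hashlib.sha256(f"{source}|{flavor}".encode()).hexdigest()


def hermetic_step(source: str, flavor: str) -> str:
    """Output depends only on the inputs that were declared explicitly."""
    return hashlib.sha256(f"{source}|{flavor}".encode()).hexdigest()


if __name__ == "__main__":
    first = hermetic_step("int main() {}", "release")
    os.environ["BUILD_FLAVOR"] = "debug"  # simulate a change on the host machine
    second = hermetic_step("int main() {}", "release")
    # The hermetic step is unaffected by the host change.
    print(first == second)
```

This is roughly the property Bazel relies on for caching: if every input is declared, the output can be safely reused whenever the inputs have not changed.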
Toolchains
Having a hermetic toolchain is one of the most important steps to achieve a reproducible and reliable build. We won't go into details of what a toolchain is or how Bazel defines it, but for now we can consider it as the set of tools used to build our targets.
While you can create your own Bazel toolchain, I suggest starting with what is already available. Luckily, there are open source solutions for basically every programming language and even for some frameworks. For example:
C/C++ - toolchains_llvm - https://github.com/bazel-contrib/toolchains_llvm
Python - rules_python - https://github.com/bazel-contrib/rules_python
Go - rules_go - https://github.com/bazel-contrib/rules_go
Rust - rules_rust - https://github.com/bazelbuild/rules_rust
TypeScript - rules_ts - https://github.com/aspect-build/rules_ts
Node.js - rules_nodejs - https://github.com/bazel-contrib/rules_nodejs
Libraries
Once a proper hermetic toolchain is set up, any dependency that your target depends on must be injected into it. This is the case with external libraries. For example, if you have a Python test and you want to rely on pytest, you will need to set it up and then add it as a dependency of your test since it is not a built-in Python module.
For some languages, it is easy to configure hermetic external libraries; for others, it might depend on whether the respective library already offers a Bazel integration or not. This usually depends on whether the language already provides a package manager solution or not. For example:
Python: has two main package managers, pip and uv. Both are integrated into Bazel via rules_python, so the user can expect basically the same interface.
C/C++: no standard package manager. While googletest offers a Bazel integration, some other libraries might not, and then the user has to evaluate how to integrate them. Sometimes simply packing the library into a tarball is enough; sometimes it is trickier. There are solutions like rules_foreign_cc that make CMake-based libraries easier to integrate into Bazel.
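If you go the tarball route, it helps to pin the archive by checksum so that a silently changed file cannot sneak into the build (Bazel repository rules such as http_archive accept a sha256 attribute for this purpose). The following is a small sketch of how such a digest could be computed; the helper name and the file name mylib.tar.gz are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a real library tarball (hypothetical file name).
        archive = Path(tmp) / "mylib.tar.gz"
        archive.write_bytes(b"pretend this is a tarball")
        print(sha256_of_file(archive))
```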
Sandbox
With hermetic toolchains and libraries in place, it's mostly up to Bazel to ensure that your target code cannot access your host environment to get libraries, variables, data files, or whatever dependency it needs. To achieve this, Bazel basically creates a sandbox for each action containing symlinks to all specified dependencies and to the configured toolchain as well. This means that actions (e.g., build or test) will only have access to what was strictly specified.
Of course, this doesn't mean Bazel's sandbox is bulletproof. Actually, it is not that hard for a user to break out of the sandbox, and Bazel even has options to allow that, but then it is at least explicit that an action is accessing something from the host system.
Bazel's sandbox can also be a bit costly, as it needs to create a lot of symlinks or, in the case of Windows, it may also copy the files, but this isolation is essential to guarantee hermetic builds. I can say that once you get used to it, you will not want to go back to non-hermetic actions.
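The symlink idea behind the sandbox can be sketched in a few lines of Python. This is only a toy model under my own assumptions (the make_sandbox helper and the file names are invented), and Bazel's real sandbox does considerably more than this, but it shows why an undeclared file is simply not there for the action to find.

```python
import os
import tempfile
from pathlib import Path


def make_sandbox(workspace: Path, declared_inputs: list[str], sandbox: Path) -> None:
    """Symlink only the declared inputs from the workspace into the sandbox."""
    for rel in declared_inputs:
        link = sandbox / rel
        link.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(workspace / rel, link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace"
        sandbox = Path(tmp) / "sandbox"
        (workspace / "cc_example").mkdir(parents=True)
        sandbox.mkdir()
        (workspace / "cc_example" / "lib.h").write_text("// declared\n")
        (workspace / "cc_example" / "secret.h").write_text("// undeclared\n")

        make_sandbox(workspace, ["cc_example/lib.h"], sandbox)

        # The declared input is reachable inside the sandbox ...
        print((sandbox / "cc_example" / "lib.h").exists())
        # ... while the undeclared file was never linked in.
        print((sandbox / "cc_example" / "secret.h").exists())
```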
Common pitfalls
Non-hermetic behavior can sneak in for mysterious reasons. But the most important thing is: don't fool yourself. If you can't make everything hermetic at once, tag the non-hermetic actions and fix them later; it happens. Bazel even has special tags for this, such as no-sandbox and requires-network.
The following list is what I can remember at the moment, but there are surely others:
Accessing hard-coded absolute paths or somehow finding your own way to break the sandbox
Relying on network or on timestamps
Running subprocesses that might not be hermetic (e.g., they might rely on the current user directory)
Relying on host system features which are usually taken for granted, like bash, shell, PowerShell, zip, etc.
Setting up hermetic toolchains and libraries but not configuring Bazel to use them
Well, and many others
Bazel also documents the basics of how to identify and troubleshoot non-hermetic builds.
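As an illustration of the timestamp pitfall, here is a hedged Python sketch of how a packaging step could avoid leaking the current time into its output. By default, zipfile's writestr stamps an entry with the current local time when it is given a plain name, so two runs produce different bytes. The reproducible_zip helper below is my own invention, not something Bazel provides: it fixes the timestamp, the permissions, and the entry order.

```python
import hashlib
import io
import zipfile

# Earliest timestamp the zip format can store; any fixed value would do.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def reproducible_zip(files: dict[str, bytes]) -> bytes:
    """Build a zip archive whose bytes depend only on the given names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        for name in sorted(files):  # fixed order, independent of insertion order
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_DATE)
            info.external_attr = 0o644 << 16  # fixed file permissions
            archive.writestr(info, files[name])
    return buffer.getvalue()


if __name__ == "__main__":
    one = reproducible_zip({"a.txt": b"hello", "b.txt": b"world"})
    two = reproducible_zip({"b.txt": b"world", "a.txt": b"hello"})
    # Same content in a different insertion order gives the same archive bytes.
    print(one == two)
    print(hashlib.sha256(one).hexdigest())
```

A simple way to hunt for this kind of problem is exactly what the sketch does at the end: produce the output twice and compare the digests.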
Examples
Finally, some code! Sorry for all that text. Let's jump right into it. We're going to start from where we left off in Bazel Basics - Part 2.
C++ Hermetic Build Example
Let's start by setting up a hermetic LLVM toolchain and also making the googletest library available. Add the following to your MODULE.bazel file:
# MODULE.bazel
# Existing code ...

bazel_dep(name = "toolchains_llvm", version = "1.4.0")

llvm = use_extension("@toolchains_llvm//toolchain/extensions:llvm.bzl", "llvm")
llvm.toolchain(
    llvm_version = "16.0.0",
)
use_repo(llvm, "llvm_toolchain")

register_toolchains("@llvm_toolchain//:all")

bazel_dep(name = "googletest", version = "1.17.0")
With that, we already have a hermetic LLVM toolchain, and you can use it by simply executing, for example, bazel build //cc_example/... to build those targets. Note that Bazel will first download the toolchain and then use it to build the targets.
The next step is to use googletest. For that we must modify both the cc_example/BUILD and cc_example/test.cpp files. Replace the //cc_example:test target definition with:
# cc_example/BUILD
# Existing code ...

cc_test(
    name = "test",
    srcs = ["test.cpp"],
    deps = [
        ":lib",
        "@googletest//:gtest",  # Comment this line out and run the test again.
        "@googletest//:gtest_main",  # Comment this line out and run the test again.
    ],
)
And replace the cc_example/test.cpp file content with:
// cc_example/test.cpp
#include "lib.h"
#include <gtest/gtest.h>

TEST(HelloTest, BasicAssertions)
{
    printMessage("Unwrap Your Build");
    EXPECT_EQ(7 * 6, 42);
}
At this point you can build and test the C++ targets. Bazel will fetch googletest and inject it into the //cc_example:test target. If you try to comment out those googletest dependencies, the build command will fail.
Python Hermetic Build Example
Let's start by setting up a hermetic Python toolchain and its package manager, pip, with pytest as the only library. Add the following to your MODULE.bazel file:
# MODULE.bazel
# Existing code ...

bazel_dep(name = "rules_python", version = "1.5.3")

python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(
    python_version = "3.13",
)

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "our_pip_hub",
    python_version = "3.13",
    requirements_lock = "//:requirements_lock.txt",
)
use_repo(pip, "our_pip_hub")
The next step is to create our requirements files. There are two: requirements.in and requirements_lock.txt. The former is our input and the latter is the pip-resolved version of it. Create an empty requirements_lock.txt file and a requirements.in file containing the following:
# requirements.in
pytest==8.4.1
Now add the following code to your root BUILD file:
# BUILD
load("@rules_python//python:pip.bzl", "compile_pip_requirements")

# Available executable targets:
# - //:py_requirements.update
# - //:py_requirements.test
compile_pip_requirements(
    name = "py_requirements",
    src = "requirements.in",
    requirements_txt = "requirements_lock.txt",
)
At this point, you can run bazel run //:py_requirements.update. This will basically resolve and lock the requirements from requirements.in into requirements_lock.txt.
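To show what "locked" means in practice, here is a rough Python sketch that scans the text of a lock file and reports any requirement that is not pinned to an exact version. The unpinned_requirements helper is hypothetical, and the sample content is only illustrative (the iniconfig version and the hash placeholders are not taken from a real run); your generated file will list whatever pip actually resolved.

```python
SAMPLE_LOCK = """\
# Illustrative content only; the real file is generated by //:py_requirements.update.
iniconfig==2.1.0 \\
    --hash=sha256:<placeholder>
    # via pytest
pytest==8.4.1 \\
    --hash=sha256:<placeholder>
    # via -r requirements.in
"""


def unpinned_requirements(lock_text: str) -> list[str]:
    """Return the requirement lines that are not pinned with '=='."""
    problems = []
    for raw in lock_text.splitlines():
        # Drop surrounding whitespace and the trailing line-continuation backslash.
        line = raw.strip().rstrip("\\").strip()
        if not line or line.startswith(("#", "--")):
            continue  # skip blanks, comments, and option lines such as --hash
        # Ignore trailing comments and environment markers.
        requirement = line.split(" #")[0].split(";")[0].strip()
        if "==" not in requirement:
            problems.append(requirement)
    return problems


if __name__ == "__main__":
    print(unpinned_requirements(SAMPLE_LOCK))         # every entry is pinned
    print(unpinned_requirements("pytest>=8\nrich\n"))  # loose entries are reported
```

The point of the lock file is that every entry, including transitive dependencies, resolves to exactly one version, so the same packages are fetched on every machine.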
With requirements_lock.txt populated, pytest is finally available, but before using it we need to modify the Python test target and file. Update py_example/BUILD with:
# py_example/BUILD
load("@rules_python//python:defs.bzl", "py_binary", "py_library", "py_test")

# Existing code ...

py_test(
    name = "test",
    srcs = ["test.py"],
    deps = [
        ":example",  # It implicitly depends on `:lib` as well.
        "@our_pip_hub//pytest:pkg",  # Comment this line out and run the test again.
    ],
)
And replace the py_example/test.py content with:
import sys

import pytest

from py_example import example


@pytest.mark.parametrize("expected", [None])
def test(expected):
    assert example.main() == expected


if __name__ == "__main__":
    sys.exit(pytest.main([__file__]))
If you now test it by executing bazel test //py_example/..., Bazel will first download the rules_python repo, the Python toolchain containing the selected interpreter version, and the locked pytest version using pip, and then run all tests under the //py_example package.
Commands
I recommend playing around with the current setup to get more familiar with it. Some interesting experiments are adding more libraries or removing the dependencies that were added above. I wrote down some ideas in examples/bazel_hermetic_toolchains_and_libs/README.md.
Final Thoughts
As I already mentioned, Bazel only offers us the tools to achieve hermetic builds, and it's up to us to configure them properly. To be honest, these setups can get as complex as we want, but the basic infrastructure, which already brings a lot of benefits, is relatively easy to set up.
My main advice is to go step by step and to be honest with your code and yourself: do not try to pass off something that is not hermetic as hermetic. Instead, mark the action as non-hermetic with the proper Bazel tag so you can improve it in the future. It's always a trade-off between how hermetic (i.e., how insensitive to the host machine) you want your targets to be and how much time you can spend on the setup.
I'm still not sure what the next article will be about, probably about Bzlmod and Bazel Central Registry. Let me know if you have any other ideas.
Written by Lucas Munaretto