The 'outside of Bazel' pattern


The Bazel build tool is fantastic at taking a well-defined dependency graph, which is really a tree: it starts from the root (the artifact to be built or the test to be run), progresses through increasingly wide branches of direct and transitive dependencies, and ends at the leaves, which are source files living in your repository, or possibly even third-party sources.
However, I often see developers struggling to model something that’s not a tree. Sometimes it’s a bush, or a chandelier. This is usually a sign that Bazel’s dependency + action graphs won’t work well, due to bad ergonomics and ruined incrementality.
Bazel dogma teaches us that all logic should be in Starlark (Bazel’s extension language) and described in BUILD files. I’ll show a few examples where this isn’t the “Right Tool for the Job”.
However, I’ll make a stronger case: Bazel is the inner core of a wider system. The core really only performs two jobs well:
1. Inspect the dependency and action graphs (aquery and cquery)
2. Populate a subset of bazel-bin and bazel-testlogs (build and test, though the latter can really be thought of as “build text files containing all the test runner exit codes”)
The wider system is a “task runner”. A Makefile commonly serves this purpose, surrounding Bazel commands, but it has a trap: it overlaps with Bazel’s capabilities, making it impossible to ensure build steps don’t sneak into the outer layer. At BazelCon this year, I’ll present a better task runner that lets you write tasks in Starlark. In the meantime, I’ll just illustrate the task runner layer with some Bash one-liners.
An archive of the whole repo
Our first example is common in Bazel rulesets. You want an archive that represents the whole source repository, so the shape is “take all the leaves and connect them directly to the root”.
This doesn’t work well in Bazel because packages are encapsulated, so a glob(["**/*"]) doesn’t gather up the sources in subpackages. Instead you need an awkward tree of filegroup targets, one in every package, all linked together.
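To make the awkwardness concrete, here’s roughly what that looks like (the package names are hypothetical). Every BUILD file in the tree needs a stanza like this, and every new package silently drops out of the archive until someone remembers to wire it in:

```python
# BUILD (a variant of this repeated in every package)
filegroup(
    name = "all_srcs",
    srcs = glob(["**/*"]) + [
        # glob() stops at package boundaries, so each subpackage
        # must be listed here and define its own all_srcs
        "//docs:all_srcs",
        "//src:all_srcs",
    ],
    visibility = ["//visibility:public"],
)
```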
The alternative is to use the Right Tool for the Job: git archive. It has some lesser-known configuration options you can set in the .gitattributes file (see https://git-scm.com/docs/git-archive#ATTRIBUTES) that help a bunch:

- Filtering out contents: instead of Bazel glob(exclude = []) you can use export-ignore patterns
- Version-stamping the result: use export-subst to configure which file is stamped, then put something like the following in that file:

```python
_VERSION_PRIVATE = "$Format:%(describe:tags=true)$"
VERSION = "0.0.0" if _VERSION_PRIVATE.startswith("$Format") else _VERSION_PRIVATE.replace("v", "", 1)
```
All the “Something” targets
This one comes up a lot. Recently I’ve been working on distributing API documentation which is generated for code across the repo. You’ll recognize this pattern whenever it seems like bazel query 'some expression' | xargs bazel build is the model of what you want to ask Bazel for.
It’s tempting to model this as a “collector” target that’s just a long list of deps, and then wonder “how am I going to keep this list of deps up-to-date as we add more something targets?” You can’t, and shouldn’t.
The biggest reason to avoid this one is the shape of the dependency graph you end up with. Developers will commonly trip over analyzing this target (just doing a query over the repo, or loading that package, will do it). Then Bazel goes from incremental “only do the minimal work for the targets I requested” to performing a whole-repo step that downloads gigabytes of irrelevant tooling.
This time, the Right Tool for the Job is a small workflow outside of Bazel. Create a query expression that matches the targets you care about, and select the outputs you need from them. We do this for the lint command to select “all the report files”, for example. Here’s a full code listing for the API docgen task:
```shell
docs="$(mktemp -d)"
bazel --output_base="$docs" query --output=label 'kind("starlark_doc_extract rule", //...)' \
  | xargs bazel --output_base="$docs" build --remote_download_regex='.*doc_extract\.binaryproto'
tar --create --auto-compress \
  --directory "$(bazel --output_base="$docs" info bazel-bin)" \
  --file "$GITHUB_WORKSPACE/${ARCHIVE%.tar.gz}.docs.tar.gz" .
```
Compare with another version of the code
buf_breaking is a good example. It wants to see the prior state of the output (say, at the base commit of a Pull Request) and compare it with the current one. But Bazel sees only a single snapshot of the source code for a given build. I’ve seen some customers write a repository rule to clone a different commit of the repository, which seems very brittle to me. Checking in the prior output is too hard to automate.
The Right Tool for this Job is to find CI artifacts from the base commit for a given change, and run a comparison/validation tool after the build runs. Then write an updated artifact from builds on the main branch for subsequent comparisons.
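A pseudocode sketch of that task. The artifact-storage commands are hypothetical placeholders, and the target name is made up; the buf and Bazel invocations are real:

```shell
# 1. Fetch the descriptor set that CI published at the PR's base commit (placeholder command).
ci-fetch-artifact --commit "$BASE_SHA" --name image.binpb --out base.binpb
# 2. Build the current Protobuf descriptor set with Bazel.
bazel build //proto:image
# 3. Run the comparison outside of Bazel.
buf breaking bazel-bin/proto/image.binpb --against base.binpb
# 4. On main, publish the fresh artifact for future comparisons (placeholder command).
[ "$BRANCH" = "main" ] && ci-publish-artifact bazel-bin/proto/image.binpb
```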
gazelle
BUILD file generation is in this category, because Bazel has always refused to allow dynamic dependency graphs based on file contents. The team argues that the “no-op” build has to remain fast.
But it doesn’t matter how fast their tool is if every user is then forced to wrap it in something slower. In this case, we always want something to run before Bazel’s loading phase, a step like how C++ builds run autoconf in a ./configure && make workflow.
Today engineers mostly have to discover their BUILD files are outdated (maybe there’s a compilation error about a missing dependency) and then do a manual bazel run //:gazelle. But if we had a Task Runner layer around Bazel, we’d just set up a step to run ahead of time.
(By the way, at BazelCon I’ll present two things: we can use Starlark to write the task that invokes Gazelle, and we can also extend Gazelle’s BUILD generation logic in Starlark!)
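That pre-step is tiny once a task layer exists. A sketch, assuming the conventional //:gazelle target:

```shell
# Regenerate BUILD files before Bazel's loading phase ever runs,
# analogous to ./configure && make.
bazel run //:gazelle   # update BUILD files from source imports
bazel build //...      # the loading phase now sees fresh BUILD files
```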
Coverage
Bazel has a coverage command, so why is this example here? Well, in my experience, it was a mistake: this command wanted to live in the task runner layer, but Google never wrote one. The coverage system is bad mostly because of things like lcov transformation and merging, and how difficult it is to configure.
Coverage really should have been formulated as a Task Runner that:
1. Builds the code under a Transition that enables an Instrumentation Configuration (pokes counters into the executable to track how many times a line or statement executes)
2. Runs the tests as usual. The coverage data files are configured to be additional outputs (using the TEST_UNDECLARED_OUTPUTS feature my intern added (hi John!))
3. After the tests are complete, collects the resulting data files. They might be LCOV format, or something else that needs to be transformed.
4. Presents the results, frequently by consulting the VCS so you can show incremental coverage (how many of the added/edited lines were tested)
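Step 3 is mostly mechanical. As an illustration of the kind of post-processing involved, here’s a minimal Python sketch of merging LCOV tracefiles, assuming only SF:/DA:/end_of_record records (real LCOV data has more record types):

```python
def merge_lcov(tracefiles):
    # Sum the hit counts per (source file, line number).
    # DA records look like "DA:<line>,<count>"; SF records name the file.
    hits = {}
    for text in tracefiles:
        current = None
        for line in text.splitlines():
            if line.startswith("SF:"):
                current = line[3:]
            elif line.startswith("DA:"):
                lineno, count = line[3:].split(",")
                key = (current, int(lineno))
                hits[key] = hits.get(key, 0) + int(count)
    # Re-emit a single merged tracefile.
    out = []
    for f in sorted({f for f, _ in hits}):
        out.append("SF:" + f)
        for (g, lineno), count in sorted(hits.items()):
            if g == f:
                out.append(f"DA:{lineno},{count}")
        out.append("end_of_record")
    return "\n".join(out)
```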
Run
Even the bazel run command is a mistake, in my opinion. It’s nice “syntax sugar” for building a single target, then spawning it as a subprocess. But it’s missing things like a watch mode, we needed a separate rules_multirun to get multiple servers to start, and it has a ton of bugs around how the working directory is selected.
If we had a Task Runner, it would clearly not be Bazel’s job to do these things.
print
Buildozer is a great tool for machine-editing BUILD files, but also for quickly inspecting their contents in a purely syntactic pass that doesn’t trigger Bazel’s fetching, Loading, and Analysis phases. With a Task Runner layer, we can easily expose these Bazel-adjacent tools through the same interface engineers use to request build outputs, instead of making them install and learn about a variety of other tools. So we’d add a print task that doesn’t even invoke Bazel at all!
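Such a print task could be a thin wrapper over buildozer’s syntactic print command (the task wiring and script name are hypothetical; the buildozer invocation is real):

```shell
# print.sh TARGET ATTRIBUTE: show an attribute of a target without invoking Bazel.
target="$1"
attr="$2"
buildozer "print ${attr}" "${target}"
```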
Next
To learn more about Aspect Build's Bazel developer workflow platform and professional services, visit aspect.build.
Written by Alex Eagle