Experiment with Buf and Starlark Docgen
This post documents some experimentation. It's not useful guidance for you to follow today!
I've spent a lot of time thinking about API docs for Bazel rules. I spent several weeks writing docs.aspect.build/rules with very thoughtful presentation of those APIs. Bazel rules are written in the "Starlark" language, so documenting them takes two things:
an extractor that uses a Starlark interpreter to parse the code, including load statements to other files that contain bits of documentation
a renderer to format the documentation in a readable way, typically in Markdown or HTML
Everyone has historically used https://github.com/bazelbuild/stardoc for these tasks. I've found it problematic recently for a variety of reasons:
It expects bzl_library targets which come from a module that's no longer accepting PRs: https://github.com/orgs/bazelbuild/discussions/3
The gazelle generator for bzl_library is a separate module in the bazel_skylib repo which makes it a pain to install. It also requires building Go code from source in the user's build, even if they only changed a character in a docstring and have no Go code.
Stardoc itself requires building protoc from source, which demands a functional C++ compiler in the user's build. They recently rejected a fix for this which frustrated me.
Stardoc's rendering uses a clone of Velocity which is an ancient Java thing I used 20 years ago. (Actually I preferred FreeMarker!)
Because it doesn't ship the renderer as a pre-built binary, Stardoc leaks a maven_install into the user's build. Then you get yelled at if you have other Maven dependencies, due to something in rules_jvm_external. I've had other problems with their bzlmod-ification and breaking changes.
Finally, and maybe I'm biased: Stardoc feels hard to contribute to because it's controlled by a Google team who seem very overcommitted on other things. It needs a lot of love - for example bullet lists appearing inside a markdown table never look right. I appreciate their work! And I'll probably send more PRs there.
But I'm curious how easily it can be replaced. Let's use it as an excuse to learn and experiment!
Extraction is a Bazel native rule
As of Bazel 7, the starlark_doc_extract rule (https://bazel.build/reference/be/general#starlark_doc_extract) is built into Bazel, and stardoc 0.6.0 started using that extractor. We don't need Stardoc at all to pull the docstrings out of our Bazel rules and macros. This rule wraps the ModuleInfoExtractor.
I wish that was a standalone java_binary program so we could just visit our bzl_library targets with an Aspect! Then docgen wouldn't have to be spelled out in BUILD files at all! But that's a bigger project, so for this experiment I'll just use starlark_doc_extract rules in the BUILD files of the module.
# No load() statements required!
starlark_doc_extract(
    name = "defs.doc_extract",
    src = ":defs.bzl",
)
It produces a binary protocol buffer output. Ugh, does that mean we need rules_proto and the whole ugly process of figuring out Bazel's Protobuf story? Thankfully no, we can rely on the Buf Schema Registry to skip this whole step and immediately parse the result.
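To see why a schema is needed at all, here is a rough TypeScript sketch that scans the top-level fields of a binary protobuf message without one. Everything in it (the readVarint and scanTopLevelFields helpers, the hand-built sample bytes and their field numbers) is made up for illustration and is not taken from Bazel's schema. Without the schema you only recover field numbers and wire types, not names or meanings.

```typescript
// Sketch: list top-level fields of a binary protobuf message without a schema.
// Only field numbers and wire types are recoverable; names need the schema.
type RawField = { fieldNumber: number; wireType: number; payloadLength: number }

// Reads a base-128 varint starting at `pos`; returns [value, nextOffset].
// Uses arithmetic instead of bit shifts to avoid 32-bit overflow.
function readVarint(buf: Uint8Array, pos: number): [number, number] {
  let value = 0
  let shift = 0
  while (pos < buf.length) {
    const byte = buf[pos++]
    value += (byte & 0x7f) * 2 ** shift
    shift += 7
    if ((byte & 0x80) === 0) return [value, pos]
  }
  throw new Error('truncated varint')
}

function scanTopLevelFields(buf: Uint8Array): RawField[] {
  const fields: RawField[] = []
  let pos = 0
  while (pos < buf.length) {
    const [tag, afterTag] = readVarint(buf, pos)
    const fieldNumber = Math.floor(tag / 8) // upper bits of the tag
    const wireType = tag % 8 // lower three bits of the tag
    let payloadStart = afterTag
    let end: number
    if (wireType === 0) {
      end = readVarint(buf, afterTag)[1] // varint payload
    } else if (wireType === 1) {
      end = afterTag + 8 // fixed 64-bit payload
    } else if (wireType === 2) {
      const [len, afterLen] = readVarint(buf, afterTag) // length-delimited
      payloadStart = afterLen
      end = afterLen + len
    } else if (wireType === 5) {
      end = afterTag + 4 // fixed 32-bit payload
    } else {
      throw new Error(`unsupported wire type ${wireType}`)
    }
    if (end > buf.length) throw new Error('truncated field')
    fields.push({ fieldNumber, wireType, payloadLength: end - payloadStart })
    pos = end
  }
  return fields
}

// Hypothetical sample: field 1 holds the bytes "hi", field 2 holds the varint 150.
const sampleBytes = new Uint8Array([0x0a, 0x02, 0x68, 0x69, 0x10, 0x96, 0x01])
console.log(scanTopLevelFields(sampleBytes))
```

Turning "field 1" into something like a named, typed field is exactly the part the schema supplies.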
Buf Schema Registry
I think of it by analogy: Aspect is to Bazel as Buf is to Protocol Buffers. They're making an awesome end-user experience for a Google technology that has previously been really painful to use outside of Google's monorepo (google3).
In google3 there's a "global proto DB" which famously ran on Jeff Dean's desktop computer. (There was a build outage once when he was on vacation and his credentials expired.) Buf runs a registry that provides a similar "global database" of schema definitions. You can use a private instance for the schemas within your company. For this example, I need the public Bazel schema. Here it is for Bazel 7.2.1 (the latest version as of writing): https://buf.build/bazel/bazel/docs/7.2.1
We can search that site for the schema describing the starlark extractor output, and find that it's stardoc_output.ModuleInfo. We're already up and running and can work with the data! Here's what that looks like on the command line using buf convert:
alexeagle@aspect-build % bazel build example:defs.doc_extract
INFO: Analyzed target //example:_defs.doc_extract (5 packages loaded, 9 targets configured).
INFO: Found 1 target...
Target //example:defs.doc_extract up-to-date:
bazel-bin/example/defs.doc_extract.binaryproto
alexeagle@aspect-build % buf convert buf.build/bazel/bazel --type=stardoc_output.ModuleInfo --from=bazel-bin/example/defs.doc_extract.binaryproto
{"ruleInfo":[{"ruleName":"my_rule","attribute":[{"name":"name","docString":"A unique name for this target.","type":"NAME","mandatory":true}],"originKey":{"name":"my_rule","file":"//example:defs.bzl"}}],"funcInfo":[{"functionName":"my_macro","parameter":[{"name":"name","mandatory":true},{"name":"kwargs"}],"originKey":{"name":"my_macro","file":"//example:defs.bzl"}}],"moduleDocstring":"These are some bazel rules.\n\nThe docstring is multiple lines.","file":"//example:defs.bzl"}%
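As a rough illustration of working with that JSON, the TypeScript sketch below models only the fields visible in the sample output above and prints one summary line per documented symbol. The interface names and the summarize helper are hypothetical, and the sample object is an abridged copy of the output rather than the full schema. In practice you would JSON.parse the output of buf convert instead of hard-coding it.

```typescript
// Subset of the JSON shape printed by `buf convert`; only fields seen in the sample.
interface AttributeInfo { name: string; docString?: string; type?: string; mandatory?: boolean }
interface RuleInfo { ruleName: string; attribute?: AttributeInfo[] }
interface ParamInfo { name: string; mandatory?: boolean }
interface FuncInfo { functionName: string; parameter?: ParamInfo[] }
interface ModuleInfoJson {
  file?: string
  moduleDocstring?: string
  ruleInfo?: RuleInfo[]
  funcInfo?: FuncInfo[]
}

// Hypothetical helper: one line per rule or macro, marking optional names with "?".
function summarize(doc: ModuleInfoJson): string[] {
  const lines: string[] = []
  for (const rule of doc.ruleInfo ?? []) {
    const attrs = (rule.attribute ?? []).map(a => (a.mandatory ? a.name : `${a.name}?`))
    lines.push(`rule ${rule.ruleName}(${attrs.join(', ')})`)
  }
  for (const fn of doc.funcInfo ?? []) {
    const params = (fn.parameter ?? []).map(p => (p.mandatory ? p.name : `${p.name}?`))
    lines.push(`function ${fn.functionName}(${params.join(', ')})`)
  }
  return lines
}

// Abridged copy of the sample output shown above.
const sampleDoc: ModuleInfoJson = {
  file: '//example:defs.bzl',
  ruleInfo: [
    {
      ruleName: 'my_rule',
      attribute: [
        { name: 'name', docString: 'A unique name for this target.', type: 'NAME', mandatory: true },
      ],
    },
  ],
  funcInfo: [
    { functionName: 'my_macro', parameter: [{ name: 'name', mandatory: true }, { name: 'kwargs' }] },
  ],
}

for (const line of summarize(sampleDoc)) console.log(line)
```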
Writing a renderer
It's neat to see our docs as a JSON object. To see it as Markdown, we just need a template engine and a markdown template. There are TONS of options for this. Today I'll pick Handlebars which is light-weight and written by the famous and trusted Yehuda Katz.
The Buf schema registry can act as a language-specific package registry, so it's just as easy for me to wire this JavaScript library to read the binary protobuf data. I just need to install the bazel/bazel schema package, choosing the protobuf-es SDK. You can click around on https://buf.build/bazel/bazel/sdks/7.2.1 to see the options for your language. I quickly arrive at the incantation npm install @buf/bazel_bazel.bufbuild_es.
From our docsite code behind docs.aspect.build I already had a TypeScript program that uses Handlebars to render Markdown. Here's a short pseudocode sketch so you can see the protobuf SDK being used to unmarshal the data:
// Sketch: unmarshal the binary proto with the Buf-generated SDK, then render it.
import fs from 'node:fs'
import { ModuleInfoSchema } from '@buf/bazel_bazel.bufbuild_es/src/main/java/com/google/devtools/build/skydoc/rendering/proto/stardoc_output_pb.js'
import { fromBinary } from '@bufbuild/protobuf'
import Handlebars from 'handlebars'
// process.argv[2] is the first user-supplied argument: the .binaryproto path
const doc = fromBinary(ModuleInfoSchema, fs.readFileSync(process.argv[2]))
// Handlebars.compile expects the template source as a string, so read it as UTF-8
const template = Handlebars.compile(fs.readFileSync(<path to module.tmpl.md>, 'utf8'))
console.log(template({doc}))
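The template itself isn't shown above. To give a rough idea of what the rendering step does, here is a dependency-free TypeScript stand-in that uses plain template literals instead of Handlebars. It is not the template from the docsite: the renderRule, renderModule and cell helpers, the table layout and the interfaces are all assumptions, and the field names follow the JSON sample printed earlier.

```typescript
// Stand-in for a Handlebars template: plain template literals, no dependencies.
// Field names follow the JSON sample; the table layout is an assumption.
interface Attr { name: string; docString?: string; type?: string; mandatory?: boolean }
interface Rule { ruleName: string; attribute?: Attr[] }
interface Doc { moduleDocstring?: string; ruleInfo?: Rule[] }

// Markdown table cells cannot hold raw pipes or newlines, so escape or flatten them.
function cell(text: string | undefined): string {
  return (text ?? '').replace(/\|/g, '\\|').replace(/\r?\n/g, ' ')
}

// Renders one rule as a heading followed by a table of its attributes.
function renderRule(rule: Rule): string {
  const rows = (rule.attribute ?? []).map(
    a => `| ${cell(a.name)} | ${cell(a.docString)} | ${cell(a.type)} | ${a.mandatory ? 'required' : 'optional'} |`
  )
  return [
    `## ${rule.ruleName}`,
    '',
    '| Name | Description | Type | Mandatory |',
    '| --- | --- | --- | --- |',
    ...rows,
  ].join('\n')
}

// Renders the module docstring followed by one section per rule.
function renderModule(doc: Doc): string {
  const sections = [doc.moduleDocstring ?? '', ...(doc.ruleInfo ?? []).map(renderRule)]
  return sections.filter(s => s.length > 0).join('\n\n') + '\n'
}

console.log(
  renderModule({
    moduleDocstring: 'These are some bazel rules.',
    ruleInfo: [
      {
        ruleName: 'my_rule',
        attribute: [{ name: 'name', docString: 'A unique name for this target.', type: 'NAME', mandatory: true }],
      },
    ],
  })
)
```

A real template engine earns its keep once the layout needs partials, helpers or per-site overrides. For a single table like this one, string templates can be enough.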
That's enough to create a program we can call to convert binary proto to markdown. To make it usable, we can expose a macro for use in our BUILD files:
load("@aspect_rules_js//js:defs.bzl", "js_run_binary")

def starlark_doc(name, src, out = None, deps = [], **kwargs):
    out = out or name + ".md"
    extract_target = "_{}.doc_extract".format(name)

    native.starlark_doc_extract(
        name = extract_target,
        src = src,
        deps = deps,
    )

    js_run_binary(
        name = name,
        srcs = [extract_target],
        tool = Label("//docgen/starlark:render"),
        args = ["$(rootpath {})".format(extract_target)],
        stdout = out,
    )
This just wires the extractor and renderer together for easy use. The result is then available for whatever we'd like to do with generated documentation.
Working code for this blog post is at https://github.com/alexeagle/rules_docgen - maybe this will turn into something usable in the future. For now, it was interesting to learn more about Stardoc and Buf!