Cross-Compiling 10,000+ Rust CLI Crates Statically

Ajam
AI translation tools were used to assist with language clarity, as our research team are not native English speakers. We believe the quality of our research and findings will speak for themselves.

Pkgforge hosts the world's largest collection of prebuilt, static binaries that work everywhere without dependencies. While our main repos contain hand-picked packages that our team maintains manually, we had an ambitious idea: what if we could automatically harvest CLI tools from ecosystems like Rust's crates.io, build them as static binaries, and make them available to everyone?

Instead of manually curating every package, we decided to tap into existing package ecosystems and automate the entire process. After two weeks of intensive development and countless iterations, we made this idea a reality.


Ingesting Crates.io

Crates.io provides API access for individual crate lookups and bulk operations. Initially, our script iterated through the first 1,000 pages (sorted by downloads) with 100 crates per page, yielding approximately 111,000 crates. However, we soon encountered a significant bottleneck: we needed to query each crate individually to determine whether it belonged to the command-line-utilities category or produced executables, i.e. contained [[bin]] in its manifest.

This approach proved impractical, as we quickly hit rate limits and risked violating the Usage Policy. Fortunately, RFC-3463 came to our rescue: Crates.io provides periodic database dumps at https://static.crates.io/db-dump.tar.gz. We quickly drafted a nifty CLI using dtolnay/db-dump.

Then it was just a matter of parsing this with a bit of jq, & automating it via GitHub Actions. Our workflow now generates all the data we will need, automatically.


Crate Selection

Since we ended up with over 111,000 crates, we needed to set some constraints & filter for what we actually wanted to build:

  1. Should either be in the command-line-utilities category: categories = ["command-line-utilities"]

  2. Or must have a [[bin]] section in the manifest.

  3. Must have been updated within the last year, i.e. > 2024-01-01

🦀 Total crates fetched:            ██████████████████████████████████████████████████ 111,722
     ↓
📦 Crates with binary targets:      ████████████████████████████████  24,658
     ↓
🔧 Crates with CLI category:        ██████████░░░░░░░░░░░░░░░░░░░░░░   8,451
     ↓
📆 Recently updated (2024+):        ███████████░░░░░░░░░░░░░░░░░░░░░  10,033
     ↓ + ↑
✅ Crates to Build: 10,033

// Total crates whose metadata was scraped/fetched: 111,722  
// Total crates with [[bin]] in their Cargo.toml manifest: 24,658  
// Total crates with command-line-utilities category in their Cargo.toml manifest: 8,451  
// After filtering for outdated crates (> date -d 'last year' '+%Y-01-01')  
// Total crates to build: 10,033

We ended up with ~ 10,000 crates that we now planned to compile.
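
As a rough illustration of the selection rules above, the sketch below filters a tab-separated crate list with awk. The input layout (name, has_bin, is_cli_category, updated_at) and the function name select_crates are assumptions made for this example; the actual pipeline parses the crates.io database dump with jq, and its schema is not shown in this post.

```shell
# Hypothetical input, one crate per line, tab-separated:
#   name <TAB> has_bin (0/1) <TAB> is_cli_category (0/1) <TAB> updated_at (YYYY-MM-DD)
# This layout is an assumption for illustration, not the db-dump schema.

# select_crates CUTOFF: read crate rows on stdin, print the names to build.
select_crates() {
    awk -F '\t' -v cutoff="$1" '
        # Keep a crate if it has a [[bin]] target OR the CLI category,
        # AND it was updated after the cutoff. Appending "" forces a
        # string comparison, which orders ISO dates correctly.
        ($2 == 1 || $3 == 1) && ($4 "") > (cutoff "") { print $1 }
    '
}

# The post derives its cutoff with GNU date:
#   cutoff="$(date -d "last year" "+%Y-01-01")"
# A fixed value keeps this example reproducible.
printf 'alpha\t1\t0\t2024-06-01\nbeta\t0\t1\t2023-03-15\ngamma\t0\t0\t2024-09-09\ndelta\t0\t1\t2025-01-20\n' |
    select_crates '2024-01-01'
```

With the four sample rows, only alpha and delta pass: beta is too old, and gamma has neither a binary target nor the CLI category.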


Build Constraints

To achieve truly portable, optimized, and statically linked binaries, we applied the following comprehensive build constraints:

#RUSTFLAGS
[+] Flags: -C target-feature=+crt-static \
           -C default-linker-libraries=yes \
           -C link-self-contained=yes \
           -C prefer-dynamic=no \
           -C embed-bitcode=yes \
           -C lto=yes \
           -C opt-level=3 \
           -C debuginfo=none \
           -C strip=symbols \
           -C link-arg=-Wl,-S \
           -C link-arg=-Wl,--build-id=none \
           -C link-arg=-Wl,--discard-all \
           -C link-arg=-Wl,--strip-all
  1. Statically Linked: -C target-feature=+crt-static

  2. Self Contained: -C link-self-contained=yes

  3. All Features: --all-features

  4. Link Time Optimization: -C lto=yes

  5. All Optimizations: -C opt-level=3

  6. Stripped: -C debuginfo=none -C strip=symbols

  7. No System Libraries: Crates with system library dependencies will fail by design, as we target pure Rust implementations

    #These crates would error out in the following manner
    error: could not find system library 'openssl' required by the 'openssl-sys' crate
    error: Could not find system library 'sqlite3'
    error: pkg-config has not been configured to support cross-compilation
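
To show how the flag list above could reach the compiler, here is a minimal sketch that joins the flags into one string and exports it as RUSTFLAGS. The helper name static_rustflags is hypothetical; the flags themselves are copied from the list. Note that --all-features is an argument to the cargo build command rather than a compiler flag, so it is not part of this string.

```shell
# static_rustflags: print the flag list as one space-separated line.
# Backslash-newline inside double quotes is a line continuation, so the
# result contains no newlines.
static_rustflags() {
    printf '%s\n' "-C target-feature=+crt-static \
-C default-linker-libraries=yes \
-C link-self-contained=yes \
-C prefer-dynamic=no \
-C embed-bitcode=yes \
-C lto=yes \
-C opt-level=3 \
-C debuginfo=none \
-C strip=symbols \
-C link-arg=-Wl,-S \
-C link-arg=-Wl,--build-id=none \
-C link-arg=-Wl,--discard-all \
-C link-arg=-Wl,--strip-all"
}

# Export for any cargo-based build started from this shell.
RUSTFLAGS="$(static_rustflags)"
export RUSTFLAGS
printf '%s\n' "$RUSTFLAGS"
```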
    

Build Tool

With over 10,000 crates to build on individual GitHub Actions runners, speed was paramount. While Cargo offers cross compilation features, it requires significant setup overhead. We needed a solution that worked out of the box.

Our heavy Docker images used for official packages took 2-3 minutes just to pull and extract, making them unsuitable at this scale. This left us with rust-cross/cargo-zigbuild & cross-rs/cross. After some local testing, we decided to use Cross, as it supported all the targets we needed & worked as advertised: “Zero setup” cross compilation.

We also used jpeddicord/askalono to automatically detect & copy over licenses.

#The CMD Looks like
cross +nightly build --target "${RUST_TARGET}" -Z unstable-options \
     --all-features \
     --artifact-dir="${C_ARTIFACT_DIR}" \
     --jobs="$(($(nproc)+1))" \
     --release \
     --verbose

Build Targets

While Soar supports any Unix-based distro, GitHub Runners lack native (non-VM) CI support for other Unix kernels, so we are limited to Linux only. We further refined our target matrix by excluding architectures approaching end-of-life:

HOST_TRIPLET         RUST_TARGET
aarch64-Linux        aarch64-unknown-linux-musl
loongarch64-Linux    loongarch64-unknown-linux-musl
riscv64-Linux        riscv64gc-unknown-linux-musl
x86_64-Linux         x86_64-unknown-linux-musl
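
The mapping in the table above can be expressed as a small lookup. This is only a sketch: the function name rust_target_for is hypothetical, and deriving the host triplet from uname is an assumption about how such a value might be produced, not a description of the actual workflow.

```shell
# rust_target_for HOST_TRIPLET: print the matching Rust target triple,
# or return 1 for hosts outside the build matrix.
rust_target_for() {
    case "$1" in
        aarch64-Linux)     echo "aarch64-unknown-linux-musl" ;;
        loongarch64-Linux) echo "loongarch64-unknown-linux-musl" ;;
        riscv64-Linux)     echo "riscv64gc-unknown-linux-musl" ;;
        x86_64-Linux)      echo "x86_64-unknown-linux-musl" ;;
        *)                 echo "unsupported host: $1" >&2; return 1 ;;
    esac
}

# Assumption: the host triplet is spelled "<machine>-<kernel>", as uname reports them.
HOST_TRIPLET="$(uname -m)-$(uname -s)"
RUST_TARGET="$(rust_target_for "$HOST_TRIPLET")" || RUST_TARGET=""
echo "${HOST_TRIPLET} -> ${RUST_TARGET:-unsupported}"
```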

Build Security

We are aware of issues like https://github.com/rust-lang/cargo/issues/13897, so we wanted this to be as secure as our official repositories, by ensuring:

These measures ensure that even if a malicious crate attempts to compromise the system, its impact is isolated and cannot affect other crates' integrity.


Build Workflow

10,000 crates multiplied by 4 targets meant we would need to run ~ 40,000 CI instances while also handling metadata, sanity checks, and uploads to GHCR, all at the same time. We also set up a Discord webhook to stream real-time progress updates to our Discord server.

graph TD
  A[🦀 Crates.io API] -->|📊 Scrape Metadata 🧬| B[🔍 Filter & Queue ⏳]
  B --> C[🏗️ GitHub Actions Matrix 🔄]
  C --> D[⚙️ Cross Compiler ⚒️]
  D --> E[📦 Static Binaries 🛠️]
  E --> F[🗄️ GHCR Registry ⬆️]
  F -->|📊 Generate Metadata 🧬| B
  F --> G[🚀 Soar ⬇️]
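
To make the crates-times-targets fan-out concrete, the sketch below expands a crate list into one job per (crate, target) pair. The function name emit_jobs and the crate name example-crate are made up for illustration; how the real workflow batches these jobs is not described here. GitHub Actions limits a single matrix to 256 jobs per workflow run, so a queue of this size would have to be split across many runs in some way.

```shell
# The four Rust targets from the build matrix.
TARGETS="aarch64-unknown-linux-musl loongarch64-unknown-linux-musl riscv64gc-unknown-linux-musl x86_64-unknown-linux-musl"

# emit_jobs: read crate names on stdin, print one "crate target" pair per line.
emit_jobs() {
    while IFS= read -r crate; do
        # Skip empty lines in the queue.
        [ -n "$crate" ] || continue
        for target in $TARGETS; do
            printf '%s %s\n' "$crate" "$target"
        done
    done
}

# Two crates expand to eight jobs.
printf 'fd-find\nexample-crate\n' | emit_jobs

# Job count for the full queue: crates x targets.
echo "total jobs: $((10033 * 4))"
```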


Key Insights and Findings

Build Success vs. Failure

We approached this project with optimistic expectations but encountered a sobering reality. Out of approximately 10,000 crates queued for building:

🏗️ Build Pipeline by Success Rate
────────────────────────────────────────────────────────────────────────
✅ Queued    ████████████████████████████████████████    10,033 (100.0%)
⚙️ Built     ███████████████████████████                  5,779 (57.60%)
❌ Failed    ████████████                                 4,254 (42.40%)
────────────────────────────────────────────────────────────────────────
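
The percentages in the chart follow directly from the counts. A small sketch to reproduce them (the helper name pct is ours):

```shell
# pct PART TOTAL: print PART as a percentage of TOTAL with two decimals.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", part * 100 / total }'
}

QUEUED=10033
BUILT=5779
FAILED=$((QUEUED - BUILT))   # 4254, matching the chart

echo "built:  $BUILT ($(pct "$BUILT" "$QUEUED")%)"
echo "failed: $FAILED ($(pct "$FAILED" "$QUEUED")%)"
```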

So what went wrong? We sampled about 100 of these error logs & concluded:

  1. System Library Dependencies: The majority of failures stemmed from crates requiring system libraries that weren't available in our static build environment

  2. Custom Build Systems: Many crates include build.rs files that fail when specified dependencies aren't met or when detecting system features during cross-compilation

     #These typically fail cross-compilation
     build.rs files that:
     - Detect system features
     - Link against system libraries
     - Generate code based on target environment
    

Despite years of Rust ecosystem maturation, system library dependencies remain the primary obstacle to universal static compilation. This reinforces our strategy of targeting CLI tools that can be fully statically linked.
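
We triaged the sampled logs by hand, but a first pass could be automated along these lines. This is a hypothetical sketch, not the tooling we used: the function name classify_log and the category labels are invented, the first two patterns come from the error messages quoted in the Build Constraints section, and the third is the message Cargo prints when a build.rs script exits with an error.

```shell
# classify_log: read a build log on stdin, print one coarse failure category.
# Patterns are checked in order, so the most specific ones come first.
classify_log() {
    log="$(cat)"
    case "$log" in
        *"pkg-config has not been configured to support cross-compilation"*)
            echo "pkg-config-cross" ;;
        # Matches both "could not find" and "Could not find".
        *"ould not find system library"*)
            echo "system-library" ;;
        *"failed to run custom build command"*)
            echo "build-script" ;;
        *)
            echo "other" ;;
    esac
}

printf '%s\n' "error: could not find system library 'openssl' required by the 'openssl-sys' crate" | classify_log
printf '%s\n' "error: pkg-config has not been configured to support cross-compilation" | classify_log
```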


Crates vs Executables

Another interesting insight from building at scale: many crates produce multiple executables. The ~ 5,800 crates we built generated ~ 21,000 individual executables (also referred to as binaries or packages).

🏗️ Build Pipeline by Executables
──────────────────────────────────────────────────────────────────────────────────
📦 Crates Built       ██████████                                   5,779 (100.0%) #Base Line
⚙️ Total Executables  ████████████████████████████████████████     21,042 (364.1%)
──────────────────────────────────────────────────────────────────────────────────

This 3.6:1 ratio reveals how rich the Rust CLI ecosystem actually is.


Native vs Cross

ℹ This counts the executables generated & not individual crates. A single crate may generate multiple executables. (See Above)

🏗️ Build Pipeline by Architecture
─────────────────────────────────────────────────────────────────────────────────
🖥️  x86_64-Linux     ████████████████████████████████████████     5,627 (100.00%) #Base Line
🖥️  aarch64-Linux    ███████████████████████████████████████▌     5,586 (99.30%)
🖥️  riscv64-Linux    ██████████████████████████████████████▎       5,370 (95.40%)
🖥️ loongarch64-Linux ███████████████████████████████▋              4,459 (79.20%)
─────────────────────────────────────────────────────────────────────────────────
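
As a quick cross-check of the chart, the sketch below recomputes each target's rate relative to the x86_64 baseline and sums the four counts (the helper name target_summary is ours). The sum comes to 21,042, the same figure as the executables total reported in the previous section, which suggests that total is counted per target.

```shell
# Per-target counts copied from the chart, baseline first.
counts='x86_64-Linux 5627
aarch64-Linux 5586
riscv64-Linux 5370
loongarch64-Linux 4459'

# target_summary: read "target count" rows on stdin; the first row is the baseline.
target_summary() {
    awk '
        NR == 1 { base = $2 }
        { sum += $2; printf "%-18s %5d  %.1f%%\n", $1, $2, $2 * 100 / base }
        END { printf "%-18s %5d\n", "sum", sum }
    '
}

printf '%s\n' "$counts" | target_summary
```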

The success rates are consistent across the established architectures, which demonstrates Rust's excellent cross-platform story, though newer architectures like loongarch64 show noticeably lower compatibility (79.2% of the x86_64 baseline). This suggests that architecture-specific code assumptions remain common in the ecosystem.

An interesting anomaly: Despite building 5,779 crates successfully, x86_64-Linux only shows 5,627 executables. This discrepancy occurs because some crates successfully build for non-standard targets like loongarch64-Linux and riscv64-Linux but fail for standard architectures due to build hooks and scripts that trigger differently across targets.

You can explore detailed per-target build results here: CRATES_BUILT.json


CI Performance Metrics

Our primary build workflow (matrix_builds.yaml) handles the bulk of compilation, with additional workflows managing metadata and miscellaneous tasks. As we implement incremental builds (only rebuilding updated crates) and caching strategies, these metrics will improve significantly.

Average build time was ~ 2 minutes.


Review

Compilation vs. Prebuilt Distribution

Compilation will always be slower than fetching prebuilt binaries, but the degree varies significantly based on crate complexity and dependency count. For our demonstration, we'll use fd-find as a representative example, though your experience may vary with more dependency-heavy crates.

Note: We're not measuring CPU, disk, memory, or bandwidth usage here; try it yourself to experience the full performance difference.

Cargo

$ time cargo install fd-find

real    0m51.893s
user    3m36.568s
sys     0m24.411s

Cargo Binstall/Quick Install

Cargo Binstall leverages prebuilt binaries, though it requires time for crate resolution: related issue

$ time cargo binstall fd-find --no-confirm

real    0m6.001s
user    0m0.118s
sys     0m0.083s

Cargo-Binstall and Cargo-Quickinstall are excellent tools that:

  • Integrate with cargo install workflow

  • Handle development dependencies and features

  • Target developers who want faster cargo install

Soar takes a different approach:

  • Distribution-focused: Static executables for end users

  • No development integration: Not meant for cargo workflows

  • Dependency-free: Zero system library requirements

  • Cross-distribution: Works on any *nix system (MUSL/GLIBC)

Soar

#Soar uses pkg_ids to ensure exact match because we have too many packages
$ time soar install "fd#pkgforge-cargo.fd-find.stable:pkgforge-cargo"

real    0m1.695s
user    0m0.062s
sys     0m0.090s
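
For a rough sense of scale, the wall-clock ("real") times above can be turned into ratios. Each figure comes from a single run on one machine, so treat the result as indicative only; the helper name speedup is ours.

```shell
# speedup SLOW FAST: print how many times longer SLOW took than FAST.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# "real" times in seconds, copied from the three runs above.
echo "cargo install  vs soar: $(speedup 51.893 1.695)x"
echo "cargo binstall vs soar: $(speedup 6.001 1.695)x"
```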

Conclusion

This project represents more than just a build farm; it's a proof of concept & also a reality check for the whole ecosystem.

Key Discoveries and Implications

The Rust CLI ecosystem is remarkably rich and diverse. Our 3.6:1 ratio of executables to crates reveals that the community is building comprehensive tool suites rather than single-purpose utilities. This multiplier effect means that successfully building even a subset of available crates provides considerably more value to end users.

Cross-compilation compatibility has room for improvement. While Rust's cross-platform story is generally excellent, our 42.4% failure rate highlights that system library dependencies and architecture-specific assumptions remain significant obstacles. This suggests opportunities for the community to develop more portable alternatives to system library bindings.

Static linking is both powerful and challenging. The ability to produce truly portable binaries that work across any Linux distribution without dependencies is transformative for CLI tool distribution. However, achieving this requires careful consideration of build flags, dependencies, and compilation strategies.

Broader Ecosystem Implications

Our work demonstrates that automated, large-scale binary distribution is not only feasible but can provide significant value to the developer community. The time savings alone, from nearly a minute of compilation time to under two seconds of download time, represent a meaningful improvement in developer productivity.

More importantly, this approach democratizes access to CLI tools. Users no longer need to have Rust installed, understand compilation flags, or debug dependency issues. They can simply install and use tools, lowering the barrier to entry for adopting Rust-based CLI utilities.


Future Roadmap

The pkgforge-cargo project will likely see these additions/improvements in the near future:

  • Automated updates: Rebuild crates when new versions are published (this is partially implemented)

  • Integration with Cargo: Maybe something similar to what `cargo binstall` does.

  • Build optimization: Optimize CI build times & reduce failures

  • Contribute Upstream: Opt-in system to automatically create GitHub issues with build logs when crate compilation fails, helping maintainers improve cross-compilation compatibility

  • Community Feedback: Listen to our users & the community to improve this project, and hopefully see widespread adoption beyond Soar.

As we continue to refine and expand this system, we're excited about its potential to influence how the broader software community thinks about binary distribution. The lessons learned here apply beyond Rust to any compiled language ecosystem, and we're eager to explore applications in Go, Zig, and other emerging systems languages. (Help us if you can)

The ultimate goal is to create a world where installing and using CLI tools is as simple as possible, regardless of the underlying programming language or system dependencies. This project represents a significant step toward that vision, and we're committed to continued innovation in this space.

We invite the community to engage with this work, contribute improvements, and help us build a more accessible and efficient software distribution ecosystem. Together, we can make powerful CLI tools available to everyone, everywhere, without the traditional barriers of compilation and dependency management.

