Optimizing Large Android Project Builds: When Recommended Tunings Aren't Enough - Part 2


In the previous part, we discussed why solely following standard advice often isn't enough to boost your build performance. In this part, we will explore strategies to truly unlock your builds, while also keeping them consistently optimized and ensuring no regressions silently sneak back in.
Keys to improving build performance
Understand the tools
If you have read through the previous part, you have likely noticed a common pattern behind why simply enabling the recommended features falls short: developers often do not understand how the tools work. They tend to understand what these features do, but not how they do it. And that how is essential when it comes to troubleshooting build issues and getting the most out of each feature.
At the end of the day, improving build time comes down to clearing the way for the existing features to work at their best. There is no need to introduce new techniques, only to adjust the system so that the Gradle features can do what they are designed to do. Understanding the tools deeply is therefore the inevitable path to success.
Take Build Cache as an example. Its concept is straightforward, not new, and relatively easy to understand: it skips task execution when nothing has changed between builds. Most developers would easily grasp the idea, or the what. However, imagine a second build launched right after the first one without modifying anything, and yet some tasks still get executed. Knowing the what might help you spot that something's off, but it will not guide you in investigating and fixing it. This is where understanding the how truly matters. Moreover, knowing Build Cache's input-output model helps ensure you do not introduce regressions while extending your build scripts.
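To make the input-output idea concrete, here is a minimal sketch in Gradle Kotlin DSL (the task and property names are illustrative, not from our project) of a cacheable task with explicitly declared inputs and outputs:

```kotlin
// build.gradle.kts: Gradle hashes every declared input into a cache key; when
// the key matches a previous build, it restores the declared outputs instead
// of executing the task.
@CacheableTask
abstract class GenerateVersionFile : DefaultTask() {
    @get:Input
    abstract val versionName: Property<String>    // part of the cache key

    @get:OutputFile
    abstract val outputFile: RegularFileProperty  // restored on a cache hit

    @TaskAction
    fun generate() {
        outputFile.get().asFile.writeText("version=${versionName.get()}")
    }
}

tasks.register<GenerateVersionFile>("generateVersionFile") {
    versionName.set("1.2.3")
    outputFile.set(layout.buildDirectory.file("generated/version.txt"))
}
```

The regression risk mentioned above usually comes from undeclared inputs: if the task action also read, say, a timestamp or an environment variable that is not declared as an @Input, its output could change while the cache key stays the same, quietly producing stale results.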
In the Android ecosystem, understanding the tools starts with understanding Gradle itself and its build mechanisms: the Gradle Daemon, Build Cache, incremental builds, parallelism, and so on. Next comes the Android Gradle Plugin (a.k.a. AGP), another major tool managing Android-specific builds. A solid grasp of AGP helps you navigate platform-specific concerns like build types, variants, minification, and others. Beyond that, it's also essential to understand the programming languages involved and their compilation processes. Knowing how Java and Kotlin handle compilation, especially features like compilation avoidance and the differences between the two, allows you to write production code that's much more build-friendly. Lastly, familiarity with the underlying build environment, the JVM, can be a great advantage when it comes to optimizing build memory usage or tuning other performance-related aspects.
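For reference, the Gradle-level mechanisms above are toggled through gradle.properties. A minimal sketch with illustrative values, not our production settings (the right heap size in particular depends on your project):

```properties
# gradle.properties

# Build Cache: reuse task outputs from previous builds
org.gradle.caching=true

# Run independent tasks from different modules in parallel
org.gradle.parallel=true

# The Gradle Daemon is on by default; this line just makes it explicit
org.gradle.daemon=true

# Heap size and collector for the build JVM (tune per project)
org.gradle.jvmargs=-Xmx6g -XX:+UseG1GC
```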
Understand your project
To choose the right tools or setups, knowing your project’s characteristics and what it needs is everything. Get inspired by solutions from other projects, but always remember that your project is different.
For instance, both Android builds and Java servers run on the JVM, but they are optimized for different goals. A Java server JVM is tuned for runtime performance, focusing on high throughput, low latency, and efficient resource management (memory, CPU) to handle concurrent requests and long-running processes. Android builds, on the other hand, are short-lived processes that prioritize build speed, consistency, and toolchain stability.
As a result, a GC optimized for a server JVM may not work well for an Android build. Dalvik, Android's former runtime, is a great example of this distinction: it was designed specifically for the mobile application runtime environment, with different priorities.
Another example: if your project is structured as a large single module with a high volume of unit tests, enabling parallel test execution is critical and can significantly speed up your testing task.
However, if your project is already structured as a multi-module setup, this option generally won't provide much additional benefit when running all unit tests across the entire project, and in some situations it may even backfire, burdening the build with the cost of spawning extra processes.
Note that there are still cases where this feature is quite useful even in a multi-module project; I will show an example of that in a later section. For the single-module case, the configuration is sketched below.
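The knob for intra-module test parallelism is maxParallelForks on the Test task. A minimal sketch in Gradle Kotlin DSL (the fork count is illustrative; tune it to your machine):

```kotlin
// build.gradle.kts of the large module: fork several test JVMs so that
// test classes within this single module run in parallel.
tasks.withType<Test>().configureEach {
    maxParallelForks = (Runtime.getRuntime().availableProcessors() / 2).coerceAtLeast(1)
}
```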
Understand build environment
So far, we have seen that a tool's efficiency can vary greatly from project to project. Another key factor that contributes significantly to the effectiveness of these solutions is the build environment.
Building locally is different from building in CI; or, in the Android context, building a debug version is different from building a release one.
Locally, incremental builds and the build cache are incredibly efficient because caches are stored on your machine and you probably have enough space to retain them over time. Furthermore, the differences between consecutive builds are often minimal, which allows the JVM to optimize itself and improve performance across builds. As a result, building locally is often faster, and you have the flexibility to fine-tune builds based on your machine's characteristics.
In contrast, building on CI is a bit trickier. For instance, if you are using ephemeral agents, you don't have persistent storage on the agents to retain cache data. As a result, incremental builds and the build cache can't function as straightforwardly as they do in local environments. In addition, the JVM's ability to optimize itself over successive runs is completely lost.
Note: while there is no way to preserve incremental builds, you can still apply the build cache in your CI setup. I will talk about this in more detail in the next section.
Secondly, CI agents often come with hardware limitations. If you aren't given a well-provisioned setup, it will be challenging to get builds running effectively on these constrained agents.
With that, knowing the characteristics and limitations of your build environment is important, because even the best tools can fall short if the environment isn't set up to support them.
Do proper benchmarks and monitoring
If there is no one-size-fits-all solution, and the effectiveness of build tools depends heavily on the project and environment, how can we ensure that a solution actually works well for our use cases?
Well, benchmark it!
Only benchmarking can validate the efficiency of a solution. No theory can guarantee that a specific Garbage Collector will fit your system well; benchmarking is the reliable way to find out.
Besides, setting up a build metrics monitoring system is just as important. Like benchmarking, it helps validate a solution's outcomes. However, benchmarking is a tricky business: it requires deep expertise, as it is not always easy to accurately simulate a real-world system. If any part of the setup doesn't reflect your actual production scenario, the benchmark results may be misleading. A monitoring system, on the other hand, captures data from real usage, making it far more trustworthy and truly reflective of how your builds actually perform. The only tradeoff is that a monitoring system takes time to gather data, while benchmarking gives quicker insights.
Beyond validation, a monitoring system also offers other benefits: it provides a holistic view of build health and helps detect bottlenecks or spot areas for potential improvement.
Ultimately, benchmarking and monitoring provide the data-driven foundation needed to validate any build solution; without them, new tools should be applied with caution.
Real-world scenarios
Enough with the theory. Let's dive into a series of practical solutions my team and I applied while optimizing a large-scale Android project with over 500 modules.
Keep in mind that these solutions sit on top of the standard tunings mentioned earlier, which are already enabled by default.
Remote build cache 😎
Our main project is backed by a huge volume of unit tests, around 75k in total. We run the unit test CI pipeline as a mandatory check for every commit in pull requests (PRs). The full test suite takes almost an hour and a half to complete🔥🔥, which is certainly an unacceptable amount of time for a single PR check.
As explained earlier, having incremental builds and the build cache enabled doesn't cut a single second off CI build time. Therefore, we had to deploy a remote build cache system. The core idea remains the same as the local build cache; the only difference is that the cache is stored remotely instead of on local machines. We use S3 as the remote storage to hold the cache and share it across builds. This dramatically reduced our unit test execution time from a consistent hour and a half to somewhere between several minutes and 40 minutes🚀🚀, depending on how many tests are skipped thanks to the cache.
While unit tests were the original reason we set up this remote build cache, all other builds have benefited as well, leading to a significant overall reduction in build times.
Note: We don’t share this remote cache with local builds for a couple of reasons:
Local builds are already well supported by a rich set of local caches.
Local environments tend to produce noisier, less consistent builds, and pushing cache entries from them could degrade the overall quality of the shared remote cache.
While offering little benefit to local builds, interacting with the remote cache costs additional money.
You can find details on how to enable a remote build cache here, or consider using the Develocity solution to help set it up and manage it more easily.
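For a rough idea of the shape of such a setup, here is a sketch of a settings.gradle.kts using Gradle's built-in HttpBuildCache; our actual S3 backend goes through a third-party plugin, and the endpoint and environment variable names below are hypothetical:

```kotlin
// settings.gradle.kts
val isCi = System.getenv("CI") != null

buildCache {
    local {
        isEnabled = !isCi  // ephemeral CI agents gain nothing from a local cache
    }
    if (isCi) {            // per the list above, local builds never touch the remote cache
        remote<HttpBuildCache> {
            url = uri("https://build-cache.example.com/cache/")  // hypothetical endpoint
            isPush = true  // CI both reads and populates the shared cache
            credentials {
                username = System.getenv("BUILD_CACHE_USER")
                password = System.getenv("BUILD_CACHE_PASSWORD")
            }
        }
    }
}
```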
Addressing build cache misses 😎
After enabling the remote build cache, we still observed an odd behavior: all unit tests in one particular module (the application module, specifically) were re-executed even when no changes had been introduced in the new builds.
This is when we learned about build cache misses and how they were affecting our builds. We manually diagnosed the build cache using the -Dorg.gradle.caching.debug=true flag, as suggested here. While this is a rather time-consuming and challenging approach (I will show a much simpler alternative a bit later), the manual debugging process helped us identify two libraries as the root cause: NewRelic and Crashlytics. If you are interested in the details, I opened tickets in both repositories: NewRelic and Crashlytics. Unfortunately, neither library has provided a proper fix so far. We implemented our own workaround for the NewRelic case, but the Crashlytics issue remains unresolved. Luckily, the Crashlytics problem only affects release builds, which represent a smaller portion of the builds in our workflow.
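For reference, the manual diagnosis boils down to running the same task twice with no changes and comparing the logged cache-key components; a sketch (the task and file names are illustrative):

```bash
# The flag logs one "Appending ... to build cache key" line per task input.
./gradlew :app:testDebugUnitTest -Dorg.gradle.caching.debug=true | grep Appending > first.log
./gradlew :app:testDebugUnitTest -Dorg.gradle.caching.debug=true | grep Appending > second.log

# Any line that differs names the input responsible for the cache miss.
diff first.log second.log
```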
Develocity 😎
Up until that point, we had been receiving frequent complaints about local builds hanging for dozens of minutes, sometimes even half an hour. Unfortunately, we had no visibility into what happened on other people's machines. Given that we were actively working on improving our build system (migrating tools, troubleshooting build issues, optimizing build pipelines, and so on), it became clear that we were missing a proper monitoring system to support this work. This is when we decided to run a POC with Develocity, a tool that could address all of these needs.
Develocity, formerly known as Gradle Enterprise, is a tool that monitors all your builds, both CI and local.
The first thing this tool unlocked was a far simpler way to debug cache-miss issues. Using it, we discovered numerous other cache misses across different scenarios and resolved them, which ultimately yielded an approximate 30%🚀🚀 reduction in build times across all pipelines.
In general, we use Develocity to:
Monitor builds and get a holistic view of the system's health, both locally and in CI.
Improve unit test speed as well as quality with the Predictive Test Selection and Flaky Test features.
Investigate issues more effectively. For example, by looking at some of the longest local builds, we found that the hanging-build issue often came from dependency fetching, usually because people weren't connected to the company VPN, since some dependencies are hosted on our internal Artifactory instances.
Identify areas for improvement. I have a great example of this that I will talk about shortly.
Validate our solutions. With comprehensive build metrics, it's easy to back up changes with clear data, whether through simple charts or key numbers.
Build regression pipeline 😎
The example above shows how impactful a cache miss can be on the build. That's why it's important not only to resolve existing cache misses, but also to prevent new ones from being introduced into the project. Investigating cache misses is time-consuming, and we definitely don't want to do it regularly. Hence, recurring jobs that run the build validation scripts help automatically detect cache misses early, ensuring the quality and reliability of our build cache over time.
Here is an example of how Netflix did it.
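The core of such a pipeline can be sketched in a few lines: build the same commit twice from two fresh checkouts and verify that the second build is served from the cache (a simplified sketch; the repository URL and task are placeholders):

```bash
# First build populates the cache; the second should hit it for every task.
git clone "$REPO_URL" checkout-a && (cd checkout-a && ./gradlew testDebugUnitTest --build-cache)
git clone "$REPO_URL" checkout-b && (cd checkout-b && ./gradlew testDebugUnitTest --build-cache --scan)

# Fail the job if the second build executed any cacheable task instead of
# resolving it FROM-CACHE, e.g. by inspecting its build scan or console output.
```

Using two separate checkout directories also catches relocatability problems, i.e. tasks whose cache keys leak absolute paths.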
Parallel unit tests 😎
In an earlier section, I mentioned that parallel test execution isn't beneficial in a multi-module project. This is because, when running unit tests across the entire project, each module runs its tests in a separate task. And since parallel task execution is already enabled, those test tasks naturally run in parallel anyway.
However, there are still cases where parallel test execution is actually useful. The chart below, taken from the Develocity dashboard, roughly shows what our unit test pipeline looked like at the time:
It is not hard to spot the problem in this chart. The pipeline starts off strong, utilizing all 6 cores in parallel. But as the build progresses, most modules finish quickly, except for two large ones: :payment and :search. These two test tasks continue running on just two cores while the remaining cores sit idle.
What a waste of resources!
The solution was to break these two heavy modules' test runs into smaller chunks, which means we enabled parallel test execution only for :payment and :search. This allows their tests to run across more cores, in other words, in parallel. This way, we could utilize all the available resources to significantly speed up the build.
Here’s what the build looked like after applying the fix:
This piece of work improved unit test builds involving these two modules by 40%🚀🚀, excluding cases where the results were already cached. This is a great example of how understanding your project, and having proper monitoring in place, can lead to meaningful improvements in your builds.
Garbage Collector benchmarking 🤔
This suggestion from Google also caught our attention: they recommended trying the Parallel GC. We decided to run a benchmark comparing G1 GC and Parallel GC.
We used gradle-profiler and set up a CI pipeline to run the benchmark overnight (we benchmark with fresh builds, without cache, so a run often takes hours to finish).
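For context, gradle-profiler drives this kind of comparison from a scenario file; a sketch (the scenario names, task, and heap size are illustrative, not our exact setup):

```
# gc.scenarios, run with:
#   gradle-profiler --benchmark --scenario-file gc.scenarios --iterations 10 g1 parallel
g1 {
    tasks = ["assembleDebug"]
    cleanup-tasks = ["clean"]           # fresh build every iteration
    gradle-args = ["--no-build-cache"]
    jvm-args = ["-Xmx6g", "-XX:+UseG1GC"]
    warm-ups = 2
}
parallel {
    tasks = ["assembleDebug"]
    cleanup-tasks = ["clean"]
    gradle-args = ["--no-build-cache"]
    jvm-args = ["-Xmx6g", "-XX:+UseParallelGC"]
    warm-ups = 2
}
```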
Here’s an example showing how the Parallel GC performed on one of our pipelines compared to G1 GC:
Based on the results, we didn't find enough evidence to justify switching to Parallel GC 😞😞, so we are sticking with G1 GC for now.
Takeaways
There's no shortcut to optimizing build performance. While Gradle's default options can improve 70% of your builds, the remaining 30% will drag them down if you don't know how to navigate them. The key to overcoming build issues is a deep understanding of the ecosystem:
Learn to use the major tools in the Android build system effectively: Gradle, AGP, the languages (Java/Kotlin), and the JVM.
Understand your project from a holistic perspective. This includes setting up proper metrics and monitoring to consistently track the system's health and spot regressions early.
Always experiment with and benchmark the impact of a feature on your project before applying it.
Consistent, informed iteration is what leads to meaningful and sustainable performance gains.
Resources
For those who want to learn more about Gradle and AGP, I strongly recommend this book: