Speed Up Rust Code Easily with Rayon

riccardoriccardo
3 min read

Why Performance Is Hard

I wanted to speed things up in Rust and let's be honest, threads are one of the best tools to improve performance in Rust..

But using the Tokio crate can be quite unintuitive and difficult to use, all those await and features...

Here's where rayon helps us, allows us to parallelize tasks without having to think about threads. It's simple to use, fast, lightweight, and just works.


A simple definition

Rayon is a library that helps you run code in parallel, making it easy to turn slow, step-by-step computations into faster ones that use multiple CPU cores.

It's a small and easy-to-use tool that lets you add parallelism. It makes sure your code runs safely without data races, and it only uses parallelism when it makes sense, depending on the amount of work at runtime.

For example, we can simply turn this line:

let results: Vec<_> = data.iter().map(|x| x.do_something()).collect();

into:

use rayon::prelude::*;

let results: Vec<_> = data.par_iter().map(|x| x.do_something()).collect();

Using the prelude is the easiest way to do parallelism using rayon in rust.


Let's break down performance

Without rayon

I ran the following code which iterates from 1 to 1,000,000, computes the cube (x.pow(3)) and the square (x.pow(2)) of each number, takes the remainder of both results using modulo 97,531, then sums those two remainders.

I ran it using cargo run without any optimization:

  • Finished dev profile [unoptimized + debuginfo] target(s) in 0.86s

  • Running target\debug\ry.exe

  • 2, 12, 36, 80, 150, 252, 392, 576, 810, 1100

These are the CPU specs:

  • CPU Name: Intel(R) microarchitecture code named Alderlake-S

  • Frequency: 2.5 GHz

  • Logical CPU Count: 12

fn main() {
    let data: Vec<u64> = (1..1_000_000).collect();
    let results: Vec<u64> = data.iter()
        .map(|x| x.pow(3) % 97_531 + x.pow(2) % 97_531)
        .collect();
    println!("{:?}", &results[..10]);
}

I measured performance using Intel Vtune profiler and we can see that without using rayon it needs 0.041s using 1 single thread

and the function which needs more time is the main, because we iterate, calculate and collect the result


With rayon

The computation level is the same as before, but this time we use rayon:

use rayon::prelude::*;

fn main() {
    let data: Vec<u64> = (1..1_000_000).collect();
    let results: Vec<u64> = data.par_iter()
        .map(|x| x.pow(3) % 97_531 + x.pow(2) % 97_531)
        .collect();
    println!("{:?}", &results[..10]);
}

adding rayon = "1.10.0" to your Cargo.toml dependencies

I compiled without optimizations:

  • Finished dev profile [unoptimized + debuginfo] target(s) in 0.02s

  • Running target\debug\ry.exe

  • [2, 12, 36, 80, 150, 252, 392, 576, 810, 1100]

Already now we can see that the program ran in 0.02 seconds, compared to 0.86 seconds without Rayon, but let's see in detail:

  • First, we can see it uses 8 threads instead of just one

  • We see that it took 0.029 seconds instead of 0.041s

  • CPU status is constantly in idle mode instead of poor as before

As before all the effective time is used by one single function which is the last called


When (and When Not) to Use Rayon

The ideal use cases are CPU-bound work, large datasets, pure functions, sorting, etc.

Instead for small workloads, shared mutable state or I/O-heavy tasks it's better to use the Tokio runtime if you really need it. The Tokio module supports fs, time, command execution, net and a lot more using multithreading, but that's another topic I'll write about...


Other stuff Rayon does

Beyond .map and .par_iter Rayon also includes .filter(), .reduce(), .for_each(), join() for parallel sorting


To sum up

Rayon isn't always the best choice. Still, it's a smart and safe way to add parallelism. It helps you scale workloads with minimal code changes, making it a solid choice for performance-critical applications.

๐Ÿ’ก Got another crate in mind?

โ˜• Was this helpful?

Treat me to a coffee on Ko-fi https://ko-fi.com/riccardoadami

0
Subscribe to my newsletter

Read articles from riccardo directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

riccardo
riccardo

Passionate about technology ,software development and embedded. I love writing clean code, learning new things and sharing experiences with the community. Always ready to discover something new, fast learner.