Fix flaky RSpec tests fast with parallel_tests and marsh_grass


One of the hardest parts of debugging and fixing flaky RSpec tests is reproducing the failure reliably, or even reproducing it at all. Several years ago, an experiment at my company produced the marsh_grass gem. This gem provides several tools for debugging flaky tests, including the repetitions metadata, which runs a test several times in a row. I have found this feature especially useful on tests that refuse to fail consistently. Running the test many times gives an indication of the failure rate, so I can tell whether a change has improved things.
Until recently, this was still relatively slow. I’d usually want at least 100 repetitions, and running those sequentially could take several minutes. It was the best option I had until we set up turbo_tests. Now I could run my entire test suite locally in parallel, which was itself a big win.
But then I realized something else: turbo_tests brings in parallel_tests, which can also run any command in parallel. That meant I could set a relatively low number of repetitions and run that batch in several processes at once. By default, it runs one process per CPU core. My computer has 10 CPU cores, so 10 repetitions in each of 10 processes gives me my usual 100 repetitions in just over a tenth of the time.
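To make the arithmetic concrete, here is a small sketch of how repetitions and processes combine, and how likely a given number of runs is to surface a failure at all. The helper names are hypothetical (they are not part of marsh_grass or parallel_tests), and the second calculation assumes each run fails independently with the same probability, which real flaky tests may not do.

```ruby
# Hypothetical helpers for illustration; not part of marsh_grass or parallel_tests.

# Total runs when each of `processes` workers runs the example `repetitions` times.
def total_runs(repetitions:, processes:)
  repetitions * processes
end

# Chance of seeing at least one failure in `runs` runs, assuming every run
# fails independently with probability `failure_rate`.
def chance_of_seeing_failure(failure_rate:, runs:)
  1.0 - (1.0 - failure_rate)**runs
end

runs = total_runs(repetitions: 10, processes: 10)
puts "total runs: #{runs}"
puts "chance of at least one failure at a 1% failure rate: " \
     "#{chance_of_seeing_failure(failure_rate: 0.01, runs: runs).round(3)}"
```

Under that independence assumption, a test that fails only 1% of the time has a better-than-even chance of failing at least once in 100 runs, which is roughly why I aim for that many.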
Here’s how it works. When I’m debugging a flaky test, I’ll first add the repetitions metadata.
# flaky_spec.rb
RSpec.describe 'flakiness' do
  it 'is arbitrarily flaky', repetitions: 10 do
    expect(rand(0...100)).to be < 75
  end
end
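For a sense of what failure rate to expect from this example, the same condition can be simulated outside RSpec. This is only a sketch: failure_rate is a hypothetical helper, and it mirrors the expectation above rather than using anything from marsh_grass.

```ruby
# Hypothetical helper: simulates the example's expectation without RSpec.
# A run counts as a failure when rand(0...100) is NOT below 75.
def failure_rate(trials:, rng: Random.new)
  failures = trials.times.count { rng.rand(0...100) >= 75 }
  failures.to_f / trials
end

puts "observed failure rate: #{failure_rate(trials: 10_000)}"
```

Since rand(0...100) returns an integer from 0 to 99, 25 of the 100 possible values fail the expectation, so the observed rate should hover around 0.25.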
Then, run it with parallel_tests.
RAILS_ENV=test bundle exec parallel_test -e "bundle exec rspec flaky_spec.rb:4"
That’s a lot to remember, so I made a simple script that I put in my PATH (under the name parallel).
#!/bin/bash
# Join all arguments into one command string for parallel_test to execute.
command="bundle exec rspec $*"
RAILS_ENV=test bundle exec parallel_test -e "$command"
Now when I want to run several repetitions of a flaky test in parallel, this is all I have to do:
parallel flaky_spec.rb:4
Pretty easy to remember, and much shorter to type! One word of caution: this doesn’t exit cleanly when interrupted. If your test is driving a browser and you don’t let the process complete normally, you could be left with several browser processes still running that you’ll need to go clean up manually. Other than that, it’s a big win for debugging those pesky flaky tests.
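If you would rather keep the wrapper in Ruby, here is a minimal sketch of the same idea. It is not the script I use: rspec_command is a hypothetical helper, and the sketch assumes parallel_test passes the -e string to a shell, which is why each argument is escaped with Shellwords. It does nothing to improve the interrupt behavior described above.

```ruby
#!/usr/bin/env ruby
require "shellwords"

# Hypothetical helper: builds the rspec command string, shell-escaping each
# argument so that paths containing spaces survive being passed to -e.
def rspec_command(args)
  Shellwords.join(["bundle", "exec", "rspec", *args])
end

# Only launch when arguments were given; exec replaces the current process
# with parallel_test, with RAILS_ENV=test set in its environment.
unless ARGV.empty?
  exec({ "RAILS_ENV" => "test" },
       "bundle", "exec", "parallel_test", "-e", rspec_command(ARGV))
end
```

Saved under the same name and made executable, it would be called the same way as the bash version, with the spec path and line as its arguments.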
Written by Kyle Smith
