The one about linting in a legacy Ruby project
Previously, I mentioned that I recently started working on a new project. New for me, of course.
The platform of this American fintech consists of many applications developed in multiple programming languages, but mainly in Ruby. There are several development teams, each with governance over different parts of the project. As you can imagine, over more than a decade literally several hundred developers have worked on the project and each and every one of us has had our say and preferences when it comes to writing code.
There is no defined standard either at the company level or within each team. It is true that there are certain applications that are currently being developed in different teams that very vaguely we try to take as a reference for new developments, mainly to adopt good practices.
It is relevant to mention that my team is relatively new, formed just this year, and that the code in each of the applications over which we have governance is quite different from each other. Generally, they are legacy applications, where adopting a new approach to doing anything is not that simple and even generates panic among people who have been in the company the longest. Luckily, the team has recently created a payouts application where we have more freedom to innovate.
It is precisely in this kind of context where I think that using a linter is the best option.
Whenever starting a new application, my first commits are usually related to the configuration of the testing tool and the linter. However, these applications were not set up correctly from the get-go, so adding a linter definitely will bring a number of challenges.
The first time I talked to the rest of the team about the possibility of adding a linter to our applications to enforce a coding style, the reaction was mostly negative. The main argument was that other teams do not usually use it and that in some of their previous teams within this company they already tried to use it and it did not work. In addition, some of my teammates mentioned that if they had not been using it until now it is because we do not need it and that we should not enforce its use.
The way I see it, those are not solid arguments for not starting to use it, but I did not want to push it any further because I am a newcomer to the team and I knew the time to bring it up again would come.
A few days later, my onboarding buddy included a change in a MR where she added whitespaces after {
and before }
in an existing hash. One of the developers whose opinion is most valued within the company, who is not part of my team, reviewed that code and suggested that to avoid such changes in the future we could use a linter.
Of course, his opinion was taken into account, especially as it was meant to be introduced in the payouts application, where, as I said earlier, there is much more flexibility with the things we can do.
So for the next task I paired with my onboarding buddy and we added the linter as a previous step to other changes we had to implement. We decided to use RuboCop, as it is the most used linter in the Ruby community and the linter I have been working with for a long time.
In the unlikely scenario that you are not familiar with this tool, the official documentation states the following:
RuboCop is a Ruby static code analyzer (a.k.a. linter) and code formatter. Out of the box it will enforce many of the guidelines outlined in the community Ruby Style Guide. Apart from reporting the problems discovered in your code, RuboCop can also automatically fix many of them for you.
RuboCop is extremely flexible and most aspects of its behavior can be tweaked via various configuration options.
To start, once installed, we added the basic configuration in the .rubocop.yml
file, located in the root of the application:
require:
- rubocop-performance
- rubocop-rspec
AllCops:
SuggestExtensions: false
NewCops: enable
TargetRubyVersion: 3.3
Style/Documentation:
Enabled: false
Note that there is no configuration related to Rails because we usually do not work with that framework.
I added the following make commands to run the linter inside the Docker container where the application runs:
lint:
docker compose exec <service> bundle exec rubocop $(if $(COP),--only $(COP))
lint-safe-fix:
docker compose exec <service> bundle exec rubocop -a $(if $(COP),--only $(COP))
lint-unsafe-fix:
docker compose exec <service> bundle exec rubocop -A $(if $(COP),--only $(COP))
lint-group-offenses:
docker compose exec <service> bundle exec rubocop --format offenses
lint-generate-todos:
docker compose exec <service> bundle exec rubocop --auto-gen-config --auto-gen-only-exclude --no-exclude-limit
Next, we ran the linter to check how many offenses exist in the codebase:
$ make lint
Get a list of all offenses grouped by cop:
$ make lint-group-offenses
Usually most offenses are related to quotes (or string literals):
Style/StringLiterals:
Enabled: true
EnforcedStyle: double_quotes
Here you can choose the option you prefer, single or double quotes, and run the command that automatically solves those offenses:
$ make lint-unsafe-fix COP=Style/StringLiterals
Make sure you have a good test suite before you start applying automatic changes with the linter.
Since payouts application is still small, we checked each type of offense one by one.
The next cops I usually configure are related to conditionals:
Style/IfUnlessModifier:
Enabled: false
Style/NegatedIf:
Enabled: false
Style/NegatedUnless:
Enabled: true
Style/UnlessElse:
Enabled: true
Style/IfUnlessModifierOfIfUnless:
Enabled: true
Style/InvertibleUnlessCondition:
Enabled: true
For more details about that configuration, take a look to this other post.
Of course, this is a personal choice and I expected to open a discussion with my teammates in the MR later on regarding the final configuration.
We then repeated the same process of checking the cop with most offenses, configured it as desired and automatically corrected offenses where possible. Not all cops allow to autocorrect offenses.
For those cops I find any offenses, I like to be explicit even if it is to configure the default option of a cop.
At least initially, I also like to have a single place where those files that contain some offenses are temporarily being ignored. Therefore, I include in the .rubocop.yml
file which files are being excluded for each cop:
Metrics/AbcSize:
Enabled: true
CountRepeatedAttributes: false
Exclude:
- app/commands/initiate_payment.rb
Another option, if you want to be even more fine-grained, is to disable cops inline, only in certain parts of a file:
# rubocop:disable Layout/LineLength, Style/StringLiterals
[...]
# rubocop:enable Layout/LineLength, Style/StringLiterals
However, I agree with GitLab on this issue, especially in legacy applications.
Once all offenses have been solved or explicitly disabled, we ensure that no new offenses will be added to the code if the developer adding new changes remembers to run the linter, because we are not enforcing its use neither on the machine of every developer nor in the CI pipeline.
What is the point of configuring a linter if you do not enforce its use?
Thus, in a follow-up task where I refactored much of the code for the payouts application, I added the configuration for the CI pipeline, knowing that it would raise some questions among my teammates once the MR was reviewed. And so it did.
We met and agreed that those changes would be removed from that MR and we would have a dedicated meeting to talk about it later.
A few weeks later we met again and it was up to me to try to convince my team about why it is necessary to have a linter configured and also to enforce its use, especially in the CI pipeline.
My main selling point was that RuboCop provides an initial configuration based on the Ruby style guide, in addition to other community-relevant style guides such as RSpec or Rails, with corresponding dependencies added.
Other relevant arguments largely match those mentioned by Evil Martians:
- Consistent code style, avoiding discussions based on preferences of an individual.
- Onboarding new engineers becomes much easier when the code style is standardized.
- Linters help detect and squash bugs in a timely fashion.
I must say that there was no outright refusal from the team, but they were reluctant about using a linter.
Their arguments against it include the following points, in addition to some previously mentioned:
- In the payouts application we can use it, not in the rest of applications.
- We should not use a linter in all our applications because not all teams use a linter and we do not know if next year the governance will change.
- Do not refactor any of the code if it is not strictly necessary, for things like the cyclomatic complexity of a method or a class, because it would be difficult to detect possible errors in a MR that includes changes in (literally) hundreds or even thousands of files.
- Enforce the use of the linter on the machine of every developer with a git hook so it is not necessary to include it in the CI pipeline.
- Some people do not care about linters, although if we use it they will adapt.
- Do not use
.rubocop-todo.yml
file for reasons very similar to those shared by Evil Martians. - A teammate suggested we could use Standard as a linter to avoid bikeshedding by having to choose which rules to follow.
Of course, I agree with some of those arguments, so my proposal at all times accommodated their arguments, trying to find a point where we all felt comfortable.
In the case of RuboCop vs Standard, I prefer to invest some time initially and directly have the flexibility provided by RuboCop to decide the rules we are gonna follow, adapting to the application at hand, instead of using Standard that under the hood itself relies on RuboCop and does not allow to overwrite rules. I think that we should adapt the configuration to the application at hand, because not always you can apply the same rules to a greenfield application and to a legacy application. I prefer to start with RuboCop's default configuration and adjust the configuration of each cop as needed.
I suggested that I could create a proof of concept adding the linter to one of our larger applications to gauge the effort needed to add it, see what changes needed to be applied and finally decide together if we wanted to go ahead with it. They agreed.
I added the required gems to the Gemfile
and added the RuboCop configuration we already had for the payouts application.
I started again getting the list of all offenses grouped by cop:
$ make lint-group-offenses
The linter found ~32k offenses. More than half related to quotes.
Then I ran the command that automatically fix all offenses, both the safe and unsafe ones:
$ make lint-unsafe-fix
Remember to run the tests every time you change something with the linter. Having a good test suite is the key to success here.
About 60 tests started to fail. All those errors were related to mutated strings by adding the following line to all files:
# frozen_string_literal: true
In some cases it was enough to change String#gsub!
with String#gsub
. In other cases, the encoding of a string was being changed with String#force_encoding
, so we basically had to choose between creating a new unfrozen object with Object#dup
and setting the magic comment to false
.
Once those changes were made, all tests passed correctly.
However, I only did that to check that it was feasible to start using the linter in that application. What I did next was undo all the changes I had made to start creating a commit for each of the cops with the most offenses. Not all of them, because that would take too long.
Some of the cops with the most offenses were the usual suspects:
Style/StringLiterals
Style/FrozenStringLiteralComment
Layout/DotPosition
Layout/EmptyLineAfterMagicComment
Layout/LineLength
Layout/SpaceAfterComma
- ...
For every one of them I followed the same process I had already followed for the payouts application.
- Check the configuration options for the cop on the official RuboCop documentation.
- Correct the errors automatically for that cop, whenever possible.
- Run the tests.
- Fix any tests that might be broken.
- Create a commit for that cop.
As I said before, fixing one by one all offenses would take too long, so all offenses were securely fixed:
$ make lint-safe-fix
With git diff
I checked all the changes made and if for some reason any of them did not work for me, I would undo all changes and repeat the same process as mentioned before.
After that, all offenses were insucurely fixed and the same process was repeated yet again.
Finally, I generated the file with all the remaining offenses:
$ make lint-generate-todos
Exclude the remaining offenses in the .rubocop.yml
file after running the linter again.
Also I configured the git hook that would run just before pushing to the remote repository of each application.
Create following config/git-hooks/pre-push
file:
#!/bin/sh
echo "Running linter..."
if ! make lint; then
echo "Linter detected issues. Push aborted."
exit 1
fi
echo "Linter passed. Proceeding with push."
exit 0
Add config/setup-git-hooks
file, that will run when the application is built with Docker:
#!/bin/sh
cp config/git-hooks/* .git/hooks/
chmod +x .git/hooks/*
echo "Git hooks successfully installed"
Add execute permission to that file with following command:
$ chmod +x config/setup-git-hooks
Bear in mind that git hooks can be skipped:
$ git push --no-verify
Use that option only if you really have a good reason for it.
And that was the proof of concept. When we met again, I told the team everything I had done, clarified some of the questions they had, and we agreed on the following:
- We are going to start using the linter in all our applications.
- Create a parent task on Jira containing a subtask for every application to which the linter needs to be added.
- Use the existing configuration for the payouts application as starting point and in all other cases we use the default RuboCop configuration, unless we have a strong opinion about any of the cops.
- Do not add the linter to the CI pipeline step because we have the git hook. I still think it would be the right thing to do, though.
- Use the
.rubocop-todo.yml
file to maintain the list of offenses that are being ignored, keeping in mind that it can be regenerated as those offenses are resolved. - We are all committed to resolve the offenses little by little.
That is all I can share for now. Tasks are already created and added to our board, but the hardest part is to find the right time to tackle them.
Nobody said that adding a linter to a legacy application is easy, but I am totally convinced that when we have done it, it will be beneficial for all of us on the team and those yet to come. Hopefully this will be the spark for other teams to think about the possibility of using a linter seriously in all their applications.
In any case, keep in mind that here I am only focusing on a style linter and formatter like RuboCop, but there are other kinds of linters, like Reek or Brakeman, that are complementary to RuboCop and that I also consider to be very useful in any application.
Before finishing, I would like to mention that it is quite handy to view the warnings and errors produced by RuboCop as you type code or save changes in your editor or IDE instead of having to run the checks through the command line every time, so configure any of the integrations available to improve your workflow.
Thank you for reading and see you in the next one!
Subscribe to my newsletter
Read articles from David Montesdeoca directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
David Montesdeoca
David Montesdeoca
I love learning new stuff, especially when it comes to building software. I'm really interested in software architecture, clean code, testing and best practices.