The Story

Make Custom Rules in RuboCop

Posted on September 2, 2021

Hi guys,

My name is Eguchi, and I am currently working as a server engineer at MoneyForward.

I recently took on the challenge of adding RuboCop custom rules to a product under development.

I would like to share some of the key things I learned while working on them.

To make this a good read even for those who know nothing about RuboCop, we'll review the general outline before getting to the point!

So, what is RuboCop?

Every programmer follows his or her own set of standards while writing programs, whether for clarity or efficiency. Beyond those, some rules should be followed in any program, even if nobody states them explicitly:

  • Align indentation
  • Don't leave variables that are never used
  • Don't branch on the same “if” condition more than once

RuboCop is a tool that checks Ruby programs for compliance with rules like these. By running RuboCop's checks whenever you make changes, you can catch some problems in your program early.

This is especially helpful if you are developing a program with a team, because it lightens the workload of reviewers. Some violations can also be corrected automatically, saving you the time and effort of fixing them by hand.

How to use RuboCop

Let's write a simple program that includes variables that are never used.
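
A minimal sketch of such a program could look like this. The file name unused_variable.rb, the method, and the variable names are all made up for illustration. The locals tax_rate and discount are assigned but never read, which is the kind of code RuboCop's Lint/UselessAssignment cop is designed to flag.

```ruby
# unused_variable.rb (hypothetical file name)

# Returns the total price. The two locals below are assigned but never
# read afterwards, so they are "unused variables".
def total_price(price, quantity)
  tax_rate = 0.1 # never used
  discount = 0   # never used
  price * quantity
end

puts total_price(100, 3)
```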

Then pass the file to the rubocop command.

This command analyzes the contents of the file and reports the lines that have issues. Below are the results of an actual run. You can see that the unused variables were detected correctly.

Prototype of Custom Cop 

To create a custom rule (Custom Cop) in RuboCop, you need to create a class that satisfies a certain interface.

  • Create a class that inherits from RuboCop::Cop::Base
  • Define a constant MSG for the error message
  • Implement an on_xxx hook in that class (a method that is called for matching nodes while the parsed code is inspected)
    • Call the add_offense method when a violation is found.

And here is a practical example: 

When the inspected code violates the rule, this class reports a “Not good” message. To implement real rules, you need to understand how RuboCop handles parsing and pattern matching.

When building this prototype, you can also generate a Custom Cop template with the rake task provided by the gem on GitHub called [rubocop-extension-generator]. I did not use it here, because I think it produces a general-purpose RuboCop plugin, which is more than this article needs.

Parsing in RuboCop

RuboCop uses a gem called parser, a library that reads Ruby scripts and generates an abstract syntax tree (AST). You can try out the parser's executable command, ruby-parse, as follows.

This means that a Ruby script consisting of a single value is represented by a single node, (int 1), in the abstract syntax tree. The parser writes each AST node in parentheses. The type of the node comes first, followed by one or more values.

(type value1 value2 ...)

Let's take a look at some examples.

Node types include integers, strings, method calls, variable assignments, variable references, constants, and method definitions. By the way, ruby-parse is not limited to one-liners; it also accepts files. Try giving ruby-parse any Ruby file that you have on hand. You will see that the AST is constructed correctly no matter how complex the program is.

Pattern Matching in RuboCop

RuboCop uses an expression called NodePattern to pattern-match against the AST. It is like a regular expression for ASTs: a regular expression is a string that matches other strings, while a NodePattern is a string that matches an AST.

For example, "send" is one of the most basic NodePatterns. This pattern matches the node of a method call. Let's give it a few programs and see which ones "send" matches.

The pattern "send" does not match an integer literal or a constant, because neither is a method call. On the other hand, "send" matches a call to the length method on a string literal. Similarly, 1 + 1 matches because this program calls the method +.

Like "send", "int" and "const" are among the shortest NodePatterns.

Let's have a look at some more complicated patterns. When the pattern wrapped in parentheses matches the string representation of the AST, it returns true.

Any part of a node that you do not care about can be matched against arbitrary elements by using "...".

The first pattern "(int ...)" matches all integer literals; the second pattern "(send ... :length)" matches a call to the method length, regardless of the receiver. The last example does not match since it does not call the length method.

You can capture parts of the matched node by using $.

The first example returns both the receiver of the method call and the method name; the second returns only the method name; and the last returns only the receiver.

It is important to note that the lower-case "s" in the output here indicates the AST node in its internal representation.

The implementation of Custom Cop

Let's try to make a rule that prohibits [!array.empty?] using the pattern matching we've learned so far. The reason for such a prohibition is that it can be expressed with the shorter code [array.any?]. What NodePattern matches [!array.empty?]? The outermost node is a call to the [!] method, and its receiver must be a call to the [empty?] method. The receiver of [empty?] itself does not matter, so the pattern can be written as (send (send (...) :empty?) :!).

A Custom Cop that uses this pattern to detect violations would look like the following.

[def_node_matcher] takes a method name as its first argument and a NodePattern as its second argument. It then defines a method that determines whether a node matches the pattern. The defined method is used to make the judgment inside on_send.

The constant RESTRICT_ON_SEND is a special array that exists for optimization purposes. It limits on_send to being executed only when a method contained in it is called. Without this limitation, the execution time would increase, because on_send would be called for every method call and the pattern match would be calculated each time. In this case, execution time is reduced because on_send runs only when the outermost method, !, is called.

Save the Custom Cop that you have defined in ./lib/rubocop/cop/style/simplify_not_empty_with_any.rb. Now it is ready. To check it, create a test file test.rb that intentionally violates the Custom Cop. Its contents are not particularly meaningful, but we have to make sure there are no violations other than the Custom Cop's.
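
A test.rb along the following lines should be enough. This is a sketch: the variable names are made up, and I have not verified that no other cop complains under your configuration.

```ruby
# frozen_string_literal: true

# test.rb: intentionally violates Style/SimplifyNotEmptyWithAny.
array = [1, 2, 3]
has_items = !array.empty?
puts has_items
```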

Then, execute the following command.

rubocop test.rb --require ./lib/rubocop/cop/style/simplify_not_empty_with_any.rb

Below are the results. We can see that the custom rule Style/SimplifyNotEmptyWithAny has been inspected and the violations have been found.

Since it is troublesome to write --require every time, list the cop file under the require key in the configuration file .rubocop.yml.

After that, simply running the rubocop command without options will run the Custom Cop every time.

Making the Custom Cop Compatible with Auto-correct

Some of the built-in cops have an auto-correct function that can automatically correct violations when an argument is supplied to the rubocop command.

Let's make the Style/SimplifyNotEmptyWithAny cop that we defined earlier compatible with auto-correct. This takes two modifications to the Custom Cop.

  • Extend the RuboCop::Cop::AutoCorrector module
  • Fix the source code of the violation by giving a block to the add_offense method

Use $(...) in the NodePattern to capture the receiver and assign it to a variable. Then, in the add_offense block, use that receiver to rewrite the source code from [!array.empty?] to [array.any?]. The rewrite goes through [Parser::Source::TreeRewriter], because source code replacement must be done on the syntax tree.

Mostly, we use the following methods.

Note that a node argument is an AST node, while a content argument is a string of Ruby source code. The following is an example of its use.

Let's run the rubocop command as in the previous section.

The error changes: the label [Correctable] is added, and a note that the offense is auto-correctable is appended to the end of the message. Next, run the rubocop --auto-correct command.

The automatic correction worked correctly. The contents of the file are also properly replaced.

Conclusion

We have covered how to create a Custom Cop in RuboCop, how to apply it, and how to add an automatic correction function to it. If you have made it this far, you may even be able to open a pull request to get your Custom Cop merged into RuboCop itself. For more detailed information on how to write tests for a Custom Cop and how to package it as a gem, please read the RuboCop documentation.

Translator: Michael