rubocop-performance
rubocop-performance copied to clipboard
Detect regexes where a string could be used in .split
Is your feature request related to a problem? Please describe.
I've noticed that sometimes regexes are used when a simple string would work fine. The code below could / should be simplified to be just a string
so.split(/:/)
Describe the solution you'd like
Autocorrect to
so.split(':')
This is more clear to anyone reading the code and it's 3x faster:
Warming up --------------------------------------
Regex Split 94.649k i/100ms
String Split 290.261k i/100ms
Calculating -------------------------------------
Regex Split 939.299k (± 2.0%) i/s - 4.732M in 5.040237s
String Split 2.877M (± 1.7%) i/s - 14.513M in 5.046006s
Comparison:
String Split: 2877000.8 i/s
Regex Split: 939298.7 i/s - 3.06x (± 0.00) slower
Describe alternatives you've considered
Not this time
I think we have handled some code similar to this in the Performance cops. Specifically, I think Performance/StringReplacement
does some basic checks to see if we can swap regex for a string. I vaguely remember looking into something like this a while back. I think one of the concerns was that the performance of regex vs string changed based on the Ruby version.
@rrosenblum Thanks for the suggestion. I will transfer this issue to rubocop-performance because https://github.com/rubocop-hq/rubocop-performance/pull/190 has been opened.
I'm not seeing it:
user system total real
String#split(//) 0.765812 0.000649 0.766461 ( 0.766922)
String#split('') 0.809555 0.000397 0.809952 ( 0.810363)
require "benchmark"
MAX = 2000000
Benchmark.bm(30) do |x|
x.report("String#split(//)"){
string = "hello"
MAX.times{ string.split(//) }
}
x.report("String#split('')"){
string = "hello"
MAX.times{ string.split('') }
}
end
@djberg96 what ruby release are you running. Here are my results on 2.7.4 from your benchmark snippet:
user system total real
String#split(//) 11.726928 0.100159 11.827087 ( 11.910536)
String#split('') 2.162106 0.020655 2.182761 ( 2.202049)
@tas50 That was 3.0.1.
I just tried 2.7.2 and 2.6.6 as well. The 2.7.x results were similar to yours, whereas the 2.6.x results showed both much slower, but the regex was slightly faster.
We've known for a while that the performance measurements for some of the cops can change across Ruby versions. I don't think we document the performance benefits based on Ruby version. Maybe one solution to this that we start documenting performance across Ruby versions, and we start to use that information for targeting whether a cop is enabled or not.
benchmark-ips is the recommended tool for generating performance reports for snippets of code
Many of the cops are based on snippets from fast-ruby. Unfortunately, this project doesn't appear to be well maintained anymore. At one point in time, they did have a build that would show off performance for different versions of Ruby