Speed up checks that rely on string parsing by not building intermediate binaries
This speeds up the replace_with_spaces functions in the modules that do string parsing to remove certain types of code. On our 3500 file codebase at Jump, any check that relies on these functions is about an order of magnitude slower than checks that operate solely on the AST (they take seconds rather than hundreds of milliseconds), so improving the speed of this string parsing provides a good return.
All told, on our codebase these changes speed up the SpaceInParentheses check by 1.032 seconds on average (27% improvement; 95% confidence interval .963-1.102 sec) and SpaceAfterCommas check by 1.128 ms on average (39.5% improvement; 95% confidence interval 1.082-1.174 sec).
Raw data on 20 runs comparing the changed version to master: comparison.csv
I am in favor of this and have no problems with the slight change in handling heredocs.
One thing that worries me: I tested this on a couple of small projects (~300 files) and it gets around 10% slower with these changes.
I am wondering if this is due to my setup or if we have an explanation for this? Maybe I can pin point which checks are getting slower.
@rrrene Interesting! Here's the patch I've been using the profile the runtimes of individual checks. I wonder if this is a pessimization on a check I'm not using. 🤔
This looks good. I will merge it before the "x-mas release" 👍