credo icon indicating copy to clipboard operation
credo copied to clipboard

Speed up checks that rely on string parsing by not building intermediate binaries

Open s3cur3 opened this issue 2 months ago • 2 comments

This speeds up the replace_with_spaces functions in the modules that do string parsing to remove certain types of code. On our 3500 file codebase at Jump, any check that relies on these functions is about an order of magnitude slower than checks that operate solely on the AST (they take seconds rather than hundreds of milliseconds), so improving the speed of this string parsing provides a good return.

All told, on our codebase these changes speed up the SpaceInParentheses check by 1.032 seconds on average (27% improvement; 95% confidence interval .963-1.102 sec) and SpaceAfterCommas check by 1.128 ms on average (39.5% improvement; 95% confidence interval 1.082-1.174 sec).

Raw data on 20 runs comparing the changed version to master: comparison.csv

s3cur3 avatar Dec 06 '25 14:12 s3cur3

I am in favor of this and have no problems with the slight change in handling heredocs.

One thing that worries me: I tested this on a couple of small projects (~300 files) and it gets around 10% slower with these changes.

I am wondering if this is due to my setup or if we have an explanation for this? Maybe I can pin point which checks are getting slower.

rrrene avatar Dec 09 '25 19:12 rrrene

@rrrene Interesting! Here's the patch I've been using the profile the runtimes of individual checks. I wonder if this is a pessimization on a check I'm not using. 🤔

s3cur3 avatar Dec 09 '25 22:12 s3cur3

This looks good. I will merge it before the "x-mas release" 👍

rrrene avatar Dec 15 '25 19:12 rrrene