Performance Ideas
Just random thoughts about how to get more performance out of scc, especially since tokei is now faster :(
One thought I had was to do some hand-rolled state machines for the common languages. The core loop is designed around dealing with any language. This means, however, that it is very generic, which usually means it's doing more than it needs to.
It would be interesting to try building either a state machine based on the JSON logic OR a hand-rolled, hard-coded state machine for a few of the more common languages. Java might be a good place to start because it's a fairly easy language to deal with and there is a nice performance test in place to evaluate how this works.
For the common languages this shouldn't be too hard, and it can always fall back to the generic logic if required.
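A rough sketch of what a hard-coded state machine with a generic fallback might look like for Java. Everything named here (`Counts`, `countJava`, `fastPaths`, `count`) is hypothetical and not scc's actual API. It also makes simplifying assumptions: char literals and text blocks are ignored, strings are assumed to end at the newline, and blank lines inside a block comment are counted as comments.

```go
package main

import "fmt"

// Counts is a hypothetical result type, not scc's own struct.
type Counts struct {
	Lines, Code, Comment, Blank int
}

// States of the hand-rolled machine for a C-style language.
const (
	stateNormal = iota
	stateLineComment
	stateBlockComment
	stateString
)

// countJava classifies each line as code, comment or blank in a single pass.
func countJava(src []byte) Counts {
	var c Counts
	state := stateNormal
	sawCode, sawComment := false, false

	// endLine records the finished line and resets the per-line flags.
	endLine := func() {
		c.Lines++
		switch {
		case sawCode:
			c.Code++
		case sawComment:
			c.Comment++
		default:
			c.Blank++
		}
		sawCode = false
		// Assumption: a line that starts inside a block comment is a comment.
		sawComment = state == stateBlockComment
		if state != stateBlockComment {
			state = stateNormal
		}
	}

	for i := 0; i < len(src); i++ {
		b := src[i]
		if b == '\n' {
			endLine()
			continue
		}
		next := i+1 < len(src)
		switch state {
		case stateNormal:
			switch {
			case b == ' ' || b == '\t' || b == '\r':
				// Whitespace never changes how the line is classified.
			case b == '/' && next && src[i+1] == '/':
				state, sawComment = stateLineComment, true
				i++
			case b == '/' && next && src[i+1] == '*':
				state, sawComment = stateBlockComment, true
				i++
			case b == '"':
				state, sawCode = stateString, true
			default:
				sawCode = true
			}
		case stateLineComment:
			// Nothing to do until the newline.
		case stateBlockComment:
			if b == '*' && next && src[i+1] == '/' {
				state = stateNormal
				i++
			}
		case stateString:
			if b == '\\' && next && src[i+1] != '\n' {
				i++ // skip the escaped byte
			} else if b == '"' {
				state = stateNormal
			}
		}
	}
	// Count a final line that has no trailing newline.
	if len(src) > 0 && src[len(src)-1] != '\n' {
		endLine()
	}
	return c
}

// fastPaths maps a language name to its hard-coded counter (hypothetical).
var fastPaths = map[string]func([]byte) Counts{"Java": countJava}

// count uses the fast path when one exists and the generic logic otherwise.
func count(language string, src []byte, generic func([]byte) Counts) Counts {
	if fast, ok := fastPaths[language]; ok {
		return fast(src)
	}
	return generic(src)
}

func main() {
	src := []byte("package a;\n\n// note\nint x = 1; /* tail */\n")
	generic := func([]byte) Counts { return Counts{} } // stand-in for the generic loop
	fmt.Printf("%+v\n", count("Java", src, generic))
}
```

The point of the lookup table is that adding a language is opt-in: anything without an entry goes through the existing generic path unchanged.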
Playing around with this for just the most basic case: languages with no complexity, comments, etc.
$ hyperfine -m 20 './scc' 'scc'
Benchmark #1: ./scc
Time (mean ± σ): 600.7 ms ± 53.9 ms [User: 2.143 s, System: 2.415 s]
Range (min … max): 516.3 ms … 785.3 ms 20 runs
Benchmark #2: scc
Time (mean ± σ): 674.6 ms ± 62.1 ms [User: 2.810 s, System: 2.545 s]
Range (min … max): 599.5 ms … 922.6 ms 20 runs
Summary
'./scc' ran
1.12 ± 0.14 times faster than 'scc'
Pretty good gains for about 30 minutes of work. I'll have to try adding language-specific ones to determine how well they work, but this seems like the logical place to get that next level of performance.
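For reference, the basic case (a language with no comments and no complexity to track) reduces to telling blank lines from non-blank ones. A minimal sketch of that idea follows. It is not the code from the pull request, and `SimpleCounts` and `countSimple` are hypothetical names.

```go
package main

import "fmt"

// SimpleCounts is a hypothetical result type for languages without comments.
type SimpleCounts struct {
	Lines, Code, Blank int
}

// countSimple counts lines in one pass; any line with a non-whitespace byte is code.
func countSimple(src []byte) SimpleCounts {
	var c SimpleCounts
	hasContent := false

	// endLine records the finished line and resets the flag.
	endLine := func() {
		c.Lines++
		if hasContent {
			c.Code++
		} else {
			c.Blank++
		}
		hasContent = false
	}

	for _, b := range src {
		switch b {
		case '\n':
			endLine()
		case ' ', '\t', '\r':
			// Whitespace alone does not make a line count as code.
		default:
			hasContent = true
		}
	}
	// Count a final line that has no trailing newline.
	if len(src) > 0 && src[len(src)-1] != '\n' {
		endLine()
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", countSimple([]byte("{\n  \"a\": 1\n\n}")))
}
```

With no string, comment or complexity checks in the loop, each byte costs a single switch, which is where a specialised path would be expected to save time over the generic loop.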
https://github.com/boyter/scc/pull/232
Other thoughts https://github.com/boyter/scc/issues/245