commons-lang icon indicating copy to clipboard operation
commons-lang copied to clipboard

Explicitly checks string for emptiness to prevent the exception logic overhead

Open fenik17 opened this issue 2 years ago • 7 comments

Adding the fail fast checks for empty string into NumberUtils methods (like toInt, toLong, etc.) to prevent a big overhead due to generating exception.

Checking for an empty string is cheap, and a case with an empty string is quite likely.

fenik17 avatar Jan 09 '23 22:01 fenik17

How much is the overhead and how did you measure it?

garydgregory avatar Jan 10 '23 00:01 garydgregory

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.03%. Comparing base (ac2062d) to head (76587b3). Report is 942 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1015   +/-   ##
=========================================
  Coverage     92.03%   92.03%           
  Complexity     7429     7429           
=========================================
  Files           193      193           
  Lines         15672    15672           
  Branches       2899     2899           
=========================================
  Hits          14423    14423           
  Misses          676      676           
  Partials        573      573           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jan 10 '23 07:01 codecov-commenter

@garydgregory it is minimum 1000 times difference. Benchmark here.

Results on my laptop:

Benchmark                            Mode  Cnt     Score    Error  Units
ToIntBenchmark.toIntNew_emptyString  avgt    5     0,382 ±  0,016  ns/op
ToIntBenchmark.toIntNew_notNumber    avgt    5  1449,161 ± 17,574  ns/op
ToIntBenchmark.toIntNew_number       avgt    5     1,682 ±  0,021  ns/op
ToIntBenchmark.toIntOld_emptyString  avgt    5  1456,267 ± 19,870  ns/op
ToIntBenchmark.toIntOld_notNumber    avgt    5  1443,685 ± 15,124  ns/op
ToIntBenchmark.toIntOld_number       avgt    5     1,683 ±  0,012  ns/op

Exceptions are very expensive.

fenik17 avatar Jan 10 '23 10:01 fenik17

So... there is one case out of three where the PR helps and the others two make things possibly worse due to the error margins?

garydgregory avatar Jan 10 '23 10:01 garydgregory

@garydgregory if an empty string is coming (it is often case in my practice), this change can drastically improve performance. In other cases degradation is not noticeable.

fenik17 avatar Jan 10 '23 11:01 fenik17

That's not what your benchmark shows...

garydgregory avatar Jan 10 '23 12:01 garydgregory

@garydgregory could you explain, please? I see the difference is within the margin of error:

Benchmark                            Mode  Cnt     Score    Error  Units
ToIntBenchmark.toIntNew_number       avgt    5     1,682 ±  0,021  ns/op
ToIntBenchmark.toIntOld_number       avgt    5     1,683 ±  0,012  ns/op

I understand that additional check can worth some CPU time. But in this case, it's literally fractions of nanoseconds. And this is incommensurable with the possible profit.

fenik17 avatar Jan 10 '23 12:01 fenik17

Closing: Edge case not worth the clutter. Code is simpler in master now anyway, no need to check for null since the catch will handle it.

garydgregory avatar Aug 20 '24 16:08 garydgregory