commons-lang Explicitly checks string for emptiness to prevent the exception logic overhead

Adding the fail fast checks for empty string into NumberUtils methods (like toInt, toLong, etc.) to prevent a big overhead due to generating exception.

Checking for an empty string is cheap, and a case with an empty string is quite likely.

Jan 09 '23 22:01 fenik17

How much is the overhead and how did you measure it?

Jan 10 '23 00:01 garydgregory

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.03%. Comparing base (ac2062d) to head (76587b3). Report is 942 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##             master    #1015   +/-   ##
=========================================
  Coverage     92.03%   92.03%           
  Complexity     7429     7429           
=========================================
  Files           193      193           
  Lines         15672    15672           
  Branches       2899     2899           
=========================================
  Hits          14423    14423           
  Misses          676      676           
  Partials        573      573

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Jan 10 '23 07:01 codecov-commenter

@garydgregory it is minimum 1000 times difference. Benchmark here.

Results on my laptop:

Benchmark                            Mode  Cnt     Score    Error  Units
ToIntBenchmark.toIntNew_emptyString  avgt    5     0,382 ±  0,016  ns/op
ToIntBenchmark.toIntNew_notNumber    avgt    5  1449,161 ± 17,574  ns/op
ToIntBenchmark.toIntNew_number       avgt    5     1,682 ±  0,021  ns/op
ToIntBenchmark.toIntOld_emptyString  avgt    5  1456,267 ± 19,870  ns/op
ToIntBenchmark.toIntOld_notNumber    avgt    5  1443,685 ± 15,124  ns/op
ToIntBenchmark.toIntOld_number       avgt    5     1,683 ±  0,012  ns/op

Exceptions are very expensive.

Jan 10 '23 10:01 fenik17

So... there is one case out of three where the PR helps and the others two make things possibly worse due to the error margins?

Jan 10 '23 10:01 garydgregory

@garydgregory if an empty string is coming (it is often case in my practice), this change can drastically improve performance. In other cases degradation is not noticeable.

Jan 10 '23 11:01 fenik17

That's not what your benchmark shows...

Jan 10 '23 12:01 garydgregory

@garydgregory could you explain, please? I see the difference is within the margin of error:

Benchmark                            Mode  Cnt     Score    Error  Units
ToIntBenchmark.toIntNew_number       avgt    5     1,682 ±  0,021  ns/op
ToIntBenchmark.toIntOld_number       avgt    5     1,683 ±  0,012  ns/op

I understand that additional check can worth some CPU time. But in this case, it's literally fractions of nanoseconds. And this is incommensurable with the possible profit.

Jan 10 '23 12:01 fenik17

Closing: Edge case not worth the clutter. Code is simpler in master now anyway, no need to check for null since the catch will handle it.

Aug 20 '24 16:08 garydgregory

commons-lang commons-lang copied to clipboard

Explicitly checks string for emptiness to prevent the exception logic overhead

Codecov Report

commons-lang
commons-lang copied to clipboard