[SPARK-40465][SQL] Refactor Decimal so as we can use other underlying implementation
What changes were proposed in this pull request?
The background
Currently, Spark Decimal use the java.math.BigDecimal as the underlying implementation.
As we know, the computing and storage efficiency of Spark Decimal is very low.
This PR let Decimal can supports other underlying implementation.
The design
The refactoring design of Spark Decimal is mainly based on the proxy mode - that is, to create the abstraction class DecimalOperation, to lower the current implementation of Spark Decimal based on java.math.BigDecimal to JDKDecimalOperation. The API of Spark Decimal remains unchanged and proxies the corresponding interface methods of DecimalOperation. The whole design is shown in the figure below.

Whether Spark Decimal will proxy JDKDecimalOperation or other subclasses of DecimalOperation depends on the configuration parameter spark.sql.decimal.implementation. The default value of spark.sql.decimal.implementation is JDKBigDecimal, that is, Decimal will proxy JDKDecimalOperation. spark.sql.decimal.implementation can also be set as other possible optimization implementations.
Benchmark Before this refactor, the decimal benchmark show below.
[info] running (fork) org.apache.spark.sql.DecimalBenchmark
[info] Running benchmark: Decimal For bigDecimal and smallDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 30342 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 33470 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 36113 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 62917 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 28986 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For bigDecimal and smallDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 15105 15171 94 8.9 112.5 1.0X
[info] Operator - 16134 16735 850 8.3 120.2 0.9X
[info] Operator * 17650 18057 576 7.6 131.5 0.9X
[info] Operator / 30421 31459 1468 4.4 226.7 0.5X
[info] Operator % 14342 14493 213 9.4 106.9 1.1X
[info] Running benchmark: Decimal For smallDecimal and bigDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 36101 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 36972 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 29374 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 86810 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 153669 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For smallDecimal and bigDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 17120 18051 1316 7.8 127.6 1.0X
[info] Operator - 17716 18486 1089 7.6 132.0 1.0X
[info] Operator * 14505 14687 258 9.3 108.1 1.2X
[info] Operator / 42314 43405 1543 3.2 315.3 0.4X
[info] Operator % 75702 76835 1601 1.8 564.0 0.2X
[info] Running benchmark: Decimal For smallDecimal and smallDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 5794 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 5822 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 31043 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 66493 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 56967 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For smallDecimal and smallDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] -------------------------------------------------------------------------------------------------------------------------
[info] Operator + 2837 2897 84 47.3 21.1 1.0X
[info] Operator - 2897 2911 21 46.3 21.6 1.0X
[info] Operator * 14574 15522 1340 9.2 108.6 0.2X
[info] Operator / 32050 33247 1693 4.2 238.8 0.1X
[info] Operator % 28359 28484 177 4.7 211.3 0.1X
[info] Running benchmark: Decimal For bigDecimal and bigDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 4692 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 4850 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 31087 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 64451 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 58293 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For bigDecimal and bigDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 2332 2346 20 57.6 17.4 1.0X
[info] Operator - 2384 2425 59 56.3 17.8 1.0X
[info] Operator * 15188 15544 503 8.8 113.2 0.2X
[info] Operator / 31247 32226 1384 4.3 232.8 0.1X
[info] Operator % 28878 29147 380 4.6 215.2 0.1X
[success] Total time: 1369 s (22:49), completed 2022-9-21 19:42:19
After this refactor, the decimal benchmark show below.
[info] running (fork) org.apache.spark.sql.DecimalBenchmark
[info] Running benchmark: Decimal For bigDecimal and smallDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 34143 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 33796 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 33862 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 65502 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 31809 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For bigDecimal and smallDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 17040 17072 44 7.9 127.0 1.0X
[info] Operator - 16891 16898 10 7.9 125.8 1.0X
[info] Operator * 16884 16931 66 7.9 125.8 1.0X
[info] Operator / 32673 32751 111 4.1 243.4 0.5X
[info] Operator % 15901 15905 5 8.4 118.5 1.1X
[info] Running benchmark: Decimal For smallDecimal and bigDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 36061 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 40781 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 39561 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 92115 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 148947 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For smallDecimal and bigDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 17977 18031 76 7.5 133.9 1.0X
[info] Operator - 19280 20391 1571 7.0 143.6 0.9X
[info] Operator * 19406 19781 530 6.9 144.6 0.9X
[info] Operator / 43565 46058 NaN 3.1 324.6 0.4X
[info] Operator % 74361 74474 160 1.8 554.0 0.2X
[info] Running benchmark: Decimal For smallDecimal and smallDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 8212 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 6694 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 31335 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 69401 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 61130 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For smallDecimal and smallDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] -------------------------------------------------------------------------------------------------------------------------
[info] Operator + 4101 4106 7 32.7 30.6 1.0X
[info] Operator - 3311 3347 52 40.5 24.7 1.2X
[info] Operator * 15608 15668 84 8.6 116.3 0.3X
[info] Operator / 34685 34701 23 3.9 258.4 0.1X
[info] Operator % 30446 30565 168 4.4 226.8 0.1X
[info] Running benchmark: Decimal For bigDecimal and bigDecimal
[info] Running case: Operator +
[info] Stopped after 2 iterations, 8157 ms
[info] Running case: Operator -
[info] Stopped after 2 iterations, 7015 ms
[info] Running case: Operator *
[info] Stopped after 2 iterations, 31871 ms
[info] Running case: Operator /
[info] Stopped after 2 iterations, 70469 ms
[info] Running case: Operator %
[info] Stopped after 2 iterations, 66660 ms
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_311-b11 on Mac OS X 10.16
[info] Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
[info] Decimal For bigDecimal and bigDecimal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Operator + 4067 4079 16 33.0 30.3 1.0X
[info] Operator - 3381 3508 180 39.7 25.2 1.2X
[info] Operator * 15906 15936 42 8.4 118.5 0.3X
[info] Operator / 35206 35235 41 3.8 262.3 0.1X
[info] Operator % 32764 33330 801 4.1 244.1 0.1X
[success] Total time: 1446 s (24:06), completed 2022-9-21 19:16:59
We can see the proxy mode have the overhead to create DecimalOperation instances.
Why are the changes needed?
Refactor Decimal so as we can use other underlying implementation.
Does this PR introduce any user-facing change?
'No'.
The interface of Decimal is not changed.
How was this patch tested?
N/A
@beliefer thanks for bring this up!
cc @cloud-fan would you pelease review this PR.
This PR can ensure that the existing underlying implementation of Decimal will not be changed, and it also provides an interface for us to develop new implementations through iteration. This refactoring helps smooth migration of new functions
The abstraction makes sense to me. The new DecimalOperation layer adds an indirection and may hurt performance, though I think the overhead is minor as Java decimal operation is already slow. Can we add a microbenchmark for Decimal? You can follow HashBenchmark.
The abstraction makes sense to me. The new
DecimalOperationlayer adds an indirection and may hurt performance, though I think the overhead is minor as Java decimal operation is already slow. Can we add a microbenchmark forDecimal? You can followHashBenchmark.
I followed HashBenchmark and create a new benchmark DecimalBenchmark. Also, I add the output of benchmark into PR description.
ping @cloud-fan would you pelease review this PR.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!