velox icon indicating copy to clipboard operation
velox copied to clipboard

Support decimal operation precision not loss mode

Open jinchengchenghh opened this issue 1 year ago • 15 comments

Each of the decimal operation functions is registered as two functions such as add_deny_precision_loss and add. When allowing precision loss, establishing the result type of an arithmetic operation happens according to Hive behavior and SQL ANSI 2011 specification, i.e. rounding the decimal part of the result if an exact representation is not possible. Otherwise, NULL is returned in those cases, as previously. When not allowing precision loss, not rounding the decimal part.

For example,

  adding decimal(38, 7) and decimal(10, 0) result type adding 1.1232154 and 1
allow precision loss decimal(38, 6) 2.123215
deny precision loss decimal(38, 7) 2.1232154

Spark implementation: https://github.com/apache/spark/blob/branch-3.5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala#L814

jinchengchenghh avatar Jul 03 '24 02:07 jinchengchenghh

Deploy Preview for meta-velox canceled.

Name Link
Latest commit 412e0041377b01950b74b8f9d0fd2313169b176e
Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67315cd274896d0008840ec8

netlify[bot] avatar Jul 03 '24 02:07 netlify[bot]

Can you help review this PR? Thanks! @rui-mo @PHILO-HE

jinchengchenghh avatar Jul 03 '24 06:07 jinchengchenghh

hi @rui-mo @PHILO-HE Can I take your time to help review this PR?

jinchengchenghh avatar Jul 09 '24 01:07 jinchengchenghh

Thanks for your comments, I have address all the comments. @rui-mo

jinchengchenghh avatar Jul 09 '24 03:07 jinchengchenghh

@jinchengchenghh The PR is used by oap/velox, can you follow up?

FelixYBW avatar Jul 25 '24 04:07 FelixYBW

Address all the comments, can you help review again? @rui-mo

jinchengchenghh avatar Jul 31 '24 05:07 jinchengchenghh

Do you have any more comments? @rui-mo

jinchengchenghh avatar Aug 28 '24 05:08 jinchengchenghh

As this discussion https://github.com/facebookincubator/velox/issues/10467, we cannot set it at runtime. @rui-mo

jinchengchenghh avatar Aug 30 '24 02:08 jinchengchenghh

Any more comments? Thanks! @rui-mo

jinchengchenghh avatar Sep 03 '24 08:09 jinchengchenghh

Soft reminder, do you have any more comment? @rui-mo

jinchengchenghh avatar Sep 10 '24 01:09 jinchengchenghh

Talked with @jinchengchenghh offline, and she will help update this PR to support session-level precision loss mode.

rui-mo avatar Sep 10 '24 02:09 rui-mo

Updated, thanks! @rui-mo

jinchengchenghh avatar Sep 13 '24 03:09 jinchengchenghh

Addressed all the comments, could you help review again? Thanks! @rui-mo

jinchengchenghh avatar Oct 14 '24 05:10 jinchengchenghh

cc: @mbasmanova If you have any more comment. Thanks!

rui-mo avatar Oct 23 '24 03:10 rui-mo

Yes, thanks for your kindly review. I forgot to update the PR description when the implement was changed, now it is registered as different names as we discussed in issue https://github.com/facebookincubator/velox/issues/10467#issuecomment-2339486865 Updated and added the example in PR description. @mbasmanova

jinchengchenghh avatar Oct 24 '24 02:10 jinchengchenghh

Do you have further comments? Thanks! @mbasmanova

jinchengchenghh avatar Nov 04 '24 06:11 jinchengchenghh

@jinchengchenghh

Otherwise, NULL is returned in those cases, as previously.

What does this refer to? Isn't current before for an operation to fail? Would you provide an example where it returns NULL?

mbasmanova avatar Nov 05 '24 14:11 mbasmanova

When not allowing precision loss, not rounding the decimal part.

It might be helpful to clarify the tradeoff of using allowing vs. not allowing precision loss. Otherwise, one may wonder why would we ever what to allow precision loss.

I assume some operations success if precision loss is allowed and fail if not. Would be nice to add such examples to PR description and docs.

mbasmanova avatar Nov 05 '24 14:11 mbasmanova

allow-precision-loss name might be confusing because 'precision' in 'allow-precision-loss' is not the same as 'precision' in decimal type. It might be helpful to clarify some more to avoid confusion.

mbasmanova avatar Nov 05 '24 14:11 mbasmanova

@jinchengchenghh

Otherwise, NULL is returned in those cases, as previously.

What does this refer to? Isn't current before for an operation to fail? Would you provide an example where it returns NULL?

Yes, overflow may result in NULL. This description origins from Spark config description. I agree it is more likely return null for some edge cases when not allowing precision loss, I tried before to calculate the edge case, but I could not find an example, I will scan Spark test cases to try again. https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L3373

jinchengchenghh avatar Nov 06 '24 01:11 jinchengchenghh

Added the example and update document, could you help review again? Thanks! @mbasmanova

jinchengchenghh avatar Nov 06 '24 02:11 jinchengchenghh

Updated the document, could you help review again? Thanks! @mbasmanova

jinchengchenghh avatar Nov 07 '24 03:11 jinchengchenghh

@kevinwilfong has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Nov 07 '24 18:11 facebook-github-bot

@jinchengchenghh could you rebase, and update the PR?

There are a lot of broken builds, but I know there was a bad merge yesterday, so it's probably due to that.

kevinwilfong avatar Nov 08 '24 18:11 kevinwilfong

Rebased, and all the tests passed, thanks! @kevinwilfong

jinchengchenghh avatar Nov 11 '24 05:11 jinchengchenghh

@kevinwilfong has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Nov 11 '24 17:11 facebook-github-bot

@kevinwilfong merged this pull request in facebookincubator/velox@3884939853312782e5252084f28cd3f806d705fe.

facebook-github-bot avatar Nov 12 '24 18:11 facebook-github-bot

Conbench analyzed the 1 benchmark run on commit 38849398.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

conbench-facebook[bot] avatar Nov 12 '24 18:11 conbench-facebook[bot]