Support precision>29 in div0

Open ravlio opened this issue 6 months ago • 0 comments

The Problem: Precision Limitations in DIV0 Function

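This issue concerns the DIV0 function in DataFusion (division that returns 0 instead of failing when the divisor is zero), specifically when it is given DECIMAL values with a precision greater than 29. The maximum DECIMAL precision in DataFusion is 38, so precisions 30 through 38 are valid inputs that DIV0 cannot currently handle correctly.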

Currently, DIV0 relies on rust_decimal, which stores its digits in a 96-bit integer and therefore supports at most 29 significant digits (and not every 29-digit value fits). Consequently, any operation involving DECIMAL values with a precision exceeding 29 can produce incorrect results.
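The digit limit follows directly from the width of the mantissa. The snippet below is a minimal, standard-library-only sketch of that arithmetic; it does not use rust_decimal itself, and the helper names `digit_count` and `fits_in_96_bits` are hypothetical, introduced only for illustration.

```rust
/// Largest unsigned value that fits in a 96-bit mantissa.
const MAX_96_BIT: u128 = (1u128 << 96) - 1;

/// Number of decimal digits in `n` (hypothetical helper).
fn digit_count(mut n: u128) -> u32 {
    let mut digits = 1;
    while n >= 10 {
        n /= 10;
        digits += 1;
    }
    digits
}

/// Whether every unsigned integer with `precision` digits fits in 96 bits.
/// Valid for precision 1..=38, since 10^38 still fits in a u128.
fn fits_in_96_bits(precision: u32) -> bool {
    // 10^precision - 1 is the largest value with `precision` digits.
    10u128.pow(precision) - 1 <= MAX_96_BIT
}

fn main() {
    println!("max 96-bit value: {}", MAX_96_BIT);
    println!("digits in that value: {}", digit_count(MAX_96_BIT));
    for precision in [28u32, 29, 30, 38] {
        println!(
            "all {}-digit values fit in 96 bits: {}",
            precision,
            fits_in_96_bits(precision)
        );
    }
}
```

The largest 96-bit value has 29 digits, but the largest 29-digit number is bigger than it, so only precisions up to 28 are fully covered and nothing at 30 or above fits at all.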

Proposed Solution: Fall Back to bigdecimal for High Precision

To address this limitation, the proposed solution is to use bigdecimal as a fallback for DIV0 operations when precision exceeds 29, and to keep rust_decimal for everything else.

Rationale

bigdecimal, an arbitrary-precision decimal crate for Rust, can handle numbers with an effectively unlimited number of digits, so it is not bound by the 29-digit constraint of rust_decimal. It is estimated to be about 10 times slower than rust_decimal, but it is currently the most viable option: implementing manual division logic for arbitrary-precision numbers would be significantly more complex and would take far more development effort.

With this approach, DIV0 handles high-precision DECIMAL division correctly by falling back to a slower but more capable library only when necessary, without rewriting the decimal arithmetic from scratch.
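A minimal sketch of how the backend selection could look, assuming the choice is made from the operand precisions alone. The names `Div0Backend`, `choose_backend`, and the two constants are hypothetical and are not taken from the Embucket or DataFusion code; the actual division on each path is left out.

```rust
/// Highest precision routed to the fast path; 29 is the limit stated in the issue.
const FAST_PATH_MAX_PRECISION: u8 = 29;
/// Maximum DECIMAL precision in DataFusion, as stated in the issue.
const MAX_DECIMAL_PRECISION: u8 = 38;

/// Which library would perform the division (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Div0Backend {
    /// Fast 96-bit path (rust_decimal in the issue).
    Fast,
    /// Slower arbitrary-precision path (bigdecimal in the issue).
    ArbitraryPrecision,
}

/// Pick a backend from the DECIMAL precisions of the two operands.
fn choose_backend(lhs_precision: u8, rhs_precision: u8) -> Result<Div0Backend, String> {
    // Reject precisions outside the valid DECIMAL range.
    for precision in [lhs_precision, rhs_precision] {
        if precision == 0 || precision > MAX_DECIMAL_PRECISION {
            return Err(format!("unsupported DECIMAL precision: {}", precision));
        }
    }
    // The wider operand decides: one high-precision input forces the slow path.
    if lhs_precision.max(rhs_precision) <= FAST_PATH_MAX_PRECISION {
        Ok(Div0Backend::Fast)
    } else {
        Ok(Div0Backend::ArbitraryPrecision)
    }
}

fn main() {
    println!("{:?}", choose_backend(18, 29));
    println!("{:?}", choose_backend(38, 10));
    println!("{:?}", choose_backend(39, 10));
}
```

Because the slow path is taken only for precisions above 29, the common case would keep its current speed. A real implementation may also need to take the precision of the result type into account, not just the operands.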

ravlio avatar Jun 24 '25 13:06 ravlio