velox
velox copied to clipboard
Add truncate(x,n) presto function
Context: of https://github.com/facebookincubator/velox/issues/2262
Presto supports two versions of truncate: truncate(x) and truncate(x,n). The former one has been added per https://github.com/facebookincubator/velox/pull/2862. This issue add adding implementation for the second overload.
Here is the java reference implementation which must be mimicked: https://github.com/prestodb/presto/blob/ff412ef4b745eb31549cc0abe25e885a63875f58/presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java#L365
@gosharz are you going to be working on this Issue as well?
@pedroerp giving it a quick shot here: https://github.com/facebookincubator/velox/pull/2892
Catching up with @mbasmanova: it looks we need to have a closer look at the ways to properly mimick the Java implementation. Main reasons:
- Java implementation relies heavily on BigDecimal support. We need to research c++ alternatives for BigDecimal.
- Naive implementation with using std::pow and std::trunc most probably will introduce a lot of mismatches due to rounding errors and double time limits
Adding references we've collected so far on the 1st point:
- BigDecimal alternative in C++ - https://stackoverflow.com/questions/4798777/is-there-a-c-equivalent-to-javas-bigdecimal
- The GNU Multiple Precision Arithmetic Library - https://gmplib.org/
- The GNU MPFR Library - https://www.mpfr.org/
- Boost.multiprecision- https://www.boost.org/doc/libs/1_72_0/libs/multiprecision/doc/html/index.html
- double-conversion - https://github.com/google/double-conversion
- Postrgres sql trunc implementation - https://github.com/postgres/postgres/blob/40c7fcbbed5d922e905f8032c5035826d0406980/src/backend/utils/adt/numeric.c#L11068
- mysql - https://github.com/mysql/mysql-server/blob/a246bad76b9271cb4333634e954040a970222e0a/sql/item_func.cc#L3469
From https://prestodb.io/docs/current/language/types.html#decimal:
Fixed-Precision[#](https://prestodb.io/docs/current/language/types.html#fixed-precision)
DECIMAL[#](https://prestodb.io/docs/current/language/types.html#decimal)
A fixed precision decimal number. Precision up to 38 digits is supported but performance is best up to 18 digits.
The decimal type takes two literal parameters:
precision - total number of digits
scale - number of digits in fractional part. Scale is optional and defaults to 0.
Example type definitions: DECIMAL(10,3), DECIMAL(20)
Example literals: DECIMAL '10.3', DECIMAL '1234567890', 1.1
Note
For compatibility reasons decimal literals without explicit type specifier (e.g. 1.2) are treated as values of the DOUBLE type by default up to version 0.198. After 0.198 they are parsed as DECIMAL.
System wide property: parse-decimal-literals-as-double
Session wide property: parse_decimal_literals_as_double
Wondering if using one of the above libraries with maximal precision to match type boundaries will guarantee full parity.
From https://en.wikipedia.org/wiki/Floating-point_arithmetic:
| Type | Sign | Exponent | Significand field | Total bits | Exponent bias | Bits precision | Number of decimal digits | |
|---|---|---|---|---|---|---|---|---|
| Half (IEEE 754-2008) | 1 | 5 | 10 | 16 | 15 | 11 | ~3.3 | |
| Single | 1 | 8 | 23 | 32 | 127 | 24 | ~7.2 | |
| Double | 1 | 11 | 52 | 64 | 1023 | 53 | ~15.9 | |
| x86 extended precision | 1 | 15 | 64 | 80 | 16383 | 64 | ~19.2 | |
| Quad | 1 | 15 | 112 | 128 | 16383 | 113 | ~34.0 |
Type Sign Exponent Significand field Total bits Exponent bias Bits precision Number of decimal digits
Half (IEEE 754-2008) 1 5 10 16 15 11 ~3.3
Single 1 8 23 32 127 24 ~7.2
Double 1 11 52 64 1023 53 ~15.9
x86 extended precision 1 15 64 80 16383 64 ~19.2
Quad 1 15 112 128 16383 113 ~34.0
Per https://github.com/facebookincubator/velox/pull/2892