velox icon indicating copy to clipboard operation
velox copied to clipboard

Add truncate(x,n) presto function

Open gosharz opened this issue 3 years ago • 2 comments
trafficstars

Context: of https://github.com/facebookincubator/velox/issues/2262

Presto supports two versions of truncate: truncate(x) and truncate(x,n). The former one has been added per https://github.com/facebookincubator/velox/pull/2862. This issue add adding implementation for the second overload.

Here is the java reference implementation which must be mimicked: https://github.com/prestodb/presto/blob/ff412ef4b745eb31549cc0abe25e885a63875f58/presto-main/src/main/java/com/facebook/presto/operator/scalar/MathFunctions.java#L365

gosharz avatar Oct 19 '22 14:10 gosharz

@gosharz are you going to be working on this Issue as well?

pedroerp avatar Oct 20 '22 01:10 pedroerp

@pedroerp giving it a quick shot here: https://github.com/facebookincubator/velox/pull/2892

gosharz avatar Oct 20 '22 01:10 gosharz

Catching up with @mbasmanova: it looks we need to have a closer look at the ways to properly mimick the Java implementation. Main reasons:

  1. Java implementation relies heavily on BigDecimal support. We need to research c++ alternatives for BigDecimal.
  2. Naive implementation with using std::pow and std::trunc most probably will introduce a lot of mismatches due to rounding errors and double time limits

Adding references we've collected so far on the 1st point:

  1. BigDecimal alternative in C++ - https://stackoverflow.com/questions/4798777/is-there-a-c-equivalent-to-javas-bigdecimal
  2. The GNU Multiple Precision Arithmetic Library - https://gmplib.org/
  3. The GNU MPFR Library - https://www.mpfr.org/
  4. Boost.multiprecision- https://www.boost.org/doc/libs/1_72_0/libs/multiprecision/doc/html/index.html
  5. double-conversion - https://github.com/google/double-conversion
  6. Postrgres sql trunc implementation - https://github.com/postgres/postgres/blob/40c7fcbbed5d922e905f8032c5035826d0406980/src/backend/utils/adt/numeric.c#L11068
  7. mysql - https://github.com/mysql/mysql-server/blob/a246bad76b9271cb4333634e954040a970222e0a/sql/item_func.cc#L3469

gosharz avatar Nov 01 '22 15:11 gosharz

From https://prestodb.io/docs/current/language/types.html#decimal:

Fixed-Precision[#](https://prestodb.io/docs/current/language/types.html#fixed-precision)
DECIMAL[#](https://prestodb.io/docs/current/language/types.html#decimal)
A fixed precision decimal number. Precision up to 38 digits is supported but performance is best up to 18 digits.

The decimal type takes two literal parameters:

precision - total number of digits

scale - number of digits in fractional part. Scale is optional and defaults to 0.

Example type definitions: DECIMAL(10,3), DECIMAL(20)

Example literals: DECIMAL '10.3', DECIMAL '1234567890', 1.1

Note

For compatibility reasons decimal literals without explicit type specifier (e.g. 1.2) are treated as values of the DOUBLE type by default up to version 0.198. After 0.198 they are parsed as DECIMAL.

System wide property: parse-decimal-literals-as-double

Session wide property: parse_decimal_literals_as_double

Wondering if using one of the above libraries with maximal precision to match type boundaries will guarantee full parity.

gosharz avatar Nov 03 '22 15:11 gosharz

From https://en.wikipedia.org/wiki/Floating-point_arithmetic:

Type Sign Exponent Significand field Total bits   Exponent bias Bits precision Number of decimal digits
Half (IEEE 754-2008) 1 5 10 16   15 11 ~3.3
Single 1 8 23 32   127 24 ~7.2
Double 1 11 52 64   1023 53 ~15.9
x86 extended precision 1 15 64 80   16383 64 ~19.2
Quad 1 15 112 128   16383 113 ~34.0


Type Sign Exponent Significand field Total bits Exponent bias Bits precision Number of decimal digits Half (IEEE 754-2008) 1 5 10 16 15 11 ~3.3 Single 1 8 23 32 127 24 ~7.2 Double 1 11 52 64 1023 53 ~15.9 x86 extended precision 1 15 64 80 16383 64 ~19.2 Quad 1 15 112 128 16383 113 ~34.0

gosharz avatar Nov 03 '22 15:11 gosharz

Per https://github.com/facebookincubator/velox/pull/2892

gosharz avatar Nov 29 '22 16:11 gosharz