Align aggregate return type with Snowflake
statement ok
CREATE OR REPLACE TABLE sum_example(k INT, d DECIMAL(10,5),
s1 VARCHAR(10), s2 VARCHAR(10));
statement ok
INSERT INTO sum_example VALUES
(1, 1.1, '1.1','one'),
(1, 10, '10','ten'),
(2, 2.2, '2.2','two'),
(2, null, null,'null'),
(3, null, null, 'null'),
(null, 9, '9.9','nine');
statement ok
SELECT * FROM sum_example;
query TT
SELECT SUM(d), SUM(CAST(s1 AS DECIMAL(10,5))) FROM sum_example
----
22.30000 23.20000
returns incorrectly formatted numbers:
================================================================================
Expected result:
22.30000 23.20000
================================================================================
Actual result:
22.300000000000000710542735760100185871124267578125 23.199999999999999289457264239899814128875732421875
int is a syntax sugar for number(38, 0) in snowflake and Embucket seems to understand int as DataFusion's int and this is not a correct behavior and same problem with decimal and etc. all numeric data types.
we need a serious data types overhaul #741
Upon investigating the issue more closely it seems @camuel was right this is data type related.
The corollary is we need more tests for logical expressions and function type coercion.
This issue might need an addition of an AnalyzerRule to pin desired behaviour OR figure out it is not possible and we need more deep datafusion change. Logical types perhaps?