datafusion
datafusion copied to clipboard
Incorrect behavior of arithmetic operations between time values
Describe the bug
Consider the following query:
DataFusion CLI v41.0.0
> select ('2024-08-27T08:21:27Z'::timestamp + interval '1 day' - interval '1 second' - interval '1 day');
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Utf8("2024-08-27T08:21:27Z") + IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 1000000000 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2024-08-29T08:21:26 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.
While the expected result should be 2024-08-27 08:21:26.000000
.
When I add parentheses to the query, the results are what I expect:
> select ('2024-08-27T08:21:27Z'::timestamp + interval '1 day' - interval '1 second') - interval '1 day';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Utf8("2024-08-27T08:21:27Z") + IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 1000000000 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2024-08-27T08:21:26 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.
The issue seems to happening when you have multiple (more than 2?) operands in arithmetic. e.g.:
> select ('2024-08-27T08:21:27Z'::timestamp + interval '1 day' - interval '1 day' - interval '2 day');
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Utf8("2024-08-27T08:21:27Z") + IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 2, nanoseconds: 0 }") |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2024-08-29T08:21:27 -- Should be 2024-08-25 08:21:27.000000 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.
When you have two operands, the results seem to match what you'd expect:
> select ('2024-08-27T08:21:27Z'::timestamp + interval '1 day' - interval '2 day');
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Utf8("2024-08-27T08:21:27Z") + IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 1, nanoseconds: 0 }") - IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 2, nanoseconds: 0 }") |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2024-08-26T08:21:27 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.
To Reproduce
You can use the same queries mentioned above.
Expected behavior
Mentioned above as well
Additional context
No response