duckdb-wasm
Serialize INTERVAL to Apache-arrow
What happens?
Hi @Mytherin
When DuckDB serializes an INTERVAL to Arrow, it converts it to a fixed millisecond duration. This is problematic because the serialization assumes a month is 30 days, which is not always true, so serializing an interval that contains a MONTH or YEAR date part is usually inaccurate. The relevant code is in the following places:
- https://github.com/duckdb/duckdb/blob/master/src/common/arrow/arrow_converter.cpp#L79
- https://github.com/duckdb/duckdb/blob/master/src/common/arrow/arrow_appender.cpp#L130
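To make the inaccuracy concrete, here is a minimal Python sketch of the flattening described above. It is not the DuckDB C++ code at the linked lines; the helper names `interval_to_fixed_ms` and `add_months` are hypothetical, and the only behaviour taken from the report is the 30-days-per-month assumption.

```python
import calendar
from datetime import date

MS_PER_DAY = 24 * 60 * 60 * 1000
ASSUMED_DAYS_PER_MONTH = 30  # the fixed assumption described in this report


def interval_to_fixed_ms(months: int, days: int, ms: int = 0) -> int:
    """Flatten an interval to milliseconds, treating every month as 30 days."""
    return (months * ASSUMED_DAYS_PER_MONTH + days) * MS_PER_DAY + ms


def add_months(start: date, months: int) -> date:
    """Calendar-aware month addition; the day is clamped to the target month's length."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# INTERVAL 1 YEAR is stored as 12 months.
flattened_days = interval_to_fixed_ms(12, 0) // MS_PER_DAY
calendar_days = (add_months(date(2023, 1, 1), 12) - date(2023, 1, 1)).days

print(flattened_days)  # 360
print(calendar_days)   # 365
```

Under this assumption, one year flattens to 360 days, five days short of the calendar year starting 2023-01-01.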
I would propose that we keep the information that is currently in the INTERVAL struct when serializing to Arrow, that is, {months, days, nanos}. If you agree to this approach, I would like to implement this fix immediately.
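The following Python sketch illustrates why keeping the three components matters. The class `MonthDayNanoInterval` and the function `flatten_to_ms` are hypothetical names used only for illustration; the field layout follows the {months, days, nanos} proposal above, and the flattening again assumes 30 days per month.

```python
from dataclasses import dataclass

MS_PER_DAY = 86_400_000
NANOS_PER_MS = 1_000_000


@dataclass(frozen=True)
class MonthDayNanoInterval:
    """Hypothetical structured interval that keeps the components separate."""
    months: int
    days: int
    nanos: int


def flatten_to_ms(iv: MonthDayNanoInterval) -> int:
    """Lossy conversion to a fixed duration, assuming a month is 30 days."""
    return (iv.months * 30 + iv.days) * MS_PER_DAY + iv.nanos // NANOS_PER_MS


one_month = MonthDayNanoInterval(months=1, days=0, nanos=0)
thirty_days = MonthDayNanoInterval(months=0, days=30, nanos=0)

# After flattening, the two intervals can no longer be told apart...
collide = flatten_to_ms(one_month) == flatten_to_ms(thirty_days)
# ...while the structured form keeps them distinct.
distinct = one_month != thirty_days

print(collide, distinct)  # True True
```

A consumer that receives the structured form can still apply calendar-aware arithmetic to the month component, which is not possible once everything has been collapsed into milliseconds.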
Thank you.
To Reproduce
Run the query SELECT INTERVAL 1 YEAR; in duckdb-wasm.
OS: MacOS
DuckDB Version: Any
DuckDB Client: duckdb-wasm
Full Name: Tuyen Nguyen
Affiliation: maintainer of the duckdb/geo extension
Have you tried this on the latest master branch?
- [X] I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- [X] I agree
Thanks for the report!
Yes, that makes sense, as it seems that Arrow has added support for a month, day, ns interval. We used the other interval types because this type did not exist when we added the Arrow integration.
Hi @handstuyennn!
I believe the situation has changed since you opened the issue: Arrow intervals are now used, but this gives back a nonsensical result.
I will take a look at this; it seems to be the same problem behind #1203.
Thanks for reporting. This is solved by https://github.com/duckdb/duckdb-wasm/pull/1769