delta-rs
`write_deltalake` throws parser error when using `rust` engine and big decimals
Environment
Delta-rs version: 0.17.4
Binding: Python
Environment:
- Cloud provider: local
- OS: Windows
Bug
What happened:
The following error was thrown when calling write_deltalake with the rust engine on a decimal value that has more than 16 digits:
Exception: Parser error: can't parse the string value 1.1111111111111112e16 to decimal
This error does not occur when using the pyarrow engine.
This error does not occur with decimal values that are 16 digits or less.
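The value in the error message, 1.1111111111111112e16, is what you get when the 17-digit integer is rounded to a 64-bit float. That is only one plausible reading of the message, not a confirmed diagnosis of the rust engine. It suggests the decimal may pass through a float or a scientific-notation string somewhere before being parsed back. A minimal sketch using only the Python standard library (the variable names are illustrative):

```python
from decimal import Decimal

value = 11111111111111111           # the 17-digit value from the report
as_float = float(value)             # nearest IEEE 754 double

print(repr(as_float))               # scientific notation, last digit changed
print(int(as_float) == value)       # False: the round trip loses precision
print(Decimal(value))               # Decimal itself keeps all 17 digits

small = 1111111111111111            # a 16-digit value below 2**53
print(int(float(small)) == small)   # True: exactly representable as a double
```

This shows only that the number in the message matches a float64 rounding of the input. Where such a conversion would happen inside the writer is an assumption here.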
What you expected to happen: The table is written without error.
How to reproduce it:
import tempfile
from decimal import Decimal
import pyarrow as pa
import deltalake
from deltalake import write_deltalake
assert deltalake.__version__ == "0.17.4"
tmp_path = tempfile.mkdtemp() # assumption: any empty, writable directory works here
big_decimal = Decimal(11111111111111111) # 17 digits
data = {"decimal_column": pa.array([big_decimal])}
arrow_table = pa.table(data)
write_deltalake(tmp_path, arrow_table, engine="rust") # throws parser error
More details:
Perhaps related to #1778, #2193, #2221. I opened a new issue because this bug is rust-engine specific, while the others (seemingly) aren't.
It's a known issue, it will be resolved when we upgrade the arrow crates
@ion-elgreco when you have a chance can you link to the arrow issue? That would be handy to have lying around :smile:
@rtyler this is the issue I created in arrow-rs: https://github.com/apache/arrow-rs/issues/5549, resolved by this PR: https://github.com/apache/arrow-rs/pull/5611
@ion-elgreco would it be possible to update the crates with next release?
@nixent no, we are waiting on the next release of arrow-rs