datafusion
DataFusion should support casting strings such as "4e7" to decimal
Is your feature request related to a problem or challenge?
DataFusion supports casting the string '4e7' to float but not to decimal. This is inconsistent with Postgres and Apache Spark, both of which accept scientific notation when casting a string to decimal.
Postgres
postgres=# select cast('4e7' as float);
float8
----------
40000000
(1 row)
postgres=# select cast('4e7' as decimal(10,2));
numeric
-------------
40000000.00
Apache Spark
scala> spark.sql("select cast('4e7' as float)").show
+------------------+
|CAST(4e7 AS FLOAT)|
+------------------+
| 4.0E7|
+------------------+
scala> spark.sql("select cast('4e7' as decimal(10,2))").show
+--------------------------+
|CAST(4e7 AS DECIMAL(10,2))|
+--------------------------+
| 40000000.00|
+--------------------------+
DataFusion
DataFusion CLI v37.0.0
❯ select cast('4e7' as float);
+-------------+
| Utf8("4e7") |
+-------------+
| 40000000.0 |
+-------------+
1 row in set. Query took 0.010 seconds.
❯ select cast('4e7' as decimal(10,2));
Arrow error: Cast error: Cannot cast string '4e7' to value of Decimal128(38, 10) type
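For reference, here is a minimal sketch of the behaviour I would expect, written in Python with the standard `decimal` module rather than against arrow-rs. The helper name `cast_str_to_decimal` is hypothetical, and the choice of half-up rounding and of rejecting values that exceed the precision are assumptions on my part, not something taken from the DataFusion or arrow-rs code.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP, localcontext


def cast_str_to_decimal(text: str, precision: int, scale: int) -> Decimal:
    """Hypothetical helper: parse a string (plain or scientific notation)
    into a decimal(precision, scale) value, or raise ValueError."""
    target = f"decimal({precision},{scale})"
    with localcontext() as ctx:
        # Enough working digits that quantize() does not fail early.
        ctx.prec = 76
        try:
            value = Decimal(text.strip())
            if not value.is_finite():
                raise InvalidOperation
            # 10**-scale, e.g. 0.01 for scale 2.
            quantum = Decimal(1).scaleb(-scale)
            # Assumption: round half away from zero when digits are dropped.
            result = value.quantize(quantum, rounding=ROUND_HALF_UP)
        except InvalidOperation:
            raise ValueError(f"Cannot cast string {text!r} to {target}")
    # Assumption: values needing more than `precision` digits are rejected.
    if len(result.as_tuple().digits) > precision:
        raise ValueError(f"Value {text!r} does not fit in {target}")
    return result


if __name__ == "__main__":
    print(cast_str_to_decimal("4e7", 10, 2))
```

With this sketch, '4e7' cast to decimal(10,2) yields 40000000.00, matching the Postgres and Spark output above, while a string such as 'abc' still fails with an error.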
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response
I suspect this issue is actually in arrow_cast rather than in DataFusion itself. The error message seems to come from https://github.com/apache/arrow-rs/blob/ada986c7ec8f8fe4f94235c8aaeba4995392ee72/arrow-cast/src/cast.rs#L2753
I believe this will be supported in the next arrow-rs release.