
Float32 changes to Float64 implicitly

Open Ilykuleshov opened this issue 1 year ago • 1 comment

Checks

  • [X] I have checked that this issue has not already been reported.
  • [X] I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame({"foo": [0.4, -0.1, 0.9, -0.8]}).cast({"foo": pl.Float32})
df = df.with_columns(bar=pl.col("foo").abs() * pl.col("foo").sign())

assert df["bar"].dtype == pl.Float64

Log output

No response

Issue description

Arithmetic between two Float32 expressions produces a Float64 result.

Expected behavior

Operations with the same data type are supposed to keep it, as is the case with numpy, torch, pandas etc.

Installed versions

--------Version info---------
Polars:               1.5.0
Index type:           UInt32
Platform:             Linux-5.15.0-94-generic-x86_64-with-glibc2.35
Python:               3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.2.0
gevent:               <not installed>
great_tables:         <not installed>
hvplot:               <not installed>
matplotlib:           3.8.3
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             <not installed>
pandas:               2.2.1
pyarrow:              15.0.0
pydantic:             <not installed>
pyiceberg:            <not installed>
sqlalchemy:           <not installed>
torch:                2.2.1+cu121
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>

Ilykuleshov Aug 28 '24 18:08

The problem here is the sign function, which currently always returns 64-bit signed integers. Then once the 64-bit signed integer is combined with the Float32, the Float32 is upcast to Float64 (which makes sense).

We should probably change pl.Expr.sign to always maintain the input type instead of returning 64-bit integers. As a workaround for now you can do pl.col("foo").sign().cast(pl.Float32).

orlp Aug 28 '24 19:08