
.transpose() to object dtype

Open Noghpu opened this issue 9 months ago • 1 comment

Description

Right now, df.transpose() infers the resulting dtype, which for mixed types usually ends up being strings or lists of strings. Would it be possible to add a parameter to .transpose() that allows an object dtype, so that the values are not cast at all? As an example, with a hypothetical parameter named as_object:

import polars as pl

df = pl.DataFrame(
    dict(
        int=[1],
        string=["one"],
        bool=[True],
        list=[[1]],
    ),
)

df_transposed = df.transpose(
    include_header=True,
    header_name="type",
    column_names=["value"],
)

df_object = df.transpose(
    include_header=True,
    header_name="type",
    column_names=["value"],
    as_object=True,  # proposed parameter, does not exist yet
)

df = shape: (1, 4)

int  string  bool  list
i64  str     bool  list[i64]
---  ------  ----  ---------
1    one     true  [1]

df_transposed = shape: (4, 2)

type    value
str     list[str]
------  ---------
int     ["1"]
string  ["one"]
bool    ["true"]
list    ["1"]

df_object = shape: (4, 2)

type    value
str     object
------  ------
int     1
string  one
bool    True
list    [1]

As for the use case: I was asked to create a table from a transposed df using great-tables. Because mixed columns are automatically cast to strings, I can't use great-tables' built-in number formatting methods. Instead, I have to either recast the rows back to their original dtypes or format the df in polars before transposing.

Noghpu, May 15 '24 12:05

Can I add another use case? Many financial reports have the dates flow horizontally across the worksheet. All of the manipulation can be done in polars, but the final report is pushed to Excel. So please add my vote for this enhancement.

geoffwright240, May 19 '24 16:05