Allow casting integer types to Enum

Open stinodego opened this issue 1 year ago • 3 comments

Description

Currently it only seems to work for UInt32:

import polars as pl

dtype = pl.Enum(["a", "b", "c", "d"])
s = pl.Series([0, 3, 2, 2], dtype=pl.UInt32).cast(dtype)  # works
s = pl.Series([0, 3, 2, 2], dtype=pl.UInt64).cast(dtype)  # ComputeError: cannot cast numeric types to 'Categorical'

@c-peters could we make this work for all integer types?

stinodego, Jan 03 '24 11:01

Yes for any integer type up to U32, since we can simply cast; not for U64, though. Our Categorical type uses fixed U32 indices. There is an open issue to change this, but it would be quite some work.

c-peters, Jan 03 '24 20:01

Why can't we cast UInt64? If it fits in UInt32, great, if not, we can raise or set a null value (like with any other cast).

Basically, I am requesting a shortcut to s.cast(pl.UInt32).cast(enum_dtype) which already works for UInt64.

stinodego, Jan 03 '24 21:01

Ah, a misunderstanding: I thought you meant that the indices were going to be U64. Simply casting is something we can do.

c-peters, Jan 04 '24 10:01
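
For readers skimming the thread, the semantics requested above (narrow the integer to a UInt32 index if it fits, otherwise raise or produce a null) can be sketched in plain Python. This is only an illustration, not polars code: `cast_indices_to_enum`, its `strict` flag, and `U32_MAX` are hypothetical names, and treating an index beyond the category list the same way as an overflow is an assumption of this sketch rather than something the thread confirms.

```python
U32_MAX = 2**32 - 1  # largest value a UInt32 index can hold


def cast_indices_to_enum(indices, categories, strict=True):
    """Map integer indices to category labels.

    Hypothetical helper for illustration only; not part of the polars API.
    """
    out = []
    for idx in indices:
        fits_u32 = 0 <= idx <= U32_MAX
        # Assumption: an index past the end of `categories` is handled
        # like a value that does not fit in UInt32.
        if fits_u32 and idx < len(categories):
            out.append(categories[idx])
        elif strict:
            raise ValueError(f"cannot cast index {idx} to Enum")
        else:
            out.append(None)  # non-strict: invalid values become null
    return out


categories = ["a", "b", "c", "d"]
print(cast_indices_to_enum([0, 3, 2, 2], categories))
print(cast_indices_to_enum([0, 2**40], categories, strict=False))
```

The width of the source integer type does not matter here, only whether each value fits, which is the point made in the thread about UInt64 input.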