serde_arrow

is there a way to change the index type of dictionary encodings of strings?

Open sthornington opened this issue 9 months ago • 3 comments

uint32 specifically doesn't seem to work with pandas (for example, the resulting arrow batch readers cannot do read_pandas()).

sthornington, Apr 04 '25 18:04

I'm trying an overwrite with a custom DataType for the dictionary key field...

sthornington, Apr 04 '25 19:04

Changing the index type is currently not implemented. You could manually overwrite the corresponding fields, but I would also be happy to accept a patch that adds this option to the current code.

FYI: I chose uint32 because polars uses it as its index type.

chmp, Apr 12 '25 08:04
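
For readers unfamiliar with what the "index type" refers to, here is a minimal, self-contained sketch of dictionary encoding with a configurable key type. It uses only the Rust standard library and is not serde_arrow's API; `DictColumn`, `dict_encode`, and `cast_keys` are hypothetical names introduced for illustration. It shows the keys being produced as one integer type (`u32`, the default mentioned above) and then re-keyed to another (`i32` is used here only as an example target).

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical dictionary-encoded string column:
/// `keys[i]` is the position of row `i`'s string in `values`.
#[derive(Debug, PartialEq)]
struct DictColumn<K> {
    keys: Vec<K>,
    values: Vec<String>,
}

/// Dictionary-encode `items` using key type `K`.
/// Returns `None` if a dictionary position does not fit into `K`.
fn dict_encode<K: TryFrom<usize>>(items: &[&str]) -> Option<DictColumn<K>> {
    let mut lookup: HashMap<&str, usize> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys: Vec<K> = Vec::with_capacity(items.len());
    for &item in items {
        // Reuse the existing position, or append a new dictionary value.
        let idx = *lookup.entry(item).or_insert_with(|| {
            values.push(item.to_string());
            values.len() - 1
        });
        keys.push(K::try_from(idx).ok()?);
    }
    Some(DictColumn { keys, values })
}

/// Convert the keys of an existing column to another integer type,
/// leaving the dictionary values untouched.
/// Returns `None` if any key is out of range for the target type `J`.
fn cast_keys<K, J: TryFrom<K>>(col: DictColumn<K>) -> Option<DictColumn<J>> {
    let keys = col
        .keys
        .into_iter()
        .map(|k| J::try_from(k).ok())
        .collect::<Option<Vec<J>>>()?;
    Some(DictColumn { keys, values: col.values })
}
```

Under these assumptions, `dict_encode::<u32>(&["a", "b", "a"])` yields keys `[0, 1, 0]` with values `["a", "b"]`, and passing that column to `cast_keys::<u32, i32>` keeps the same values while changing only the key type. In an actual Arrow schema the equivalent change would be made on the dictionary field's key data type, as described in the comment above.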

Yes, I got it working with overwrites. I'll see about a patch when I am back at my desk!

sthornington, Apr 12 '25 09:04