arrow-julia
DictEncode a DictEncode
DictEncode signals that a column/array should be dictionary encoded when serialized to the arrow streaming/file format.
The current constructor will happily wrap a DictEncode in another DictEncode.
https://github.com/apache/arrow-julia/blob/v2.2.1/src/arraytypes/dictencoding.jl#L69
Does it make sense to add a no-op constructor DictEncode(x::DictEncode) = x?
Right now Arrow.write fails if I run this line twice. I know it's not good practice and it's no big deal, but I wanted to bring it up for discussion anyway.
df.col = Arrow.DictEncode(df.col)
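To make the proposal concrete, here is a minimal sketch of how a no-op method would behave. It uses a hypothetical stand-in type, `DemoEncode`, rather than the real `Arrow.DictEncode`, so the struct layout and constructor below are assumptions for illustration and not the package's actual definition.

```julia
# Hypothetical stand-in for a wrapper like Arrow.DictEncode (not the real type).
struct DemoEncode{T, A} <: AbstractVector{T}
    data::A
end

# Generic constructor: wraps any vector, including another DemoEncode,
# because DemoEncode is itself an AbstractVector.
DemoEncode(x::A) where {A<:AbstractVector} = DemoEncode{eltype(A), A}(x)

# Minimal AbstractVector interface so the wrapper behaves like its data.
Base.size(d::DemoEncode) = size(d.data)
Base.getindex(d::DemoEncode, i::Int) = d.data[i]

# Proposed no-op method: more specific than the generic constructor,
# so wrapping an already wrapped column returns it unchanged.
DemoEncode(x::DemoEncode) = x

col = ["a", "b", "a"]
once = DemoEncode(col)
twice = DemoEncode(once)   # dispatches to the no-op method

println(twice === once)        # the same wrapper comes back
println(twice.data === col)    # no nested DemoEncode inside
```

Without the `DemoEncode(x::DemoEncode)` method, the second call would go through the generic constructor and produce a wrapper whose `data` field is itself a `DemoEncode`, which is the nesting described above.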