copy does not copy to standard Julia Types

Open schlichtanders opened this issue 1 year ago • 5 comments

While the documentation says that calling copy ensures the columns are regular Julia types:

df = copy(DataFrame(Arrow.Table(file))): Build a DataFrame, where the columns are regular in-memory vectors (specifically, Base.Vectors and/or PooledVectors). This requires that you have enough memory to load the entire DataFrame into memory.

this is not the case

image

schlichtanders avatar Jan 18 '24 10:01 schlichtanders

Maybe that's because result_old[!, "workers"] is not a DataFrame?

Moelf avatar Jan 18 '24 13:01 Moelf

Interesting, could be, but then, why are only DataFrames supported for copy and not regular Arrays?

schlichtanders avatar Jan 24 '24 09:01 schlichtanders

Because you extracted just a column, which is not a DataFrame, so the copy method specialized for DataFrames doesn't apply.

Moelf avatar Jan 24 '24 13:01 Moelf

I think the other answer is that (if I understand correctly) it's a DataFrames.jl feature, not an Arrow.jl feature; it's just documented here because it's a common ask.

ericphanson avatar Jan 24 '24 13:01 ericphanson

Maybe collect is what's desired here, for materializing a column into a Vector? Naively, collect should be "iterate this collection into an Array", though I haven't tried it in this case.

ericphanson avatar Jan 24 '24 13:01 ericphanson
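
A minimal sketch of the collect suggestion, assuming a toy one-column table: the issue's file and result_old are not shown, so the temporary path and the data below are hypothetical stand-ins. It only illustrates that a single extracted column is not a DataFrame and that collect iterates it into a plain Array. It has not been checked against the nested or more complex column types the original screenshot may have involved.

```julia
using Arrow, DataFrames

# Hypothetical toy data standing in for the file from the issue.
path = tempname()
Arrow.write(path, (workers = [1, 2, 3],))

# Build a DataFrame on top of the Arrow table, then pull out one column.
df = DataFrame(Arrow.Table(path))
col = df[!, "workers"]   # a single column (an AbstractVector), not a DataFrame

# collect iterates the column into a freshly allocated Array.
v = collect(col)

# Compare the column's type before and after materializing.
@show typeof(col) typeof(v)
```

Because collect returns an Array for any iterable, this does not depend on DataFrames.jl having a copy method for the column's type.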