Struct

Open sglyon opened this issue 7 years ago • 6 comments

I would like to take a shot at this one if you don’t mind. Just opening an issue to declare my intent to start on this hopefully in the next few days.

sglyon avatar Feb 23 '18 05:02 sglyon

Great, thanks!

I don't know what you have planned, and I haven't really given this any thought yet, but one thing I'd like to recommend is that you consider that named tuples might be useful. These will be a core part of the language starting with 0.7. I don't actually know for sure whether they are relevant here, but it's a thought.

ExpandingMan avatar Feb 23 '18 14:02 ExpandingMan

I was thinking more along the lines of any Julia struct.

Though I haven’t fully considered how that would work exactly.

sglyon avatar Feb 23 '18 16:02 sglyon

Definitely yes, we'd ultimately want the ability to store any Julia struct.

One thing to keep in mind though: to get good performance you will have to make sure that you are not doing anything that triggers "extra" compilation. By this I mean something like calling a function with values that are only known at run time, where that function in turn calls a macro that defines a Julia struct. One place where this might be unavoidable, and where it would be a really cool feature, is automatically defining a Julia struct for you when you deserialize an Arrow struct. But in general, things like this should be avoided.
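To make the concern concrete, here is a minimal sketch, not anything that exists in Arrow.jl: the helper names `make_row` and `sum_fields` are hypothetical. It assumes Julia 0.7 or later, where named tuples are available, and shows one way to build a record from field names known only at run time without using `eval` or a macro to define a new struct. Each new field layout is still a new type that downstream methods get specialized on, so this only narrows the problem rather than removing it.

```julia
# Hypothetical helper: build a row from field names that are only known at
# run time. NamedTuple{names}(values) needs no `eval` and no struct definition.
make_row(names::Vector{Symbol}, vals::Vector) =
    NamedTuple{Tuple(names)}(Tuple(vals))

# Function barrier: the return type of `make_row` is not inferable, but once
# `row` is passed here, this method runs specialized on its concrete type.
sum_fields(row::NamedTuple) = sum(values(row))

row = make_row([:x, :y], [1.0, 2.0])
total = sum_fields(row)
```

The function barrier is the design choice worth noting: the type-unstable step is confined to `make_row`, and any per-element work can live behind a call like `sum_fields`.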

ExpandingMan avatar Feb 23 '18 16:02 ExpandingMan

Hi - how much work do you think needs to be done in order to read arrow structs? e.g. a typical "dataframe" struct with each entry's values being a vector of the same length.
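The layout described here, one vector per field with all vectors the same length, could be sketched roughly as follows. `StructColumn` is a hypothetical type for illustration only, not part of Arrow.jl, and the sketch ignores Arrow's validity bitmaps and buffer layout entirely. It assumes at least one field.

```julia
# Hypothetical type: a struct column stored as one child vector per field.
struct StructColumn{T<:NamedTuple}
    children::T
    function StructColumn(children::T) where {T<:NamedTuple}
        lens = map(length, values(children))
        # All child vectors must describe the same number of rows.
        all(l -> l == first(lens), lens) ||
            throw(ArgumentError("child vectors must have equal length"))
        new{T}(children)
    end
end

# Number of rows is the length of any child vector.
Base.length(col::StructColumn) = length(first(values(col.children)))

# Row `i` is a named tuple holding the i-th element of each child vector.
Base.getindex(col::StructColumn, i::Integer) = map(v -> v[i], col.children)

col = StructColumn((id = [1, 2, 3], score = [0.5, 1.5, 2.5]))
second_row = col[2]
```

Indexing with `col[2]` gathers one element from each child, so a row comes back as a named tuple with the same field names as the column.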

kcajf avatar Dec 20 '18 16:12 kcajf

@kcajf sorry, I missed your comment all that time ago. It's a little hard for me to guess how much effort that would be right now. I'm probably going to look into IPC first, and I think implementing structs may be a necessary part of that.

ExpandingMan avatar Jan 31 '19 21:01 ExpandingMan

No problem. Reading the streaming IPC format is actually what I was trying to do above. I've since read up on Arrow enough to know the right terminology!

An aside: there is an issue on this project discussing plans for merging into the official Apache project. Has there been any further discussion on other channels about whether or how that would happen?

kcajf avatar Jan 31 '19 22:01 kcajf