Add primitive array column wrappers

Open altavir opened this issue 4 years ago • 4 comments

Primitive array columns are required for optimized big-data applications. They would also make it possible to integrate DataFrame with numerical libraries such as MultiK or KMath.

altavir avatar Jun 28 '21 08:06 altavir

Our current idea is to use Arrow as a backend for primitive types. See https://github.com/Kotlin/dataframe/issues/78

nikitinas avatar Dec 22 '21 22:12 nikitinas

It is a great idea, but it will be worth it only for interop with other platforms. For JVM-only use, Arrow will give nothing new.

altavir avatar Dec 23 '21 06:12 altavir

Arrow should give a significant performance increase on the JVM thanks to its support for nullable value types. The current implementation generates quite a lot of boxing/unboxing. This can be solved without Arrow, but I expect the Arrow implementation to be faster. We will run performance benchmarks before implementing it.

And we need to support Arrow I/O anyway.
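
For illustration only, here is a minimal sketch of how nullable values can be stored without boxing: a primitive array plus a validity bitmap, which is roughly the layout Arrow uses for nullable columns. The class `NullableIntColumn` and all of its members are hypothetical names made up for this example; they are not part of Kotlin DataFrame or Arrow.

```kotlin
import java.util.BitSet

// Hypothetical sketch, not the Kotlin DataFrame or Arrow API.
// Values live in an IntArray (no boxing); a BitSet marks which slots hold real values.
class NullableIntColumn(private val values: IntArray, private val valid: BitSet) {
    val size: Int get() = values.size

    fun isNull(index: Int): Boolean = !valid.get(index)

    // Boxing happens only here, when a caller explicitly asks for an Int?.
    fun getOrNull(index: Int): Int? = if (valid.get(index)) values[index] else null

    // Aggregation walks the set bits only and never allocates a boxed Int.
    fun sumNonNull(): Long {
        var acc = 0L
        var i = valid.nextSetBit(0)
        while (i >= 0 && i < values.size) {
            acc += values[i]
            i = valid.nextSetBit(i + 1)
        }
        return acc
    }

    companion object {
        // Convenience constructor for the example; null entries leave their bit unset.
        fun of(vararg items: Int?): NullableIntColumn {
            val values = IntArray(items.size)
            val valid = BitSet(items.size)
            for ((i, item) in items.withIndex()) {
                if (item != null) {
                    values[i] = item
                    valid.set(i)
                }
            }
            return NullableIntColumn(values, valid)
        }
    }
}
```

With this layout, `NullableIntColumn.of(1, null, 3).sumNonNull()` sums the two non-null slots without creating any `Integer` objects in the loop. Whether this or an Arrow-backed column is faster is exactly what the benchmarks mentioned above would have to show.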

nikitinas avatar Dec 23 '21 18:12 nikitinas

I was experimenting with asList() wrappers. Maybe this could solve this long-standing issue: https://github.com/Kotlin/dataframe/compare/master...primitive-array-value-columns but it needs more testing.
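
As a rough sketch of the `asList()` idea (this is not the code from the linked branch): the standard library's `DoubleArray.asList()` returns a `List<Double>` view backed by the array, so a column can expose a list interface without copying the data. `PrimitiveDoubleColumn` and its members are hypothetical names used only for this example.

```kotlin
// Hypothetical sketch, not the Kotlin DataFrame API.
// The column keeps a DoubleArray and exposes it as a List<Double> through asList().
class PrimitiveDoubleColumn(val name: String, private val data: DoubleArray) {
    // asList() wraps the array instead of copying it; elements are boxed lazily on access.
    val values: List<Double> = data.asList()

    val size: Int get() = data.size

    // Direct primitive access for hot paths, no boxing.
    operator fun get(index: Int): Double = data[index]

    // Aggregations can run over the raw array.
    fun sum(): Double {
        var acc = 0.0
        for (v in data) acc += v
        return acc
    }
}
```

Because the list is a view, code that expects `List<Double>` keeps working, while operations that know about the primitive storage can bypass it. The trade-off is that every read through `values` still boxes the element.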

Jolanrensen avatar May 17 '24 16:05 Jolanrensen