modin
Zero-copy (besides deserialization) to_numpy on Ray
The current implementation deserializes all partitions as NumPy arrays and then concatenates them into a single array.
We should be able to deserialize directly into a pre-allocated array during ray.get(), relying on pickle protocol 5.
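To make the idea concrete, here is a minimal sketch that uses only `pickle` and NumPy, with no Ray involved. The names `partitions`, `serialize`, and `fill_preallocated` are hypothetical and are not part of Modin. It assumes row partitions with a common dtype and column count, and it only illustrates how protocol 5 out-of-band buffers let each partition be written once into a pre-allocated result instead of being materialized and then concatenated.

```python
import pickle

import numpy as np

# Hypothetical row partitions standing in for Modin partitions.
partitions = [
    np.arange(6, dtype=np.float64).reshape(2, 3),
    np.arange(6, 15, dtype=np.float64).reshape(3, 3),
]


def serialize(part):
    """Pickle with protocol 5, keeping the array data out-of-band."""
    buffers = []
    payload = pickle.dumps(part, protocol=5, buffer_callback=buffers.append)
    return payload, buffers


def fill_preallocated(serialized, n_rows, n_cols, dtype):
    """Write each partition into its slice of one pre-allocated array."""
    out = np.empty((n_rows, n_cols), dtype=dtype)
    row = 0
    for payload, buffers in serialized:
        # With out-of-band buffers the loaded array is a view over the
        # supplied buffer, so no intermediate copy is made here.
        part = pickle.loads(payload, buffers=buffers)
        # The only copy: from the buffer into the destination slice.
        out[row:row + part.shape[0]] = part
        row += part.shape[0]
    return out


serialized = [serialize(p) for p in partitions]
result = fill_preallocated(serialized, n_rows=5, n_cols=3, dtype=np.float64)
```

Whether this saves anything over concatenation depends on whether the existing deserialization step already avoids a copy.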
Ref: #2815 cc: @YarShev
ray.get() can deserialize data with zero copy for primitive data types (if the object supports pickle protocol 5). We then intentionally make a copy of the data so that the user can mutate it, so I don't think there is anything we can do here to speed up to_numpy. I think we can close the issue. @anmyachev, @dchigarev, thoughts?
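A small Ray-free sketch of why that copy is needed. It assumes that immutable `bytes` are a fair stand-in for the read-only memory that a zero-copy result is backed by; the variable names are made up for illustration. An array rebuilt over a read-only buffer is itself read-only, so handing the user something mutable takes an explicit copy.

```python
import pickle

import numpy as np

source = np.arange(4, dtype=np.int64)

# Serialize with protocol 5 so the array data travels out-of-band.
buffers = []
payload = pickle.dumps(source, protocol=5, buffer_callback=buffers.append)

# Assumption: immutable bytes stand in for a read-only shared-memory store.
readonly_buffers = [buf.raw().tobytes() for buf in buffers]

# The restored array is a view over the read-only buffer, not a copy.
restored = pickle.loads(payload, buffers=readonly_buffers)
is_writeable = restored.flags.writeable

# An explicit copy is what gives the user an array they can mutate.
mutable = restored.copy()
mutable[0] = 100
```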