Using copy-on-write for Ginkgo arrays
A possible optimization that would reduce Ginkgo's copies, and its memory usage in general, would be to implement copy-on-write for gko::Array.
Basically, whenever a copy of an array (within the same memory space) is requested, the array is not actually copied; instead, a pointer to the same underlying data is created. This creates implicit sharing as long as the data is accessed read-only (using gko::Array::get_const_data()), but as soon as there is a request to access the data in a modifiable way, the implementation checks whether more than one array points to the data and, if that is the case, creates a private copy of the data for that object before returning it to the user.
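The same-memory-space scheme can be sketched as follows. This is only a minimal illustration, not Ginkgo's implementation: it is restricted to host memory, it uses std::shared_ptr for the reference count, and the names CowArray, shares_data_with and cow_example are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for gko::Array, restricted to host memory.
template <typename T>
class CowArray {
public:
    explicit CowArray(std::size_t size, T value = T{})
        : data_(std::make_shared<std::vector<T>>(size, value))
    {}

    // The implicit copy constructor and copy assignment only copy the
    // shared_ptr, so a "copy" initially points to the original's buffer.

    // Read-only access never triggers a copy.
    const T* get_const_data() const { return data_->data(); }

    // Writable access: detach from the shared buffer first if it is shared.
    T* get_data()
    {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<T>>(*data_);
        }
        return data_->data();
    }

    // Hypothetical helper, only used to observe the sharing.
    bool shares_data_with(const CowArray& other) const
    {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<T>> data_;
};

inline void cow_example()
{
    CowArray<int> a(4, 1);
    CowArray<int> b = a;  // no data is copied yet
    assert(a.shares_data_with(b));
    b.get_data()[0] = 42;  // b receives its private copy here
    assert(!a.shares_data_with(b));
    assert(a.get_const_data()[0] == 1);  // a is unaffected
}
```

Note that this sketch is not thread-safe, and a raw pointer obtained from get_data() before a copy was made can still modify the shared buffer afterwards. A real implementation would have to decide how to handle both cases.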
We could even go one step further and allow this for copies between memory spaces. As before, instead of copying the data when the copy is requested, we only create a pointer to it (a pointer into a different memory space). The difference now is that this pointer is fine only as long as the data is not accessed at all (neither in read-only nor in write mode); as soon as we access it, a proper copy is created.
One use case for the second version is when we only want to perform a single operation on a different executor, and that operation does not require the entire object. For example, we want to compute something from the sparsity pattern of a matrix on the CPU, while the matrix is on the GPU. In this case, we can still create a CPU copy (which will not actually copy anything), and once the kernel is called, it will request the data it needs. This will only be the pattern (which will then get copied to the CPU), but not the values of the matrix, so the value array is never copied, which saves that transfer.
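The pattern-only use case can be sketched with a toy model. Everything here is an assumption made for illustration: the memory spaces are simulated in host memory, a global counter stands in for actual device transfers, and the names Space, LazyArray, CsrSketch, lazy_copy_to, transfer_count and bandwidth are not part of Ginkgo.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <memory>
#include <utility>
#include <vector>

enum class Space { host, device };  // simulated memory spaces

// Counts simulated transfers between memory spaces.
inline int& transfer_count()
{
    static int count = 0;
    return count;
}

template <typename T>
class LazyArray {
public:
    LazyArray(Space space, std::vector<T> init)
        : space_(space),
          local_(std::make_shared<std::vector<T>>(std::move(init)))
    {}

    // "Copy" to another space: only remembers where the data lives.
    LazyArray lazy_copy_to(Space target) const
    {
        LazyArray copy(target);
        copy.remote_ = local_ ? local_ : remote_;
        return copy;
    }

    Space get_space() const { return space_; }
    bool is_materialized() const { return local_ != nullptr; }

    // Any access, even read-only, forces the real transfer.
    const T* get_const_data() const
    {
        materialize();
        return local_->data();
    }

    T* get_data()
    {
        materialize();
        if (local_.use_count() > 1) {  // a pending copy still refers to us
            local_ = std::make_shared<std::vector<T>>(*local_);
        }
        return local_->data();
    }

private:
    explicit LazyArray(Space space) : space_(space) {}

    void materialize() const
    {
        if (local_) {
            return;
        }
        local_ = std::make_shared<std::vector<T>>(*remote_);
        remote_.reset();
        ++transfer_count();
    }

    Space space_;
    mutable std::shared_ptr<std::vector<T>> local_;   // data in our space
    mutable std::shared_ptr<std::vector<T>> remote_;  // pending source
};

// Hypothetical CSR-like matrix built from three lazy arrays.
struct CsrSketch {
    LazyArray<int> row_ptrs;
    LazyArray<int> col_idxs;
    LazyArray<double> values;

    CsrSketch lazy_copy_to(Space target) const
    {
        return {row_ptrs.lazy_copy_to(target), col_idxs.lazy_copy_to(target),
                values.lazy_copy_to(target)};
    }
};

// Pattern-only "kernel": touches row_ptrs and col_idxs, never values.
inline int bandwidth(const CsrSketch& mtx, int num_rows)
{
    const int* row_ptrs = mtx.row_ptrs.get_const_data();
    const int* col_idxs = mtx.col_idxs.get_const_data();
    int result = 0;
    for (int row = 0; row < num_rows; ++row) {
        for (int k = row_ptrs[row]; k < row_ptrs[row + 1]; ++k) {
            result = std::max(result, std::abs(col_idxs[k] - row));
        }
    }
    return result;
}
```

Calling lazy_copy_to(Space::host) on a device matrix and then bandwidth on the result materializes only the two pattern arrays; is_materialized() stays false for values. Same-space sharing is handled only as far as needed to keep a pending copy valid when the source is modified.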
We may also want to give users the possibility to force a copy, as performance may suffer if copies are not made exactly when the user expects them.
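One way such an explicit request could look is sketched below, assuming the shared data is held through a std::shared_ptr. The alias SharedBuffer and the functions force_copy and force_copy_example are hypothetical names chosen for this example, not an existing Ginkgo interface.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical representation of an array's reference-counted storage.
template <typename T>
using SharedBuffer = std::shared_ptr<std::vector<T>>;

// Eagerly gives `buffer` private storage if it is currently shared, so
// that a later writable access does not have to copy anything.
template <typename T>
void force_copy(SharedBuffer<T>& buffer)
{
    if (buffer && buffer.use_count() > 1) {
        buffer = std::make_shared<std::vector<T>>(*buffer);
    }
}

inline void force_copy_example()
{
    auto a = std::make_shared<std::vector<double>>(3, 1.0);
    auto b = a;     // implicit sharing, nothing copied
    force_copy(b);  // the copy happens here, at a point the user chose
    assert(a != b);
    assert(*a == *b);
}
```

A user could call such a function before a performance-critical or timed region, so that the cost of the copy is paid at a predictable point.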