GraphBLAS
Make apply more memory-friendly for CUDA
- If an in-place apply uses a non-positional operator and C is iso on input but not on output, then C->x must be reallocated and every numerical entry set to the iso value. However, this pins C->x on the host, which is bad for CUDA. This change defers the iso expansion to the appropriate point.
(Would it be better to change the API of GB_apply_op to take a do_iso_expansion flag instead? The drawback of the current solution is that the expansion may be performed when it is not needed, if C is not iso on input.)
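
To make the flag alternative concrete, here is a minimal toy sketch of what an explicit do_iso_expansion argument could look like. It is not the real GB_apply_op signature or the real matrix layout: toy_matrix, toy_expand_iso, and toy_apply_add are hypothetical names invented for illustration, and the operator (adding a per-entry value y[k]) is just a stand-in for any operator whose output is not iso.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical stand-in for a matrix's value array (not the GraphBLAS layout).
typedef struct {
    double *x;     // values: 1 entry if iso, nvals entries otherwise
    size_t nvals;  // number of entries in the pattern
    bool iso;      // true if every entry shares the value x[0]
} toy_matrix;

// Expand an iso value array to nvals entries, each set to the iso value.
// Returns false if the allocation fails; C is left unchanged in that case.
static bool toy_expand_iso(toy_matrix *C)
{
    if (!C->iso) return true;            // already full size, nothing to do
    assert(C->x != NULL);
    double v = C->x[0];
    size_t n = (C->nvals > 0) ? C->nvals : 1;
    double *x = malloc(n * sizeof(double));
    if (x == NULL) return false;
    for (size_t k = 0; k < C->nvals; k++) x[k] = v;
    free(C->x);
    C->x = x;
    C->iso = false;
    return true;
}

// Hypothetical in-place apply: C(k) = C(k) + y[k].
// The caller states explicitly whether the iso expansion is wanted, so the
// expansion (and its reallocation) only happens when requested.
static bool toy_apply_add(toy_matrix *C, const double *y, bool do_iso_expansion)
{
    if (do_iso_expansion && !toy_expand_iso(C)) return false;
    size_t n = C->iso ? 1 : C->nvals;    // an iso C holds a single value
    for (size_t k = 0; k < n; k++) C->x[k] += y[k];
    return true;
}
```

With a flag like this, the decision to expand sits with the caller, which knows whether C is iso on input and whether the output will be iso, rather than being made unconditionally at a fixed point.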