
Adjusting when mutex is held to improve performance on OpenMP backend

Open Goobley opened this issue 7 months ago • 0 comments

Should close #162. I've changed the YAKL Array so that it takes the mutex only when the reference count pointer isn't null. This makes temporary unowned views much cheaper on the OpenMP backend, where host code still runs. To guarantee the same behaviour as before, this requires checking for nullptr twice in some places. This could be relaxed by bringing in some atomics, but this was the minimal change.
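As a rough illustration of the idea (not the actual YAKL code), the sketch below uses a hypothetical `Handle` type with a hypothetical global mutex `g_mtx`. The lock is skipped entirely when the reference count pointer is null, and the pointer is checked a second time once the lock is held. The counter `g_lock_count` exists only to make the effect visible.

```cpp
#include <mutex>

// Hypothetical stand-ins for illustration; these names are not YAKL's.
static std::mutex g_mtx;
static int g_lock_count = 0;  // number of times the mutex was taken

struct Handle {
    int *refCount = nullptr;  // nullptr means an unowned (non-owning) view
    double *data = nullptr;

    // Owning handle: allocates storage and starts the count at 1.
    explicit Handle(int n) : refCount(new int(1)), data(new double[n]) {}
    // Unowned view over memory managed elsewhere: no reference count.
    explicit Handle(double *ptr) : data(ptr) {}

    Handle(const Handle &rhs) : data(rhs.data) {
        if (rhs.refCount != nullptr) {           // first check, no lock
            std::lock_guard<std::mutex> lock(g_mtx);
            ++g_lock_count;
            if (rhs.refCount != nullptr) {       // second check, under the lock
                refCount = rhs.refCount;
                ++(*refCount);
            }
        }
    }
    Handle &operator=(const Handle &) = delete;  // kept out of the sketch

    ~Handle() {
        if (refCount != nullptr) {               // first check, no lock
            std::lock_guard<std::mutex> lock(g_mtx);
            ++g_lock_count;
            if (refCount != nullptr) {           // second check, under the lock
                if (--(*refCount) == 0) {
                    delete[] data;
                    delete refCount;
                }
                refCount = nullptr;
                data = nullptr;
            }
        }
    }
};
```

In this sketch, copying or destroying an unowned view never touches the mutex, while owned handles still update the count under the lock. The unlocked first check assumes no other thread is modifying that same handle at the same time; making it fully robust is where atomics would come in.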

I have also added warnings under YAKL_VERBOSE for other apparently cheap operations that grab the mutex (e.g. reshaping, collapsing, copy-constructing). In my case, a deep function was taking an array by value instead of by reference, triggering the copy constructor and destroying performance. With verbose output enabled, this now prints enough information to track that down easily.
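The by-value pitfall can be reproduced with a small stand-alone sketch. Everything here is hypothetical: `CountedArray`, `g_copy_mtx`, `g_copies`, and the `DEMO_VERBOSE` macro are illustration-only names, not YAKL identifiers. The copy constructor takes a lock and, when the macro is defined, prints a message, which is roughly the kind of hint the verbose warnings are meant to give.

```cpp
#include <iostream>
#include <mutex>

// Hypothetical stand-ins for illustration; these names are not YAKL's.
static std::mutex g_copy_mtx;
static int g_copies = 0;  // number of copy constructions observed

struct CountedArray {
    CountedArray() = default;
    CountedArray(const CountedArray &) {
        // Every copy serialises on the mutex, which is what makes
        // accidental copies expensive inside threaded regions.
        std::lock_guard<std::mutex> lock(g_copy_mtx);
        ++g_copies;
#ifdef DEMO_VERBOSE
        std::cerr << "CountedArray: copy constructor took the mutex\n";
#endif
    }
};

// Taking the argument by value invokes the copy constructor on every call.
inline void by_value(CountedArray a) { (void)a; }

// Taking it by const reference does not copy, so the mutex is never touched.
inline void by_reference(const CountedArray &a) { (void)a; }
```

Calling `by_value` in a hot loop takes the lock once per call, while `by_reference` takes it never, so switching the signature of the deep function to a reference removes the contention.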

@mrnorman If you would rather, I could put these verbose prints behind YAKL_ARCH_OPENMP to limit noise when running on other backends.

Goobley, Jul 08 '24 14:07