Russell Mull
~I'm not 100% confident in this diagnosis, fwiw. What I know for sure is that I'm burning time in `MutexClient::register`; I can't easily see beyond that because of inlining.~ Scratch...
I'm viewing this as more of a bug, fwiw. I need to use burn from multiple threads, but the global mutex means I can't. This is pretty much a deal...
I tried to apply some lock hygiene in just the one place that jumped out at me (https://github.com/mullr/burn/tree/narrower-autodiff-lock). This helped a little; I gained about 20% when running on 4 cores....
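
For anyone following along, "lock hygiene" here means roughly the following. This is only a minimal sketch of the general technique, not burn's actual code and not the contents of the linked branch: `Registry`, `expensive_prepare`, `register_wide`, `register_narrow`, and `run` are all made-up names for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for shared state guarded by one global lock.
// This is NOT burn's actual type.
struct Registry {
    entries: Vec<u64>,
}

// Hypothetical stand-in for per-call work that needs no shared state.
fn expensive_prepare(input: u64) -> u64 {
    (0..1_000u64).fold(input, |acc, i| acc.wrapping_mul(31).wrapping_add(i))
}

// Wide critical section: the lock is held while the expensive work runs,
// so other threads are blocked for the whole call.
fn register_wide(registry: &Mutex<Registry>, input: u64) {
    let mut guard = registry.lock().unwrap();
    let value = expensive_prepare(input);
    guard.entries.push(value);
}

// Narrow critical section: compute first, then take the lock only for the push.
fn register_narrow(registry: &Mutex<Registry>, input: u64) {
    let value = expensive_prepare(input);
    registry.lock().unwrap().entries.push(value);
}

// Calls `register` from `threads` threads, `per_thread` times each,
// and returns how many entries were recorded.
fn run(threads: u64, per_thread: u64, register: fn(&Mutex<Registry>, u64)) -> usize {
    let registry = Arc::new(Mutex::new(Registry { entries: Vec::new() }));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let registry = Arc::clone(&registry);
            thread::spawn(move || {
                for i in 0..per_thread {
                    register(&registry, t * per_thread + i);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let count = registry.lock().unwrap().entries.len();
    count
}
```

Both versions record the same number of entries; the only difference is how long each thread holds the lock. Narrowing the critical section only helps to the extent that work can be moved outside the lock, so a single shared lock can still cap how well this scales across cores.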
The `async` feature does reduce CPU usage. The contention is still present though, so performance is bad. The problem seems to be that there is a single `AutodiffServer` which is...