Julian Samaroo

Results 179 issues of Julian Samaroo

This would reduce the need for using `rocprof` or the Profile stdlib to observe kernel execution ordering and latency hiding efficiency.

logging

...because it loads `libhsa-runtime64.so`, not `libhsa-runtime64.so.1`.

bug
hsa

The current implementation has multiple flaws:
- Resizing operations on `Array` are not thread-safe
- wait-to-mark exhibits TOCTTOU races

We need a solution for this that doesn't involve taking the...

bug
multithreading
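
The TOCTTOU (time-of-check-to-time-of-use) hazard named in the issue above can be illustrated with a generic check-then-act pattern. This is a minimal sketch in Python, not code from the project: `Slot`, `is_free`, `mark`, and `interleaved_claim` are hypothetical names, and the bad interleaving is replayed by hand so the outcome is deterministic. It shows only the hazard, not the fix the issue asks for.

```python
# Hypothetical illustration of a check-then-act (TOCTTOU) race.
# None of these names come from the project discussed in the issue.

class Slot:
    def __init__(self):
        self.marked_by = None  # None means the slot is still free


def is_free(slot):
    # Step 1: the check
    return slot.marked_by is None


def mark(slot, owner):
    # Step 2: the use, performed separately from the check
    slot.marked_by = owner


def interleaved_claim(slot, first, second):
    """Replay the bad interleaving: both callers check before either marks."""
    first_ok = is_free(slot)
    second_ok = is_free(slot)  # still free, because nobody has marked yet
    claims = []
    if first_ok:
        mark(slot, first)
        claims.append(first)
    if second_ok:
        mark(slot, second)  # silently overwrites the first caller's mark
        claims.append(second)
    return claims


slot = Slot()
claims = interleaved_claim(slot, "task-a", "task-b")
print(claims)          # both callers believe they own the slot
print(slot.marked_by)  # only the last write survives
```

Both callers pass the check because the check and the mark are separate steps; any fix has to make the pair behave as one indivisible operation.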

The current approach of escaping kernel inputs during kernel execution, and having finalizers directly free HSA memory allocations, is problematic when considering the potential benefits of https://github.com/JuliaLang/julia/pull/44056. We could instead...

arrays
hsa

When the GPU is under high load and spinning on `AMDGPU.device_signal_wait`, `hsa_executable_freeze` can hang as it tries to synchronize with the GPU. We should switch to using an `InterruptSignal` (exposed...

bug
hsa

This shouldn't be necessary, but let's have CI confirm that for us.

bug
arrays
hip
sync

arrays
tests
blas

Currently, a single thread serves hostcalls, but it should be possible to spawn multiple threads that service the same hostcall concurrently (especially once #50 is merged). This would be useful...

enhancement
hostcall
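
The idea of several threads servicing one source of requests can be sketched generically as a shared queue with multiple workers. This is an illustration in Python under stated assumptions, not the project's hostcall mechanism: `serve`, `run_workers`, and the `handler` callback are hypothetical names, and a `queue.Queue` stands in for whatever channel delivers requests.

```python
import queue
import threading


def serve(requests, results, handler):
    # Each worker pulls from the same queue until it sees the None sentinel.
    while True:
        item = requests.get()
        if item is None:
            break
        results.put((item, handler(item)))


def run_workers(items, handler, n_workers=4):
    """Service every item using n_workers threads that share one queue."""
    items = list(items)
    requests, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=serve, args=(requests, results, handler))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for item in items:
        requests.put(item)
    for _ in workers:
        requests.put(None)  # one sentinel per worker so each one exits
    for w in workers:
        w.join()
    # All results were queued before the workers exited.
    return dict(results.get() for _ in items)


answers = run_workers(range(8), lambda x: x * x)
print(answers)
```

Results arrive in whatever order the workers finish, so the sketch keys each result by its request rather than relying on ordering.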