Julian Samaroo

172 issues by Julian Samaroo

This is nowhere near ready to go yet, but I wanted to get this posted since things are progressing well for AMDGPU support overall :slightly_smiling_face: TODO: - [x] Add synchronization...

After https://github.com/JuliaParallel/Dagger.jl/pull/223 gets merged, Dagger's eager API (`Dagger.@spawn`) should be suitable for use in packages. I would recommend we use it for non-lazy computations in FileTrees so that we can...

enhancement

Currently, `DaggerChain` communicates to Dagger that the wrapped model is located on a CUDA GPU, which is not necessarily true (and shouldn't be a requirement anyway). We should provide functions...

bug

This uses Distributed to parallelize the tests, in the hopes of having CI jobs which don't take >3hrs to run. Todo: - [ ] Pass output through tmpfile to remove...

tests

Moves the GPUArrays testsuite to run after we've tested our ROCm libraries (rocBLAS et al.). In the absence of a parallel test runner, and with total test time being over...

arrays
tests

Some libraries, like rocSPARSE, call HIP functions which expect to be passed allocations generated from `hipMalloc` and friends. Because `hipMalloc` just ends up calling HSA allocation functions, we should be...

bug
enhancement
hip

As indicated by @torrance, the `getinfo` API is cumbersome because it requires passing in a `Ref` output container of the correct type. While this is probably OK for low-level HSA objects,...

hsa

A few things need improving:
- Exception flags are currently per-module, which is wrong. We should integrate with the kernel state mechanism from GPUCompiler and make these per-kernel.
- It...

enhancement

This will make it easier to avoid needing to use tools like `rocprof` to do simple timespan analysis of actions like kernel launch, allocation latency, memory transfer time, etc. Todo:...

needs tests
needs docs
logging