mikex86
`LLGL::CommandBufferFlags::MultiSubmit` seems to imply that it should be possible to record a command buffer once and resubmit it multiple times, to avoid wasting CPU cycles dispatching draw calls to...
#### Description Adds Windows compatibility for training scripts so that #180 can move forward.
Usage of `std::clamp` requires C++17 under MSVC.
# Description When automatic GPU detection fails, PTX code will be compiled for a list of common targets: https://github.com/arrayfire/arrayfire/blob/138f12e9f181b8a7bd013323137931aec0f3bd59/CMakeModules/select_compute_arch.cmake#L33 This list is modified by various CUDA version checks to account...
## Description `liblapack_static.a` does not exist in CUDA 12, but `arrayfire/src/backend/cuda/CMakeLists.txt` tries to find it and fails. A possible workaround without a fix is to create a symlink from `liblapack_static.a` to `libcusolver_lapack_static.a`;...
`ops_nv.py` does not correctly populate `blockDim.{x, y, z}` and `gridDim.{x, y, z}` in launched CUDA kernels. PTX `ntid.{x, y, z}` and `nctaid.{x, y, z}` register accesses get compiled into SASS...