MoFtZ
@DorloBorlo Looks like there is a defect with ILGPU v1.5.1 that does not allow disabling the debug symbols. ILGPU is automatically setting the mode to `DebugSymbols.Basic` when a debugger is...
Closing this ticket. This is a first-chance exception that is caught internally by ILGPU. It is currently unavoidable in v1.5.1, and v2.x has a fix that will prevent ILGPU from...
API Proposal

1. Add `CreateEvent` method to `Accelerator` class, returning new `AcceleratorEvent` class.
2. Add `RecordEvent` and `WaitForEvent` methods to `AcceleratorStream` class.

@m4rs-mt what do you think?
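To make the intended semantics of this proposal concrete, here is a toy Python model. It is not ILGPU code. The class names mirror the proposed `AcceleratorEvent` and `AcceleratorStream`, while `launch`, `synchronize`, the snake_case method names, and the thread-backed queue are assumptions made purely for illustration. Re-recording an event is deliberately left out.

```python
import queue
import threading
import time


class AcceleratorEvent:
    """Hypothetical stand-in for the proposed AcceleratorEvent."""

    def __init__(self):
        self._flag = threading.Event()


class AcceleratorStream:
    """Toy in-order stream: tasks run one at a time on a worker thread."""

    def __init__(self):
        self._work = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            task = self._work.get()
            task()

    def launch(self, fn):
        # Assumed helper: enqueue a piece of work on this stream.
        self._work.put(fn)

    def record_event(self, event):
        # The event fires once all work enqueued before this call has run.
        self._work.put(event._flag.set)

    def wait_for_event(self, event):
        # Work enqueued after this call does not start until the event fires.
        self._work.put(event._flag.wait)

    def synchronize(self):
        # Assumed helper: block the caller until the stream has drained.
        done = threading.Event()
        self._work.put(done.set)
        done.wait()


def run_demo():
    log = []
    event = AcceleratorEvent()  # stands in for Accelerator.CreateEvent()
    stream_a = AcceleratorStream()
    stream_b = AcceleratorStream()

    def task_a():
        time.sleep(0.05)  # make A slow so C would win the race without the wait
        log.append("A")

    stream_a.launch(task_a)
    stream_a.record_event(event)
    stream_b.wait_for_event(event)
    stream_b.launch(lambda: log.append("C"))

    stream_a.synchronize()
    stream_b.synchronize()
    return log
```

In this model, task C on the second stream cannot start before task A on the first stream has finished, because the wait sits ahead of C in the second stream's queue. How the native CUDA and OpenCL handles would be created and released is outside what this sketch covers.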
Alternate API Proposal

1. Add `AddEvent` method to `AcceleratorStream` class, returning new `AcceleratorEvent` class.
2. Add `WaitForEvent` method to `AcceleratorStream` class.

This removes the ability to re-record the event.
Alternate API Proposal 2

Instead of exposing a new object, introduce a `WaitForStream` method between two streams. Not sure how this will affect the lifetime of the native handles.
Thanks for the feedback @Ruberik. The first alternate proposal was to hide the concept of re-recording an event. Cuda supports it, but I'm not sure about OpenCL. The second alternate...
Yes, you are correct. Perhaps that proposal is too simplified. It would need to change the meaning from "wait for all Task A to finish before starting C" to "synchronise...
Welcome @Juff-Ma. I'm not sure anyone has previously tried ILGPU on WSL2. However, I'm surprised that the Nvidia sample projects work and ILGPU is not able to find the GPU....
It looks like Cuda provides a [few alternate floating point options](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#alternate-floating-point-data-formats), including `bf16` and `tf32`. This would have to be a Cuda-only feature, as there is no equivalent in...
Based on our last discussions, this is more broadly related to adding support for the Cuda WMMA ([Warp Level Matrix Multiply-Accumulate Instructions](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions)); adding support for the `fp8` and `bfloat16` types...