MoFtZ

Results 88 comments of MoFtZ

hi @pjsgsy Welcome. And sorry to hear that you are having some issues. Are you able to provide your code, so that we can help you resolve the compiler issues?...

hi @afmg-aherzog The changes in v1.5.2 can be seen here: https://github.com/m4rs-mt/ILGPU/compare/v1.5.1...v1.5.2 They are all relatively minor updates. The largest change was probably the functionality to [use the more performant LibDevice...

> We make heavy use of MathF.* related functions. But I don't see exactly how that could _suddenly_ cause resource/memory issues. > Cuda implements their math functions using the LibDevice...

> Does this mean our user would actually need to install NVVM and LibDevice? NVVM and LibDevice are part of the Cuda SDK, and [may be distributed with your application](https://docs.nvidia.com/cuda/eula/index.html#attachment-a)....

The kernel parameters are marshaled as a single buffer. From the v1.5.1 PTX, `.param .align 8 .b8 _p_49893[136]` says that a parameter of 136 bytes should be aligned to 8...

The difference in buffer size from 136 bytes to 132 bytes is due to padding in the parameter structure. The structure is defined as: ```CSharp public readonly record struct KernelParameters_NotWorking1(...
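
A minimal sketch of the padding arithmetic, assuming the 4-byte difference comes from rounding the parameter structure up to its 8-byte alignment. The actual fields of the structure are truncated above, so only the two quoted sizes and the `.align 8` directive are taken from the comments; `ParameterPadding` and `AlignUp` are hypothetical names used for illustration and are not part of ILGPU.

```CSharp
using System;

static class ParameterPadding
{
    // Rounds size up to the next multiple of alignment (alignment must be positive).
    public static int AlignUp(int size, int alignment) =>
        (size + alignment - 1) / alignment * alignment;

    static void Main()
    {
        const int rawSize = 132;  // unpadded size quoted in the comment above
        const int alignment = 8;  // from `.param .align 8` in the v1.5.1 PTX

        int padded = AlignUp(rawSize, alignment);

        // Prints: 132 bytes -> 136 bytes (4 bytes of padding)
        Console.WriteLine($"{rawSize} bytes -> {padded} bytes ({padded - rawSize} bytes of padding)");
    }
}
```

If this assumption holds, the two buffer sizes describe the same fields, once with and once without the trailing padding.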

This should [already be fixed](https://github.com/caronc/ha-ultrasync/pull/40), but a new version needs to be published.

hi @mlemanczyk. Your project is very large. Are you able to provide a simple example that reproduces the issue? Alternatively, try running on the CPU accelerator, to see if it...

@m4rs-mt I have managed to reduce the problem, but still do not understand the issue. Reducing the local memory allocation to fewer than 255 elements prevents the issue from occurring....