Tolmachev Dmitrii
Wow, thank you so much for this report! I will investigate the failures today. The segmentation faults should be reproducible and fixable. Regarding the timeouts, I am not sure what...
So the issue with 2901 and 3191 was a weird one: a floating-point value was rounded up (2901/(double)10 * 100), which resulted in 29011 and that 1...
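To illustrate the rounding problem described above, here is a minimal sketch in C. It assumes IEEE-754 doubles and assumes the round-up was a ceiling-style operation; the function names `scaled_via_double` and `scaled_via_integer` are hypothetical and are not taken from the library. The integer variant is only one possible way to avoid the stray 1, not necessarily the fix that was applied.

```c
/* Hypothetical helper: reproduces the quoted expression n/(double)10 * 100
   and then rounds the result up. The ceiling is hand-rolled so the sketch
   does not depend on libm. Assumes IEEE-754 double arithmetic. */
static long scaled_via_double(long n) {
    volatile double q = n / (double)10; /* e.g. 290.1 has no exact binary form */
    volatile double s = q * 100;        /* may land a hair above the exact product */
    long t = (long)s;                   /* truncate toward zero */
    return (s > (double)t) ? t + 1 : t; /* round up if any fraction remains */
}

/* Hypothetical alternative: the same ceiling computed in integer arithmetic,
   so no representation error can leak in. Valid for non-negative n. */
static long scaled_via_integer(long n) {
    return (n * 100 + 9) / 10;
}
```

With n = 2901, the double-based version yields 29011 because the intermediate value sits slightly above 29010, while the integer version yields 29010.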
I now have access to GH200 again and ran pyvkfft there with pycuda (I have some trouble installing cupy there) ``` pyvkfft-test --systematic --backend pycuda --gpu gh200 --max-nb-tests 0 --serial...
Just tried 3070 with CUDA 11.4, no issues. The GH200 was 12.4
Hello, all these tests passed on GH200 with CUDA 12.4 (I don't have access to the previous CUDA version on this machine). I will need to check the generated kernels,...
I tried 3070 (which should be similar to A40 but with fewer SMs) with CUDA 11.4, 12.3 and 12.6 with drivers 535, 545 and 560 and it passed DST2 tests...
Hello @vincefn, sorry for the extremely late reply, I am now in the process of finishing my doctorate, so all my time is spent on that and not on polishing the...
Hello, as far as I know, no one has reported using the manual tempBuffer submission before you (which is why none of the vendor libraries even try doing it), but...
Ah, ok, guess I somehow missed that. Best regards, Dmitrii
Hello, I will add this change, thank you. This also made me think that I can replace all sprintf calls with a macro that selects sprintf/snprintf based on user choice....
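A macro of the kind mentioned above could look roughly like the following sketch. The names `APP_USE_SNPRINTF`, `APP_SPRINTF`, and `format_kernel_name` are hypothetical and chosen only for illustration; they are not the library's actual identifiers. The sketch assumes a C99 compiler (variadic macros) and that every call site can pass the buffer size.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical switch: define APP_USE_SNPRINTF at build time to get the
   bounds-checked variant; otherwise plain sprintf is used and the size
   argument is evaluated but ignored. */
#ifdef APP_USE_SNPRINTF
#define APP_SPRINTF(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)
#else
#define APP_SPRINTF(buf, size, ...) ((void)(size), sprintf((buf), __VA_ARGS__))
#endif

/* Hypothetical call site: writes "kernel_<n>" into out and returns the
   number of characters written, excluding the terminating NUL. */
static int format_kernel_name(char *out, size_t out_size, int n) {
    return APP_SPRINTF(out, out_size, "kernel_%d", n);
}
```

Both branches return an `int`, so call sites do not need to change when the switch is flipped. Note that `snprintf` returns the length the output would have had, so a caller that wants to detect truncation would compare the return value against the buffer size.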