Giannis Gonidelis

Results 38 comments of Giannis Gonidelis

@bernhardmgruber
> I assume we want have NVTX ranges

You mean "we want to have", right? And not "we won't have".

@bernhardmgruber perf, the fixing PR is doing just that by disabling only `thrust::seq` and its derivatives.

Is this PR intended to check all the boxes? If not (hopefully), you can just leave the algorithm(s) that it is intended to work on and we can add a link...

@jrhemstad Not sure if this is becoming obsolete now that we have settled on an environment API.

@bernhardmgruber @miscco now that I think about it: is a test expected for these facilities? And would it maybe be better off under libcu++? It could be in `libcudacxx/include/cuda/__nvtx/nvtx.h` 👀

> Should this PR be also removing the NVTX headers from CUB?

Correct. CUB tests will fail if I do, until all CUB algorithms convert to using the libcu++ header....

For the record, the code above produces 16 bits/4 zeros (not 12 bits, 3 zeros) at the end because of the 48-bit engine (`ranlux48`). Even if the bug wasn't there, the 4th...
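
A minimal sketch of the bit arithmetic behind the 16 bits/4 zeros figure, assuming each engine draw is stored in a 64-bit word. The helper names `engine_bits` and `unused_bits_in_u64` are hypothetical and are not taken from the code under discussion. The sketch only checks the width of `std::ranlux48`; where the zero bits end up in the final value depends on the original code, which is not reproduced here.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: number of bits needed to represent the largest
// value the engine can return (Engine::max()).
template <class Engine>
int engine_bits() {
    int bits = 0;
    for (std::uint64_t m = Engine::max(); m != 0; m >>= 1) {
        ++bits;
    }
    return bits;
}

// Hypothetical helper: bits of a 64-bit word that a single draw from the
// engine can never set (assumes the draw is stored in a 64-bit integer).
template <class Engine>
int unused_bits_in_u64() {
    return 64 - engine_bits<Engine>();
}
```

With `std::ranlux48`, `engine_bits` gives 48, so `unused_bits_in_u64` gives 16, which corresponds to 4 hexadecimal digits (4 bits each).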