Krzysztof Lecki
> Seeming we are in an opposite time zone :) Yes, it would appear so :) > I saw your 2021 roadmap and your words, but not sure what that...
In the current state I think the AutoAugment example can look like this: ``` policy = fn.random.uniform(values=[0, 1, 2, 3, 4]) # 5 Policies if policy == 0: images =...
You are right, I thought `np.random.choice` returns elements without repetition. Thanks for the correction and the additional link! > Thinking hard, if we have one of the two (if/else or...
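A minimal sketch of the `np.random.choice` behaviour mentioned above: by default it samples with replacement, so elements can repeat. The variable names are illustrative only and do not come from the thread.

```python
import numpy as np

# Hypothetical population of 5 policy indices, used only for illustration.
population = np.arange(5)

# Default is replace=True: drawing 10 values from 5 candidates must repeat some.
with_repetition = np.random.choice(population, size=10)

# replace=False draws without repetition, so every element appears at most once.
without_repetition = np.random.choice(population, size=5, replace=False)

print("with replacement:   ", with_repetition)
print("without replacement:", without_repetition)
```

With `replace=False`, requesting more samples than the population holds raises a `ValueError`.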
I wonder about the feasibility of making the main implementation a device function taking the pointer to SampleDesc and adding a thin layer of two `__global__` kernels where one just gets the...
That's great to hear. If the problem reappears, we can follow up on that. Especially if you stumble upon a repro, it would be easier to ask around for clarification...
> Progress update: > > My dream solution would be to have some magic general wrapper for kernels like `copy_and_launch` that would copy the data from host to the device...
Thank you @szkarpinski for the great repro. I will follow up on that - I can indeed see the spill you are mentioning in the PTX and SASS. Maybe we...
I'm sorry for the delay in response, but I was waiting for the release of CUDA 11.7. There is a new annotation for kernel parameters: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#grid-constant and I got a suggestion...
@szkarpinski I tried the `__grid_constant__` and at least in the small repro it makes the perf of both kernels match and the produced code is quite similar. Let me know...
!build