eqy
cu11 and cu12 wheels for cuDNN ~~9.0.0.312~~ 9.1.0.70 have been uploaded, so trying this out... CC @Skylion007 @malfet cc @csarofeen @ptrblck @xwang233
The flag basically does nothing following #95722. Let's see if the quantization tests break. CC @malfet @atalman cc @csarofeen @ptrblck @xwang233 @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel
Doesn't affect current behavior by default; this is for #126544. I'm not sure what the exact mechanism is here, but CUDA errors appear to already be thrown in the main process, meaning...
Somehow the original PR was missing the `CUDA_KERNEL_LOOP_TYPE` change??? Thanks @johnc-keen @Chillee for the great repro! (#129785) cc @ptrblck @msaroufim @mikaylagawarecki
Newer versions of cuDNN can dispatch to a Winograd kernel here on A100, which affects numerics a bit. cc @csarofeen @ptrblck @xwang233 @zou3519 @Chillee @samdow @kshitij12345 @janeyx99
Seems to have been removed following #99699?
Same `char` dtype issue causing device index `0` to be interpreted as a null terminator; see also #123984. cc @ptrblck @msaroufim
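A minimal illustration of this bug class (not the actual PyTorch code; the identifiers below are made up): appending a `char`-typed device index directly to a `std::string` inserts the raw byte, so index `0` becomes an embedded `'\0'` rather than the digit `0`, and anything that later treats the buffer as a C string stops there.

```cpp
// Illustrative only: a char-typed device index of 0 turns into an embedded
// null byte instead of the digit "0".
#include <iostream>
#include <string>

int main() {
  char device_index = 0;  // device 0 stored in a char-typed index

  std::string buggy = "cuda:";
  buggy += device_index;  // appends the byte 0x00, not the character '0'

  std::string fixed = "cuda:" + std::to_string(device_index);  // "cuda:0"

  std::cout << "buggy size: " << buggy.size()                   // 6, last byte is '\0'
            << ", buggy as C-string: \"" << buggy.c_str() << "\""  // prints just "cuda:"
            << ", fixed: \"" << fixed << "\"\n";
  return 0;
}
```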
Fix for the PyTorch build. CC @ptrblck @nWEIdia
In the spirit of warming up for JIT compilation, add a warmup iteration in case the very last batch has a different size that may unwittingly trigger recompilation.
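A rough sketch of the reasoning (the `run_step` below is a hypothetical stand-in, not a PyTorch API): if a step is compiled per input shape, the smaller final batch triggers a recompile unless its size is also covered during warmup.

```cpp
// Hypothetical stand-in for a step that "compiles" the first time it sees a
// given batch size, mimicking shape-specialized JIT compilation.
#include <chrono>
#include <iostream>
#include <set>
#include <thread>
#include <vector>

void run_step(int batch_size) {
  static std::set<int> compiled_sizes;
  if (compiled_sizes.insert(batch_size).second) {
    // Simulate one-time compilation cost for this shape.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
  }
}

int main() {
  std::vector<int> batch_sizes(10, 32);
  batch_sizes.push_back(7);  // the last batch is smaller

  // Warmup: hit every distinct batch size once, including the odd-sized final
  // batch, so no compilation happens inside the measured loop.
  for (int bs : std::set<int>(batch_sizes.begin(), batch_sizes.end())) {
    run_step(bs);
  }

  auto t0 = std::chrono::steady_clock::now();
  for (int bs : batch_sizes) {
    run_step(bs);  // timed region: already warm for every size
  }
  auto t1 = std::chrono::steady_clock::now();
  std::cout << "timed loop: "
            << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count()
            << " us\n";
  return 0;
}
```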
Calling `getenv` on side threads is dangerous, as it can potentially segfault if the main thread is in the middle of setting environment variables (https://github.com/pytorch/pytorch/issues/134596). This PR only calls `getenv`...
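A minimal sketch of the safer pattern, assuming the fix amounts to confining `getenv` to startup (the variable name and helpers here are hypothetical, not the actual PyTorch change): read the environment once on the main thread before any worker threads exist, and have the workers consume the cached value.

```cpp
// Sketch only: cache the environment variable on the main thread before any
// worker threads start, so side threads never race getenv() against setenv().
#include <cstdlib>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Filled in once, on the main thread, before threads are spawned.
static std::string g_cached_flag;

void init_env_cache() {
  const char* v = std::getenv("MY_HYPOTHETICAL_FLAG");  // main thread only
  g_cached_flag = v ? v : "";
}

void worker(int id) {
  // Safe: reads the cached copy instead of calling getenv() on a side thread.
  std::cout << "worker " << id << " sees flag=\"" << g_cached_flag << "\"\n";
}

int main() {
  init_env_cache();  // must run before any worker thread exists

  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i) {
    workers.emplace_back(worker, i);
  }
  for (auto& t : workers) {
    t.join();
  }
  return 0;
}
```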