ptrblck
I like the general idea of a tutorial that explains some advanced use cases with custom collate functions or samplers. I'm not sure if it would fit into a...
`torch.randint` expects arguments as `low, high, size, out, ...`. Since I wanted to sample random integers in `[0, nb_classes-1]` in the shape `[1, 96, 96]`, I had to pass it...
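A minimal sketch of the call described above. The value of `nb_classes` is an assumption chosen for illustration. The upper bound of `torch.randint` is exclusive, so passing `nb_classes` as `high` yields integers in `[0, nb_classes-1]`:

```python
import torch

nb_classes = 10  # hypothetical number of classes, not taken from the original post

# Signature: torch.randint(low, high, size, ...). `high` is exclusive,
# so the sampled values lie in [0, nb_classes - 1].
target = torch.randint(0, nb_classes, (1, 96, 96))

print(target.shape)  # shape of the sampled tensor
print(target.dtype)  # integer dtype by default
```

Passing the shape as the first positional argument would not work, since the first positions are taken by `low` and `high`.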
The error is raised in your `YOLO` model in:
```python
x = self.detect(x)
...
return self.act(self.bn(self.conv(x)))
```
which gets an input activation with `128` channels while `256` are expected. Could...
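This kind of channel mismatch can be reproduced in isolation. The sketch below uses a standalone, hypothetical `nn.Conv2d` layer rather than the poster's actual `YOLO` model, and the spatial size is chosen arbitrarily:

```python
import torch
import torch.nn as nn

# Hypothetical layer expecting 256 input channels (not the poster's model).
conv = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1)

# Activation with only 128 channels, as in the reported error.
x = torch.randn(1, 128, 20, 20)

try:
    conv(x)
    error_message = None
except RuntimeError as e:
    # PyTorch reports the expected vs. received channel count here.
    error_message = str(e)
    print(error_message)

# Printing the activation shape before the failing layer helps to find
# where the channel count diverges from what the layer expects.
print(x.shape)

# A layer created with a matching in_channels accepts the same input.
conv_fixed = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1)
out = conv_fixed(x)
print(out.shape)
```

Whether the layer definition or the preceding activation should change depends on the intended architecture, which the snippet cannot determine.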
@atalman The failing tests are: #### conda-related * https://github.com/pytorch/pytorch/actions/runs/4009416336/jobs/6887436019 ``` CondaError: Downloaded bytes did not match Content-Length url: https://conda.anaconda.org/nvidia/win-64/libcublas-dev-11.9.2.110-0.tar.bz2 ## Package Plan ## target_path: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\pkgs\libcublas-dev-11.9.2.110-0.tar.bz2 Content-Length: 311278235 downloaded bytes: 172816824...
I cannot reproduce the issue using the current nightly binaries: ``` Versions of relevant libraries: [pip3] numpy==1.22.2 [pip3] pytorch-quantization==2.1.2 [pip3] pytorch-triton==2.0.0+0d7e753227 [pip3] torch==2.0.0.dev20230124+cu116 [pip3] torch-tensorrt==1.3.0a0 [pip3] torchvision==0.15.0.dev20230124+cu116 [conda] Could not...
@sachinkadyan7 Additionally, are you observing the failure on the same rank/GPU? I don't see any device information in the stacktrace, but maybe the slurm log could give more information about...
Assuming you can reproduce the issue, you could try to update to the latest nightly binary (tomorrow) or build from source since [this fix](https://github.com/pytorch/pytorch/pull/92227) just landed. I'm unsure if it...
The stacktrace is correct in the sense that it raises the illegal memory access, but it cannot show you which kernel exactly runs into it and also does not contain...
This is a known issue and also reported [here](https://discuss.pytorch.org/t/could-not-load-library-libcudnn-cnn-train-so-8-in-new-version/190818). @malfet narrowed it down already with us to a `dlopen` call inside `libcudnn` preferring the system-wide libs over the ones coming...
The issue is also observed [here](https://discuss.pytorch.org/t/help-for-http-error-403-rate-limit-exceeded/125907) and I'm also able to reproduce it using Ubuntu 20.04.