Error in inference process (no valid convolution algorithms available in CuDNN)
An error appears during the inference process, as follows:
```
-- Process 7 terminated with the following error:
Traceback (most recent call last):
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/mmu-ocr/weijiawu/MovieDiffusion/i2vgen-xl/tools/inferences/inference_i2vgen_entrance.py", line 171, in worker
    y_visual, y_text, y_words = clip_encoder(image=image_tensor, text=captions)
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mmu-ocr/weijiawu/MovieDiffusion/i2vgen-xl/tools/modules/clip_embedder.py", line 185, in forward
    xi = self.model.encode_image(image.to(self.device)) if image is not None else None
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/open_clip/model.py", line 547, in encode_image
    return self.visual(image)
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/open_clip/model.py", line 394, in forward
    x = self.conv1(x)  # shape = [*, width, grid, grid]
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/root/miniconda3/envs/vgen/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: no valid convolution algorithms available in CuDNN
```
Does anyone know how to fix this? Thanks!
A previous report of this error turned out to be caused by insufficient GPU memory. Which GPU are you using?
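One quick way to confirm that is to print the free GPU memory right before the failing CLIP call. A minimal sketch (assuming PyTorch with CUDA; this snippet is illustrative, not part of the repo):

```python
import torch

# Print free/total memory for every visible GPU. If "free" is near zero just
# before clip_encoder runs, the CuDNN error is most likely an OOM in disguise.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)  # returns (free, total) in bytes
    print(f"cuda:{i}: {free / 1e9:.2f} GB free of {total / 1e9:.2f} GB")
```

Watching `nvidia-smi` while the job runs works too.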
Eight 32 GB V100s. Is there any solution available? Thanks!
I have only verified it on A100; V100 has not been verified yet.
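Until then, here are a few generic memory-saving workarounds you could try on V100. This is an untested sketch: `clip_encoder`, `image_tensor`, and `captions` are the names from the traceback above, and the fp16 cast may slightly affect output quality:

```python
import torch

# Avoid cuDNN's algorithm benchmarking, which can itself fail when workspace
# memory is tight. Setting enabled = False instead falls back to PyTorch's
# native convolutions entirely (slower, but sidesteps this exact error).
torch.backends.cudnn.benchmark = False
# torch.backends.cudnn.enabled = False

torch.cuda.empty_cache()  # release cached blocks before the heavy call

with torch.no_grad():  # inference only, so drop autograd buffers
    clip_encoder = clip_encoder.half()  # fp16 roughly halves activation memory
    y_visual, y_text, y_words = clip_encoder(
        image=image_tensor.half(), text=captions
    )
```

Running fewer worker processes per card should also lower peak memory.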