
[BUG] CUDA out of memory on Linux during inference

Open shirounanashi opened this issue 1 year ago • 11 comments

Before You Report a Bug: My setup is a GTX 1660 Super and a Ryzen 5600G with 16 GB of RAM.

Bug Description: I use Applio on both Windows 11 and Linux (Arch), but on Linux it gives a CUDA out of memory error, both on the latest release and on the latest commit.

Steps to Reproduce:

  1. Run inference with the default settings.

Expected Behavior: Inference completes without a CUDA out of memory error.
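
For context, this is roughly how the error shows up; the sketch below wraps an arbitrary inference callable and dumps PyTorch's allocator stats on OOM (it is not Applio's actual code, just an illustration):

```python
import torch

def report_gpu_memory(tag: str) -> None:
    # mem_get_info returns (free, total) bytes for the current CUDA device.
    free, total = torch.cuda.mem_get_info()
    print(f"[{tag}] free {free / 2**30:.2f} GiB of {total / 2**30:.2f} GiB")

def run_with_oom_report(infer_fn):
    # Wrap any inference callable so an OOM dumps the caching allocator's
    # statistics instead of only the one-line CUDA error.
    report_gpu_memory("before inference")
    try:
        result = infer_fn()
    except torch.cuda.OutOfMemoryError:
        print(torch.cuda.memory_summary())
        raise
    report_gpu_memory("after inference")
    return result
```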

Assets: image (attached)

Desktop Details:

  • Operating System: Linux (Arch Linux, Gnome)
  • Browser: Microsoft Edge

Additional Context: I'm not using IAHispano's fairseq.

shirounanashi · Apr 26 '24 16:04

It could be caused by a lot of things, but I think it's one of these (a quick check is sketched below the list):

  1. Your GPU driver is outdated: https://docs.nvidia.com/deeplearning/cudnn/latest/reference/support-matrix.html
  2. Your GPU only has 6 GB of VRAM, and it's a GTX card; some people have had issues with those.
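
A quick way to check what PyTorch actually sees on your machine, so you can compare it against that support matrix (just a sketch, nothing Applio-specific):

```python
import torch

# Versions that matter for the cuDNN support matrix linked above.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CUDA (bundled with torch):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name, f"{props.total_memory / 2**30:.1f} GiB VRAM")
```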

aitronz · Apr 26 '24 17:04

Thank you, it really was a driver problem, but not because it was outdated: CUDA and cuDNN weren't installed at all. Installing them solved the problem.
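
In case it helps anyone else, this is the kind of smoke test I mean to confirm that a freshly installed CUDA/cuDNN stack is actually usable from PyTorch (any tiny convolution will do; this is only a sketch):

```python
import torch
import torch.nn as nn

assert torch.cuda.is_available(), "torch does not see a CUDA device"
assert torch.backends.cudnn.is_available(), "cuDNN is not available to torch"

# A tiny convolution forces a real cuDNN kernel launch.
conv = nn.Conv2d(3, 8, kernel_size=3).cuda()
x = torch.randn(1, 3, 64, 64, device="cuda")
y = conv(x)
torch.cuda.synchronize()
print("cuDNN convolution ran fine, output shape:", tuple(y.shape))
```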

shirounanashi · Apr 26 '24 17:04

Testing further, I found that this is a problem with Applio on Linux that does not happen in RVC WebUI. In other words, it has nothing to do with the driver, as I assumed when I closed the issue.

shirounanashi · May 13 '24 13:05

Applio uses the same code for GPU detection and utilization as RVC WebUI; we only changed the Torch version. That's why I'm sharing this link https://docs.nvidia.com/deeplearning/cudnn/latest/reference/support-matrix.html so you can verify compatibility with our current setup.
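
Something like this prints the driver version next to the CUDA/cuDNN versions torch was built against, which is what that support matrix is indexed by (a sketch, assuming nvidia-smi is on the PATH):

```python
import subprocess
import torch

# The driver version reported by nvidia-smi is what the support matrix is
# indexed by; torch.version.cuda is the toolkit the wheel was built against.
driver = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print("NVIDIA driver:", driver)
print("torch built for CUDA:", torch.version.cuda)
print("cuDNN seen by torch:", torch.backends.cudnn.version())
```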

aitronz · May 15 '24 18:05

@aitronssesin In theory, my GPU should handle this smoothly. But even with the latest drivers and cuDNN on a clean Arch installation, I still get this CUDA out of memory error, which doesn't happen with the RVC WebUI. Honestly, I don't know why this happens, since it also doesn't happen on Windows on the same PC.

shirounanashi · May 18 '24 05:05

It could be an issue with Arch, because on my Ubuntu server it works without any issues.

aitronz · May 18 '24 10:05

It may be, but then it doesn't make sense that the RVC WebUI works without problems.

shirounanashi · May 18 '24 14:05

Yes, maybe because of the Torch version.

aitronz · May 18 '24 14:05

I tried updating torch, torchaudio, and torchvision, but the CUDA out of memory problem still occurred.

shirounanashi · May 18 '24 15:05

Sorry, I didn't explain myself well. I was saying that the newer Torch version we are using is probably broken on Arch, but try this:

pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121

aitronz · May 18 '24 15:05

I didn't explain myself well either: I had already updated to the versions I was using with the RVC WebUI. But I tested the versions you sent and the problem still exists. I also noticed that Applio doesn't seem to release the VRAM until I close it in the terminal.
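
Regarding the VRAM not being released: PyTorch's caching allocator holds on to reserved memory until the process exits unless it is asked to release it, so I imagine something like this between inferences would show whether that's what is happening (a sketch only; it also assumes references to the model and outputs have been dropped first):

```python
import gc
import torch

def release_vram() -> None:
    # Collect unreferenced Python objects first so their tensors can be freed,
    # then ask the caching allocator to return cached blocks to the driver.
    gc.collect()
    torch.cuda.empty_cache()
    allocated = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"still allocated: {allocated:.2f} GiB, still reserved: {reserved:.2f} GiB")
```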

shirounanashi · May 18 '24 15:05