DeepFaceLab
RTX 4090 on Deepfacelab
What is the best configuration for 4090 in DeepFaceLab settings? Currently, I can only use up to Resolution: 192 and Batch size: 8, and when I use some pre-trained models (Resolution 380, 320), I encounter errors. My computer only has 32GB of RAM. Could this be due to insufficient RAM, and should I upgrade to 64GB of RAM?
Thank You
I ran it on DeepFaceLab_NVIDIA_RTX3000_series using default settings and it worked (I also have a 4090 + 32 GB RAM).
True for Quick96 training, false for SAEHD training. Actually, I can only run Quick96; none of the other training models work on my PC. GPU: ASUS ROG Strix RTX 4090. CPU: Intel 13900KF.
A 4090 is more than capable of handling 512 res at 8+ batch size. Increase the pagefile on the drive DFL is installed on to 64-128 GB. The higher the better.
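Before raising the pagefile, it may help to confirm the drive has room for it. The following is only a minimal sketch using Python's standard library; the function name `pagefile_headroom_gib` and the use of `"."` as a stand-in for the DFL drive are assumptions for illustration, and it ignores space already taken by an existing pagefile.

```python
import shutil

GIB = 1024 ** 3


def pagefile_headroom_gib(drive_path, pagefile_gib):
    """Return the free space (GiB) left on the drive after reserving
    a pagefile of the given size. Negative means it does not fit."""
    free_gib = shutil.disk_usage(drive_path).free / GIB
    return free_gib - pagefile_gib


if __name__ == "__main__":
    # "." stands in for the drive DeepFaceLab is installed on (assumption).
    for size in (64, 128):
        left = pagefile_headroom_gib(".", size)
        status = "fits" if left >= 0 else "does not fit"
        print(f"{size} GiB pagefile {status} ({left:.1f} GiB left over)")
```

A negative result suggests freeing space or choosing a smaller pagefile first.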
Set the pagefile, and set it big. Roughly:
96 (Quick96): low
98+ 12+ faceswap
256 SAEHD: 60-120 GB+ (224, batch size 98; DX12, size 105)
AMP: 200-300 GB+ (512, batch size 12)
cudatoolkit-11.2.2, tensorflow-gpu>=2.7.0,<2.11.0 (I set 2.10.1, installed with pip), cuDNN 8.1.
Why not CUDA 12 with cuDNN 8.9? Because this is DeepFaceLab_NVIDIA_RTX3000. But if, for example, you have a 4090, you can update cuDNN to 8.9.0.131 by copying the files into CUDA 11.2. (Updating the copy DeepFaceLab uses is the point. On RTX 40 series you can now also update to CUDA 12 and delete the CUDA and cuDNN files that DFL keeps in DeepFaceLab_internal, so the new CUDA 12 handles it. Don't forget to back up first.) You can have both CUDA 12 for games and CUDA 11 for software.

Version | Python | Compiler | Build tools | cuDNN | CUDA
tensorflow-2.6.0 | 3.6-3.9 | GCC 7.3.1 | Bazel 3.7.2 | 8.1 | 11.2
tensorflow-2.5.0 | 3.6-3.9 | GCC 7.3.1 | Bazel 3.7.2 | 8.1 | 11.2
tensorflow-2.4.0 | 3.6-3.8 | GCC 7.3.1 | Bazel 3.1.0 | 8.0 | 11.0
tensorflow-2.3.0 | 3.5-3.8 | GCC 7.3.1 | Bazel 3.1.0 | 7.6 | 10.1
tensorflow-2.2.0 | 3.5-3.8 | GCC 7.3.1 | Bazel 2.0.0 | 7.6 | 10.1
tensorflow-2.1.0 | 2.7, 3.5-3.7 | GCC 7.3.1 | Bazel 0.27.1 | 7.6 | 10.1
tensorflow-2.0.0 | 2.7, 3.3-3.7 | GCC 7.3.1 | Bazel 0.26.1 | 7.4 | 10.0
tensorflow_gpu-1.15.0 | 2.7, 3.3-3.7 | GCC 7.3.1 | Bazel 0.26.1 | 7.4 | 10.0
tensorflow_gpu-1.14.0 | 2.7, 3.3-3.7 | GCC 4.8 | Bazel 0.24.1 | 7.4 | 10.0
tensorflow_gpu-1.13.1 | 2.7, 3.3-3.7 | GCC 4.8 | Bazel 0.19.2 | 7.4 | 10.0
tensorflow_gpu-1.12.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.11.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.10.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.9.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.11.0 | 7 | 9
tensorflow_gpu-1.8.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.10.0 | 7 | 9
tensorflow_gpu-1.7.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.9.0 | 7 | 9
tensorflow_gpu-1.6.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.9.0 | 7 | 9
tensorflow_gpu-1.5.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.8.0 | 7 | 9
tensorflow_gpu-1.4.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.5.4 | 6 | 8
tensorflow_gpu-1.3.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.5 | 6 | 8
tensorflow_gpu-1.2.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.5 | 5.1 | 8
tensorflow_gpu-1.1.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.2 | 5.1 | 8
tensorflow_gpu-1.0.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.2 | 5.1 | 8
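To make the version list above easier to check against an install, here is a small sketch that turns a few of its rows into a lookup. The names `TESTED_BUILDS` and `required_gpu_libs` are made up for illustration, and only the rows copied from this comment are included.

```python
# Rows copied from the version list above: TensorFlow version -> (cuDNN, CUDA).
TESTED_BUILDS = {
    "2.6.0": ("8.1", "11.2"),
    "2.5.0": ("8.1", "11.2"),
    "2.4.0": ("8.0", "11.0"),
    "2.3.0": ("7.6", "10.1"),
}


def required_gpu_libs(tf_version):
    """Return (cudnn, cuda) for a listed TensorFlow version, or None if unlisted."""
    return TESTED_BUILDS.get(tf_version)


print(required_gpu_libs("2.6.0"))  # ('8.1', '11.2')
```

An unlisted version returns `None`, which is a cue to check the TensorFlow install page rather than guess.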
https://www.tensorflow.org/install/source_windows#gpu
https://developer.nvidia.com/cuda-toolkit-archive
https://developer.nvidia.com/rdp/cudnn-download
https://developer.nvidia.com/cuda-gpus
The point: just update CUDA and cuDNN; do not select the game driver.
@i9900k @PronPan Big thanks for your responses and help. Indeed, after increasing the page file size (to between 65 GB and 120 GB) on the same drive DeepFaceLab is installed on, I can finally run the SAEHD training model. I also upgraded cuDNN to 8.9, but it requires a new DLL file named zlibwapi.dll, which I copied from the NVIDIA site. For CUDA, I use the default 11.2 version in DeepFaceLab. For TensorFlow, I use the default version packaged with DeepFaceLab as well. I think the most critical factor is extending the page file size to 120 GB.
Did you ever find the answer? If so, would you mind sharing it and closing this issue?
I'd like to, but I am not the person who raised this issue. Am I allowed to close it? If so, how do I close it, and what are the steps? Thanks.
Thanks for the comments, everyone! Just wanted to let you know that I resolved my issue by increasing the virtual memory by an additional 300GB.
I'd like to add that enabling the page file/virtual memory worked for me too, although according to the Windows Task Manager, RAM, CPU, and GPU usage were all far from their max while running. It only gave 'out of memory' errors on the 6) train SAEHD.bat process, not on the 5.XSeg) train.bat one (I haven't tried the other trainers). I'm on Windows 11 with a GeForce RTX 3060.
Hi! Dear @FghostHKG, just a follow-up question from someone struggling to get TensorFlow to use my GPU. I have the same graphics card: RTX 4090. Can you please tell me which CUDA, cuDNN, and TensorFlow versions you are using to make it work? Thank you!
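For anyone comparing setups, one way to see what an installed TensorFlow reports is the sketch below. It assumes TensorFlow 2.3 or later (where `tf.sysconfig.get_build_info()` is available) and returns `None` if TensorFlow cannot be imported; the function name `tf_gpu_report` is made up for illustration.

```python
def tf_gpu_report():
    """Return TensorFlow's version, the CUDA/cuDNN it was built for,
    and the GPUs it can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    build = tf.sysconfig.get_build_info()  # assumes TensorFlow >= 2.3
    return {
        "tensorflow": tf.__version__,
        "built_for_cuda": build.get("cuda_version"),
        "built_for_cudnn": build.get("cudnn_version"),
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


print(tf_gpu_report())
```

An empty `visible_gpus` list usually means the installed CUDA or cuDNN files do not match what this TensorFlow build expects.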