DeepFaceLab

Using too much processor when it's supposed to just train on the 3090

Open alexpatcas opened this issue 1 year ago • 3 comments

THIS IS NOT TECH SUPPORT FOR NEWBIE FAKERS POST ONLY ISSUES RELATED TO BUGS OR CODE

Expected behavior

Describe, in some detail, what you are trying to do and what the output is that you expect from the program.

Actual behavior

Describe, in some detail, what the program does instead. Be sure to include any error message or screenshots.

Steps to reproduce

Describe, in some detail, the steps you tried that resulted in the behavior described above.

Running Windows 10 with a 5950X and an RTX 3090. Using 3000 series. My CPU is in all-core boost whenever I run SAEHD training.

================== Model Summary ===================
==                                                ==
==            Model name: new_SAEHD               ==
==                                                ==
==     Current iteration: 8529                    ==
==                                                ==
==---------------- Model Options -----------------==
==                                                ==
==            resolution: 256                     ==
==             face_type: head                    ==
==     models_opt_on_gpu: True                    ==
==                 archi: liae-udt                ==
==               ae_dims: 256                     ==
==                e_dims: 64                      ==
==                d_dims: 64                      ==
==           d_mask_dims: 32                      ==
==       masked_training: True                    ==
==       eyes_mouth_prio: True                    ==
==           uniform_yaw: False                   ==
==         blur_out_mask: False                   ==
==             adabelief: True                    ==
==            lr_dropout: n                       ==
==           random_warp: True                    ==
==      random_hsv_power: 0.0                     ==
==       true_face_power: 0.0                     ==
==      face_style_power: 0.0                     ==
==        bg_style_power: 0.0                     ==
==               ct_mode: none                    ==
==              clipgrad: False                   ==
==              pretrain: False                   ==
==       autobackup_hour: 0                       ==
== write_preview_history: False                   ==
==           target_iter: 0                       ==
==       random_src_flip: True                    ==
==       random_dst_flip: True                    ==
==            batch_size: 16                      ==
==             gan_power: 0.0                     ==
==        gan_patch_size: 32                      ==
==              gan_dims: 16                      ==
==                                                ==
==------------------ Running On ------------------==
==                                                ==
==          Device index: 0                       ==
==                  Name: NVIDIA GeForce RTX 3090 ==
==                  VRAM: 21.15GB                 ==
==                                                ==

Other relevant information

  • Command line used (if not specified in steps to reproduce): main.py ...
  • Operating system and version: Windows, macOS, Linux
  • Python version: 3.5, 3.6.4, ... (if you are not using prebuilt windows binary)

alexpatcas avatar May 05 '23 18:05 alexpatcas

Here is what I think is happening. When you use a large dataset or a high batch_size, the CPU has a lot to do up front, such as converting data and loading models to the GPU. A higher batch_size also makes training on the GPU faster, so the GPU finishes each work cycle and then waits for more data. That easily drives CPU usage up, and it seems to be exactly what is bothering you. Try reducing the number of facesets or decreasing batch_size. If this works, I'll be very happy. (Please excuse my broken English.)
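To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in Python. It is not DeepFaceLab code: the function name `workers_needed` and the timings (40 ms of CPU work per sample, 200 ms per GPU step) are hypothetical values chosen only for illustration. It estimates how many CPU worker processes must run in parallel so that a full batch is ready each time the GPU finishes a step.

```python
import math


def workers_needed(batch_size: int, cpu_ms_per_sample: float, gpu_ms_per_step: float) -> int:
    """Estimate the CPU workers needed to prepare one batch per GPU step.

    All inputs are assumptions supplied by the caller, not measured values.
    """
    if batch_size <= 0 or cpu_ms_per_sample < 0 or gpu_ms_per_step <= 0:
        raise ValueError("batch_size and gpu_ms_per_step must be positive")
    # Total CPU time spent preparing one batch on a single core.
    cpu_ms_per_batch = batch_size * cpu_ms_per_sample
    # Spread that work over enough cores to finish within one GPU step.
    return max(1, math.ceil(cpu_ms_per_batch / gpu_ms_per_step))


# Hypothetical timings: halving batch_size halves the CPU work per step.
print(workers_needed(16, 40.0, 200.0))  # 4
print(workers_needed(8, 40.0, 200.0))   # 2
```

Under these assumed numbers, a faster GPU step (smaller `gpu_ms_per_step`) or a larger batch both raise the number of busy cores, which matches the suggestion to lower batch_size.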

EricLeeaaaaa avatar May 09 '23 09:05 EricLeeaaaaa

Set the pagefile. If it is set low, set it big.

96 ... 98+ 12+ faceswap 256 saehd 60-120g+ 224 Batch size98 amp 200-300g+ 512 Batch size12

Use cudatoolkit-11.2.2 and tensorflow-gpu>=2.7.0,<2.11.0 (set 2.10.1, installed with pip) with cuDNN 8.1. Why not CUDA 12 with cuDNN 8.9? Because of DeepFaceLab_NVIDIA_RTX3000. But if, for example, you have a 4090, you can update to cuDNN 8.9.0.131 by copying its files into CUDA 11.2, and you can have both CUDA 12 for games and CUDA 11 for software.

Version | Python version | Compiler | Build tools | cuDNN | CUDA
tensorflow-2.6.0 | 3.6-3.9 | GCC 7.3.1 | Bazel 3.7.2 | 8.1 | 11.2
tensorflow-2.5.0 | 3.6-3.9 | GCC 7.3.1 | Bazel 3.7.2 | 8.1 | 11.2
tensorflow-2.4.0 | 3.6-3.8 | GCC 7.3.1 | Bazel 3.1.0 | 8.0 | 11.0
tensorflow-2.3.0 | 3.5-3.8 | GCC 7.3.1 | Bazel 3.1.0 | 7.6 | 10.1
tensorflow-2.2.0 | 3.5-3.8 | GCC 7.3.1 | Bazel 2.0.0 | 7.6 | 10.1
tensorflow-2.1.0 | 2.7, 3.5-3.7 | GCC 7.3.1 | Bazel 0.27.1 | 7.6 | 10.1
tensorflow-2.0.0 | 2.7, 3.3-3.7 | GCC 7.3.1 | Bazel 0.26.1 | 7.4 | 10.0
tensorflow_gpu-1.15.0 | 2.7, 3.3-3.7 | GCC 7.3.1 | Bazel 0.26.1 | 7.4 | 10.0
tensorflow_gpu-1.14.0 | 2.7, 3.3-3.7 | GCC 4.8 | Bazel 0.24.1 | 7.4 | 10.0
tensorflow_gpu-1.13.1 | 2.7, 3.3-3.7 | GCC 4.8 | Bazel 0.19.2 | 7.4 | 10.0
tensorflow_gpu-1.12.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.11.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.10.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.15.0 | 7 | 9
tensorflow_gpu-1.9.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.11.0 | 7 | 9
tensorflow_gpu-1.8.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.10.0 | 7 | 9
tensorflow_gpu-1.7.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.9.0 | 7 | 9
tensorflow_gpu-1.6.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.9.0 | 7 | 9
tensorflow_gpu-1.5.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.8.0 | 7 | 9
tensorflow_gpu-1.4.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.5.4 | 6 | 8
tensorflow_gpu-1.3.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.5 | 6 | 8
tensorflow_gpu-1.2.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.5 | 5.1 | 8
tensorflow_gpu-1.1.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.2 | 5.1 | 8
tensorflow_gpu-1.0.0 | 2.7, 3.3-3.6 | GCC 4.8 | Bazel 0.4.2 | 5.1 | 8

https://www.tensorflow.org/install/source_windows#gpu
https://developer.nvidia.com/cuda-toolkit-archive
https://developer.nvidia.com/rdp/cudnn-download

The point: just update CUDA and cuDNN; do not select the game driver.
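As a small illustration of how the version table in this comment could be used, the Python sketch below checks an installed cuDNN/CUDA pair against a few of its rows. The dictionary values are copied from the comment and not re-verified here, and `TESTED_GPU_BUILDS` and `matches_table` are hypothetical names, not part of TensorFlow or DeepFaceLab.

```python
# (cuDNN, CUDA) pairs copied from the table in the comment above.
# Only the TensorFlow 2.0-2.6 rows are included to keep the sketch short.
TESTED_GPU_BUILDS = {
    "2.6.0": ("8.1", "11.2"),
    "2.5.0": ("8.1", "11.2"),
    "2.4.0": ("8.0", "11.0"),
    "2.3.0": ("7.6", "10.1"),
    "2.2.0": ("7.6", "10.1"),
    "2.1.0": ("7.6", "10.1"),
    "2.0.0": ("7.4", "10.0"),
}


def matches_table(tf_version: str, cudnn: str, cuda: str) -> bool:
    """Return True if (cudnn, cuda) is the pair listed for tf_version."""
    expected = TESTED_GPU_BUILDS.get(tf_version)
    if expected is None:
        # Versions outside the copied rows cannot be checked by this sketch.
        raise KeyError(f"{tf_version} is not in the copied rows")
    return expected == (cudnn, cuda)


print(matches_table("2.6.0", "8.1", "11.2"))  # True
print(matches_table("2.4.0", "8.1", "11.2"))  # False: the table lists 8.0 / 11.0
```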

i9900k avatar May 22 '23 01:05 i9900k

I have the same issue with these very low values:

==            resolution: 128                     ==
==             face_type: wf                      ==
==     models_opt_on_gpu: True                    ==
==                 archi: liae-ud                 ==
==               ae_dims: 256                     ==
==                e_dims: 64                      ==
==                d_dims: 64                      ==
==           d_mask_dims: 22                      ==
==       masked_training: True                    ==
==       eyes_mouth_prio: True                    ==
==           uniform_yaw: False                   ==
==         blur_out_mask: False                   ==
==             adabelief: True                    ==
==            lr_dropout: n                       ==
==           random_warp: True                    ==
==      random_hsv_power: 0.0                     ==
==       true_face_power: 0.0                     ==
==      face_style_power: 0.0                     ==
==        bg_style_power: 0.0                     ==
==               ct_mode: sot                     ==
==              clipgrad: False                   ==
==              pretrain: False                   ==
==       autobackup_hour: 0                       ==
== write_preview_history: True                    ==
==           target_iter: 60000                   ==
==       random_src_flip: False                   ==
==       random_dst_flip: True                    ==
==            batch_size: 8                       ==
==             gan_power: 0.0                     ==
==        gan_patch_size: 16                      ==
==              gan_dims: 16                      ==

My rig: Gigabyte RTX 3060 12GB, 12GB DDR4 RAM, i3-9100F, NVMe, etc.

Ozamatheus avatar May 30 '23 23:05 Ozamatheus

================= Model Summary ==================
==                                              ==
==            Model name: new_SAEHD             ==
==                                              ==
==     Current iteration: 0                     ==
==                                              ==
==--------------- Model Options ----------------==
==                                              ==
==            resolution: 256                   ==
==             face_type: wf                    ==
==     models_opt_on_gpu: True                  ==
==                 archi: liae-ud               ==
==               ae_dims: 256                   ==
==                e_dims: 64                    ==
==                d_dims: 64                    ==
==           d_mask_dims: 22                    ==
==       masked_training: True                  ==
==       eyes_mouth_prio: False                 ==
==           uniform_yaw: True                  ==
==         blur_out_mask: False                 ==
==             adabelief: True                  ==
==            lr_dropout: n                     ==
==           random_warp: False                 ==
==      random_hsv_power: 0.0                   ==
==       true_face_power: 0.0                   ==
==      face_style_power: 0.0                   ==
==        bg_style_power: 0.0                   ==
==               ct_mode: none                  ==
==              clipgrad: False                 ==
==              pretrain: True                  ==
==       autobackup_hour: 0                     ==
== write_preview_history: False                 ==
==           target_iter: 0                     ==
==       random_src_flip: True                  ==
==       random_dst_flip: True                  ==
==            batch_size: 8                     ==
==             gan_power: 0.0                   ==
==        gan_patch_size: 32                    ==
==              gan_dims: 16                    ==
==                                              ==
==----------------- Running On -----------------==
==                                              ==
==          Device index: 0                     ==
==                  Name: AMD Radeon RX 6600 XT ==
==                  VRAM: 6.85GB                ==
==                                              ==

Load:

  • 50-52% - i9900k (4.7)
  • 10-20% - RX 6600 XT 8 GB (7.7 GB)
  • 40-50% - 32 GB RAM (Python process - 3 GB)
  • 0% - Drive

[02:42:36][#001273][0995ms][1.5251][1.3462]
[02:42:43][#001280][0992ms][1.4562][1.3111]
[02:42:59][#001296][0994ms][1.6962][1.3788]
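For anyone comparing numbers, the three training log lines above can be turned into a throughput figure with a short Python sketch. It assumes the bracketed fields are, in order, wall-clock time, iteration number, iteration time in milliseconds, and two loss values; that reading and the helper names `parse_log` and `seconds_per_iteration` are assumptions for illustration, not DeepFaceLab code.

```python
import re
from datetime import datetime

# Assumed layout: [HH:MM:SS][#iteration][NNNNms][loss][loss]
LINE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]\[#(\d+)\]\[(\d+)ms\]\[([\d.]+)\]\[([\d.]+)\]")

LOG = """\
[02:42:36][#001273][0995ms][1.5251][1.3462]
[02:42:43][#001280][0992ms][1.4562][1.3111]
[02:42:59][#001296][0994ms][1.6962][1.3788]
"""


def parse_log(text):
    """Return (time, iteration, ms, loss_a, loss_b) tuples for matching lines."""
    rows = []
    for match in LINE.finditer(text):
        clock, iteration, ms, loss_a, loss_b = match.groups()
        rows.append((datetime.strptime(clock, "%H:%M:%S"), int(iteration),
                     int(ms), float(loss_a), float(loss_b)))
    return rows


def seconds_per_iteration(rows):
    """Wall-clock seconds per iteration between the first and last row.

    Does not handle logs that cross midnight.
    """
    first, last = rows[0], rows[-1]
    return (last[0] - first[0]).total_seconds() / (last[1] - first[1])


rows = parse_log(LOG)
print(seconds_per_iteration(rows))  # 1.0
```

Here 23 iterations take 23 seconds of wall-clock time, which agrees with the roughly 990 ms reported per iteration.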

and8928 avatar Jun 03 '23 23:06 and8928

Issue solved / already answered (or it seems like user error), please close it.

joolstorrentecalo avatar Jun 08 '23 23:06 joolstorrentecalo