DeepFaceLab
Can't get RTX 3090 to work with DeepFaceLab.
THIS IS NOT TECH SUPPORT FOR NEWBIE FAKERS. POST ONLY ISSUES RELATED TO BUGS OR CODE.
Expected behavior
I have a high-end system with an RTX 3090 and a Ryzen 5800X CPU. I expect training to run, but it fails with this error:

Error: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node Pad_38 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
     [[concat_4/concat/_1141]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node Pad_38 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
0 successful operations. 0 derived errors ignored.
Errors may have originated from an input operation.
Input Source operations connected to node Pad_38:
 LeakyRelu_28 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py:29)
Original stack trace for 'Pad_38':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 410, in on_initialize
    gpu_pred_dst_dst, gpu_pred_dst_dstm = self.decoder_dst(gpu_dst_code)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 225, in forward
    x = self.upscale2(x)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 71, in forward
    x = self.conv1(x)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 87, in forward
    x = tf.pad (x, padding, mode='CONSTANT')
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3528, in pad
    result = gen_array_ops.pad(tensor, paddings, name=name)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 6487, in pad
    "Pad", input=input, paddings=paddings, name=name)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)
Traceback (most recent call last):
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
    return fn(*args)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1360, in _run_fn
    target_list, run_metadata)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1453, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node Pad_38}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
     [[concat_4/concat/_1141]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[{{node Pad_38}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
0 successful operations. 0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\mainscripts\Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 774, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm, target_srcm_em, warped_dst, target_dst, target_dstm, target_dstm_em)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 584, in src_dst_train
    self.target_dstm_em:target_dstm_em,
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 968, in run
    run_metadata_ptr)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1191, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1369, in _do_run
    run_metadata)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node Pad_38 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
     [[concat_4/concat/_1141]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
(1) Resource exhausted: OOM when allocating tensor with shape[5376,34,34] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[node Pad_38 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py:87) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
0 successful operations. 0 derived errors ignored.
Errors may have originated from an input operation.
Input Source operations connected to node Pad_38:
 LeakyRelu_28 (defined at C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py:29)
Original stack trace for 'Pad_38':
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 410, in on_initialize
    gpu_pred_dst_dst, gpu_pred_dst_dstm = self.decoder_dst(gpu_dst_code)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 225, in forward
    x = self.upscale2(x)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 71, in forward
    x = self.conv1(x)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
    return self.forward(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 87, in forward
    x = tf.pad (x, padding, mode='CONSTANT')
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3528, in pad
    result = gen_array_ops.pad(tensor, paddings, name=name)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 6487, in pad
    "Pad", input=input, paddings=paddings, name=name)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 750, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3569, in _create_op_internal
    op_def=op_def)
  File "C:\Users\Redux\Downloads\Reface\DeepFaceLab_NVIDIA_RTX3000_series_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)
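For what it's worth, the "Hint" in the trace refers to TensorFlow's RunOptions. Below is a minimal, generic sketch of how that flag is normally passed to a graph-mode session.run call; it is not DeepFaceLab's actual trainer code, and the placeholder tensor and session are made up purely to illustrate the option.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # the OOM report is only available in graph (non-eager) mode

# Tiny stand-in graph: the same kind of Pad op that shows up as Pad_38 in the trace.
x = tf1.placeholder(tf.float32, shape=[None, 34, 34])
y = tf1.pad(x, [[0, 0], [1, 1], [1, 1]])

# Ask TensorFlow to dump the list of allocated tensors if an OOM happens during this run.
run_options = tf1.RunOptions(report_tensor_allocations_upon_oom=True)

with tf1.Session() as sess:
    out = sess.run(y,
                   feed_dict={x: np.zeros((2, 34, 34), np.float32)},
                   options=run_options)
    print(out.shape)  # (2, 36, 36)
```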
Actual behavior
I go through all the SAEHD settings without changing anything, and training fails either with the long error pasted above (which I can't make sense of) or with a "bad allocation" / "out of memory" message.
Steps to reproduce
I cannot recall exactly what I changed, but I don't think it matters much: when I switched from the GPU to the CPU, the same settings trained fine, although I don't think I'm getting the fastest result on the CPU. These were the settings:

==---------------- Model Options -----------------==
==                                                ==
==            resolution: 128                     ==
==             face_type: wf                      ==
==     models_opt_on_gpu: True                    ==
==                 archi: df-ud                   ==
==               ae_dims: 256                     ==
==                e_dims: 64                      ==
==                d_dims: 64                      ==
==           d_mask_dims: 22                      ==
==       masked_training: True                    ==
==       eyes_mouth_prio: True                    ==
==           uniform_yaw: True                    ==
==         blur_out_mask: True                    ==
==             adabelief: True                    ==
==            lr_dropout: n                       ==
==           random_warp: False                   ==
==      random_hsv_power: 0.0                     ==
==       true_face_power: 0.0                     ==
==      face_style_power: 0.0                     ==
==        bg_style_power: 0.0                     ==
==               ct_mode: none                    ==
==              clipgrad: True                    ==
==              pretrain: True                    ==
==       autobackup_hour: 1                       ==
== write_preview_history: True                    ==
==           target_iter: 0                       ==
==       random_src_flip: False                   ==
==       random_dst_flip: False                   ==
==            batch_size: 21                      ==
==             gan_power: 0.0                     ==
==        gan_patch_size: 16                      ==
==              gan_dims: 16                      ==
==                                                ==
==------------------ Running On ------------------==
==                                                ==
==          Device index: 0                       ==
==                  Name: NVIDIA GeForce RTX 3090 ==
==                  VRAM: 21.17GB                 ==
Other relevant information
- Command line used (if not specified in steps to reproduce): main.py ...
- Operating system and version: Windows (prebuilt DeepFaceLab_NVIDIA_RTX3000_series build)
- Python version: 3.6.8 (bundled with the prebuilt Windows binary)
Same problem here with the 3090 (but Intel CPU).
How much RAM do you have? How big is your page file? I was having OOM issues with a 384-res df-ud model until I upped my page file to 64GB.
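In case it helps to check those numbers quickly, here is a small sketch that prints physical RAM and the page file / swap size. It assumes the third-party psutil package is installed; it is not part of DeepFaceLab.

```python
# Print physical RAM and page file (swap) totals.
# Assumes `pip install psutil`; on Windows swap_memory() roughly reflects the page file.
import psutil

GB = 1024 ** 3
vm = psutil.virtual_memory()   # physical RAM
sw = psutil.swap_memory()      # page file on Windows, swap on Linux/macOS

print(f"RAM total:       {vm.total / GB:.1f} GB")
print(f"RAM available:   {vm.available / GB:.1f} GB")
print(f"Page file total: {sw.total / GB:.1f} GB")
print(f"Page file used:  {sw.used / GB:.1f} GB")
```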
I have an i7-12700K, an RTX 3090, and 64GB of RAM. Here's one model I've currently been using:
==---------------- Model Options -----------------==
==                                                ==
==            resolution: 384                     ==
==             face_type: f                       ==
==     models_opt_on_gpu: True                    ==
==                 archi: df-ud                   ==
==               ae_dims: 352                     ==
==                e_dims: 88                      ==
==                d_dims: 88                      ==
==           d_mask_dims: 16                      ==
==       masked_training: True                    ==
==       eyes_mouth_prio: False                   ==
==           uniform_yaw: False                   ==
==             adabelief: True                    ==
==            lr_dropout: y                       ==
==           random_warp: False                   ==
==       true_face_power: 0.0                     ==
==      face_style_power: 0.0                     ==
==        bg_style_power: 0.0                     ==
==               ct_mode: none                    ==
==              clipgrad: False                   ==
==              pretrain: False                   ==
==       autobackup_hour: 0                       ==
== write_preview_history: False                   ==
==           target_iter: 0                       ==
==       random_src_flip: False                   ==
==       random_dst_flip: True                    ==
==            batch_size: 7                       ==
==             gan_power: 0.0                     ==
==        gan_patch_size: 48                      ==
==              gan_dims: 16                      ==
==         blur_out_mask: False                   ==
==      random_hsv_power: 0.0                     ==
==                                                ==
==------------------ Running On ------------------==
==                                                ==
==          Device index: 0                       ==
==                  Name: NVIDIA GeForce RTX 3090 ==
==                  VRAM: 21.17GB                 ==
==                                                ==
====================================================
First off, no, you will not get anywhere near the results with the CPU that you would with a 3090. Second, batch size 21 seems high, especially with "models_opt_on_gpu: True". You are getting OOM errors even with 24GB of VRAM, and I don't think that's a coincidence. Drop the batch size to something like 2, set "models_opt_on_gpu: False", and try to reopen the model on the GPU. If it opens, you've found the culprit. Next, watch VRAM usage in Task Manager and turn the batch size back up bit by bit, or try turning "models_opt_on_gpu: True" back on while keeping the low batch size. Keep scaling the model up gradually until VRAM usage is as high as you can reasonably go without exceeding your VRAM limit. If I'm wrong... well, we all try to help.
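If keeping Task Manager open is awkward while you nudge the batch size up, here is a rough sketch of a VRAM-polling loop. It assumes the pynvml package (nvidia-ml-py) is installed and that the 3090 is device index 0, as in the log above; it is not part of DeepFaceLab.

```python
# Poll GPU 0's VRAM usage once per second while the trainer runs in another window.
# Assumes `pip install nvidia-ml-py` (imported as pynvml); stop with Ctrl+C.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # same device index DFL reports

try:
    while True:
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        used_gb = mem.used / 1024 ** 3
        total_gb = mem.total / 1024 ** 3
        print(f"VRAM: {used_gb:.2f} / {total_gb:.2f} GB")
        time.sleep(1.0)
except KeyboardInterrupt:
    pynvml.nvmlShutdown()
```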
Did you ever find the answer? If so, would you mind sharing it and closing this issue?
No, I didn't, and I ended up just using the CPU. I abandoned DeepFaceLab; I thought I could generate a face for a video game, but it didn't work.