Is SAEHD broken for GTX 1080 drivers? I get 2 root errors out of the box; nothing has been changed.
Running SAEHD on CPU gives this error: internal compiler error, abnormal program termination.
Running SAEHD on GPU gives me this error:
================== Model Summary ===================
Model name: new_SAEHD
Current iteration: 1000000
---------------- Model Options -----------------
resolution: 288
face_type: wf
models_opt_on_gpu: True
archi: df
ae_dims: 384
e_dims: 92
d_dims: 72
d_mask_dims: 22
masked_training: True
lr_dropout: n
random_warp: False
gan_power: 0.0
true_face_power: 0.0
face_style_power: 0.0
bg_style_power: 0.0
ct_mode: none
clipgrad: False
pretrain: True
autobackup_hour: 0
write_preview_history: False
target_iter: 1000000000
random_flip: True
batch_size: 4
eyes_mouth_prio: True
uniform_yaw: True
blur_out_mask: False
adabelief: True
random_hsv_power: 0.0
random_src_flip: False
random_dst_flip: True
gan_patch_size: 36
gan_dims: 16
------------------ Running On ------------------
Device index: 0
Name: NVIDIA GeForce GTX 1080
VRAM: 6.78GB
Starting. Target iteration: 1000000000. Press "Enter" to stop training and save model.
Error: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[concat_1/concat/_417]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations. 0 derived errors ignored.

Traceback (most recent call last):
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[concat_1/concat/_417]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations. 0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Games\dpl_internal\DeepFaceLab\mainscripts\Trainer.py", line 129, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "C:\Games\dpl_internal\DeepFaceLab\models\ModelBase.py", line 474, in train_one_iter
    losses = self.onTrainOneIter()
  File "C:\Games\dpl_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 774, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm, target_srcm_em, warped_dst, target_dst, target_dstm, target_dstm_em)
  File "C:\Games\dpl_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 584, in src_dst_train
    self.target_dstm_em:target_dstm_em,
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Games\dpl_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[concat_1/concat/_417]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[4,576,72,72] and type float on /job:localhost/replica:0/task:0/device:DML:0 by allocator DmlAllocator
[[{{node ArithmeticOptimizer/AddOpsRewrite_Leaf_1_add_34}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations. 0 derived errors ignored.
I don't know what to do with all of this. I can't find the RunOptions that the hints are telling me to change, so I'm assuming the current build is broken? Or is it the pretrained faces? Honestly, I tried deleting the face pak, and the whole program won't work without it. Please help.
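For reference, the RunOptions in that hint is not a DFL setting; it is a TensorFlow 1.x debugging option that has to be passed in the Python code wherever session.run() is called (for example around the src_dst_train call shown in the traceback), so it only applies if you edit the source. A minimal sketch of what the hint means, assuming the bundled TF 1.x API rather than DFL's actual code:

import tensorflow as tf  # TF 1.x style API, as bundled with DFL

# RunOptions is an argument to Session.run(); report_tensor_allocations_upon_oom
# makes TF print the live allocations when an OOM error is raised.
run_opts = tf.RunOptions(report_tensor_allocations_upon_oom=True)

with tf.Session() as sess:
    x = tf.zeros([4, 576, 72, 72])   # same shape as the tensor in the OOM message
    sess.run(x, options=run_opts)    # on OOM, TF would now list the allocated tensors

Note that this only makes the OOM report more detailed; it does not prevent the out-of-memory condition itself.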
Changing the batch size does nothing. It's already set to 4; setting it to 2 did nothing, and setting it to 256 wouldn't even start the application. If I need more RAM I can get it, but I already have 16 GB.
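For what it's worth, the allocation that fails is on the GPU (device DML:0, allocator DmlAllocator), so the limit that matters is the 6.78 GB of VRAM reported in the model summary, not the 16 GB of system RAM. A rough back-of-the-envelope sketch, assuming 32-bit floats, of what one of those tensors costs:

# Illustrative arithmetic only, based on the shape printed in the OOM message.
batch, channels, height, width = 4, 576, 72, 72   # the leading 4 is the batch size
bytes_per_float32 = 4

tensor_bytes = batch * channels * height * width * bytes_per_float32
print(f"{tensor_bytes / 2**20:.1f} MiB")           # roughly 45.6 MiB for this single tensor

Training has to hold many activations like this, plus the model weights and optimizer state, at the same time, which is why a 288-resolution model with ae_dims 384 can exhaust this card's memory even at batch size 4.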
I'm assuming I should find a legacy version of DPL, because the newest version was destroyed by its creators.
I tried changing the batch size to 2 and lowered all of the encoder dimensions, and now I'm getting this:
2022-02-25 11:40:16.791686: F tensorflow/core/common_runtime/dml/dml_heap_allocator.cc:296] Check failed: ptr != nullptr Invalid pointer
I'm guessing DPL's SAEHD function is permanently broken now?
Did you ever find the answer? If so, would you mind sharing it and closing this issue?