DeepFaceLab
Problem with SAEHD Windows GTX 1060
Running trainer.
Choose one of saved models, or enter a name to create a new model.
[r] : rename
[d] : delete

[0] : new - latest
danielhf
Model first run.
Choose one or several GPU idxs (separated by comma).
  [CPU] : CPU
  [0] : GeForce GTX 1060 6GB
[0] Which GPU indexes to choose? : 0
[0] Autobackup every N hour ( 0..24 ?:help ) : 0
[n] Write preview history ( y/n ?:help ) : n
[0] Target iteration : 0
[y] Flip faces randomly ( y/n ?:help ) : y
[4] Batch_size ( ?:help ) : 4
[288] Resolution ( 64-640 ?:help ) : 288
[head] Face type ( h/mf/f/wf/head ?:help ) : head
[df] AE architecture ( ?:help ) : df
[384] AutoEncoder dimensions ( 32-1024 ?:help ) : 384
[92] Encoder dimensions ( 16-256 ?:help ) : 92
[72] Decoder dimensions ( 16-256 ?:help ) : 72
[22] Decoder mask dimensions ( 16-256 ?:help ) : 22
[y] Masked training ( y/n ?:help ) : y
[n] Eyes priority ( y/n ?:help ) : n
[n] Uniform yaw distribution of samples ( y/n ?:help ) : n
[y] Place models and optimizer on GPU ( y/n ?:help ) : y
[n] Use learning rate dropout ( n/y/cpu ?:help ) : n
[y] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 10.0 ?:help ) : 0.0
[0.0] 'True face' power. ( 0.0000 .. 1.0 ?:help ) : 0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : none
[n] Enable gradient clipping ( y/n ?:help ) : n
[n] Enable pretraining mode ( y/n ?:help ) : n
Initializing models: 100%|###############################################################| 5/5 [00:43<00:00, 8.63s/it]
Loading samples: 100%|###############################################################| 832/832 [00:22<00:00, 37.08it/s]
Loading samples: 100%|##############################################################| 338/338 [00:02<00:00, 120.58it/s]
================== Model Summary ==================
==                                               ==
==              Model name: danielhf_SAEHD       ==
==                                               ==
==       Current iteration: 0                    ==
==                                               ==
==---------------- Model Options ----------------==
==                                               ==
==              resolution: 288                  ==
==               face_type: head                 ==
==       models_opt_on_gpu: True                 ==
==                   archi: df                   ==
==                 ae_dims: 384                  ==
==                  e_dims: 92                   ==
==                  d_dims: 72                   ==
==             d_mask_dims: 22                   ==
==         masked_training: True                 ==
==               eyes_prio: False                ==
==             uniform_yaw: False                ==
==              lr_dropout: n                    ==
==             random_warp: True                 ==
==               gan_power: 0.0                  ==
==         true_face_power: 0.0                  ==
==        face_style_power: 0.0                  ==
==          bg_style_power: 0.0                  ==
==                 ct_mode: none                 ==
==                clipgrad: False                ==
==                pretrain: False                ==
==         autobackup_hour: 0                    ==
==   write_preview_history: False                ==
==             target_iter: 0                    ==
==             random_flip: True                 ==
==              batch_size: 4                    ==
==                                               ==
==----------------- Running On ------------------==
==                                               ==
==            Device index: 0                    ==
==                    Name: GeForce GTX 1060 6GB ==
==                    VRAM: 6.00GB               ==
==                                               ==
Starting. Press "Enter" to stop training and save model.
Trying to do the first iteration. If an error occurs, reduce the model parameters.
You are training the model from scratch. It is strongly recommended to use a pretrained model to speed up the training and improve the quality.
Error: OOM when allocating tensor with shape[4,288,144,144] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node DepthToSpace_4 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py:336) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node concat_1 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py:484) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'DepthToSpace_4', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
    debug=debug,
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\ModelBase.py", line 189, in __init__
    self.on_initialize()
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 337, in on_initialize
    gpu_pred_src_src, gpu_pred_src_srcm = self.decoder_src(gpu_src_code)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 145, in forward
    x = self.upscale1(x)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 59, in forward
    x = nn.depth_to_space(x, 2)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py", line 336, in depth_to_space
    return tf.depth_to_space(x, size, data_format=nn.data_format)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2703, in depth_to_space
    return gen_array_ops.depth_to_space(input, block_size, data_format, name=name)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1593, in depth_to_space
    data_format=data_format, name=name)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[4,288,144,144] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node DepthToSpace_4 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py:336) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node concat_1 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py:484) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,288,144,144] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node DepthToSpace_4}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node concat_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\mainscripts\Trainer.py", line 123, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\ModelBase.py", line 462, in train_one_iter
    losses = self.onTrainOneIter()
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 636, in onTrainOneIter
    src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm_all, warped_dst, target_dst, target_dstm_all)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 503, in src_dst_train
    self.target_dstm_all:target_dstm_all,
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4,288,144,144] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node DepthToSpace_4 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py:336) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node concat_1 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py:484) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'DepthToSpace_4', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
    debug=debug,
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\ModelBase.py", line 189, in __init__
    self.on_initialize()
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 337, in on_initialize
    gpu_pred_src_src, gpu_pred_src_srcm = self.decoder_src(gpu_src_code)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 145, in forward
    x = self.upscale1(x)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
    return self.forward(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 59, in forward
    x = nn.depth_to_space(x, 2)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py", line 336, in depth_to_space
    return tf.depth_to_space(x, size, data_format=nn.data_format)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2703, in depth_to_space
    return gen_array_ops.depth_to_space(input, block_size, data_format, name=name)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1593, in depth_to_space
    data_format=data_format, name=name)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[4,288,144,144] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node DepthToSpace_4 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\core\leras\ops\__init__.py:336) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node concat_1 (defined at E:\Downloads\DeepFaceLab\DeepFaceLab_NVIDIA\_internal\DeepFaceLab\models\Model_SAEHD\Model.py:484) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
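For reference, the "RunOptions" in the hint above is TensorFlow 1.x's per-run options protobuf. A minimal sketch of how the report could be enabled, assuming you are willing to edit the failing session.run call (the sess / fetches / feed names below are hypothetical placeholders, not existing DeepFaceLab variables):

import tensorflow as tf  # TF 1.x, as bundled with this DeepFaceLab build

# Ask TensorFlow to print the live tensor allocations when an OOM happens.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

# Hypothetical call site: pass the options into the run that fails,
# e.g. the sess.run inside Model_SAEHD\Model.py's src_dst_train.
# results = sess.run(fetches, feed_dict=feed, options=run_options)

Note this only changes what the error message reports; it does not prevent the OOM itself.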
Hey danielhama! This is an out-of-memory (OOM) error: it happens when you don't have enough VRAM. You have to reduce your model parameters, and the most effective ones are the batch size and the resolution. In your case I would reduce the resolution. Keep lowering the settings until the error no longer occurs.
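For a sense of scale: the tensor that failed to allocate, shape [4, 288, 144, 144] in float32, is about 91 MiB on its own, and it is only one of many activations held at the same time during training. Activation memory grows roughly with batch_size * resolution^2, which is why lowering either helps. A quick back-of-the-envelope check in plain Python (the candidate settings are illustrative, not recommendations):

# Size of the single activation tensor named in the OOM message.
batch, channels, h, w = 4, 288, 144, 144
size_mib = batch * channels * h * w * 4 / 2**20  # float32 = 4 bytes each
print(f"failing tensor: {size_mib:.1f} MiB")     # ~91.1 MiB, one of many

# Rough heuristic: activation memory ~ batch_size * resolution**2.
baseline = 4 * 288 ** 2
for b, res in [(4, 288), (4, 224), (2, 288), (4, 192)]:
    print(f"batch {b}, res {res}: {b * res ** 2 / baseline:.2f}x the failing config")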
Have a good day
Reduce the dims as well (ae_dims, e_dims, d_dims, d_mask_dims).
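The dims matter because convolution weights, and their Adam optimizer state, scale roughly with the square of the channel width. A rough, generic illustration (plain 3x3 conv math, not DeepFaceLab's exact layer shapes; 48 is just an example of a reduced value):

# A 3x3 conv with c input and c output channels has 9 * c * c weights.
for c in (72, 48):  # current d_dims vs. an illustrative reduced value
    print(f"dims {c}: {9 * c * c:,} weights in such a conv layer")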
Issue solved / already answered (or it seems like user error), please close it.