
I can't run SAEHD on the GPU! It only runs on the CPU

Open alan9455 opened this issue 3 years ago • 4 comments

Initializing models: 80%|######################################## ######4 | 4/5 [00:28<00:07, 7.18s/it]
Error: OOM when allocating tensor of shape [3,3,512,2048] and type float
  [[node src_dst_opt/vs_inter_B/upscale1/conv1/weight_0/Initializer/Const (defined at D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py:38) ]]
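For scale, the size of the tensor named in the OOM message can be worked out from its shape. The sketch below assumes that "type float" means 32-bit floats (4 bytes per element); the helper name `tensor_bytes` is made up for illustration and is not part of DeepFaceLab or TensorFlow.

```python
from math import prod

# Assumption: "type float" in the error means float32, i.e. 4 bytes per element.
BYTES_PER_FLOAT32 = 4


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the memory needed to hold a dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


# Shape copied from the error message above.
shape = [3, 3, 512, 2048]
size = tensor_bytes(shape)
print(f"{prod(shape):,} elements, {size / 2**20:.1f} MiB")
# prints: 9,437,184 elements, 36.0 MiB
```

A single 36 MiB allocation is small, so a failure at this point may mean that the memory was already almost full from earlier allocations rather than that this one tensor is too large.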

Caused by op 'src_dst_opt/vs_inter_B/upscale1/conv1/weight_0/Initializer/Const', defined at:
  File "threading.py", line 884, in _bootstrap
  File "threading.py", line 916, in _bootstrap_inner
  File "threading.py", line 864, in run
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\mainscripts\Trainer.py", line 58, in trainerThread
    debug=debug)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\models\ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 341, in on_initialize
    self.src_dst_opt.initialize_variables (self.src_dst_saveable_weights, vars_on_cpu=optimizer_vars_on_cpu, lr_dropout_on_cpu=self.options['lr_dropout']=='cpu')
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 38, in initialize_variables
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant( 0.0), trainable=False) for v in trainable_weights }
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\DeepFaceLab\core\leras\optimizers\AdaBelief.py", line 38, in <dictcomp>
    vs = { v.name : tf.get_variable ( f'vs_{v.name}'.replace(':','_'), v.shape, dtype=v.dtype, initializer=tf.initializers.constant( 0.0), trainable=False) for v in trainable_weights }
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1479, in get_variable
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1220, in get_variable
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 547, in get_variable
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 499, in _true_getter
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 911, in _get_single_variable
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 213, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 176, in _variable_v1_call
    aggregation=aggregation)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 155, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2495, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 217, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1395, in __init__
    constraint=constraint)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variables.py", line 1503, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 883, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
  File "D:\deepfake\DeepFaceLab\DeepFaceLab_NVIDIA_up_to_RTX2080Ti_internal\python-3.6.8\lib\site-packages\tensorflow\python\ops\init_ops.py", line 230, in __call__
    self.value, dtype=dtype, shape=shape
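The traceback points at AdaBelief.py line 38, where one zero-initialized `vs_...` variable is created for every trainable weight, with the same shape as that weight. The sketch below mirrors only the naming and the size accounting of that dictionary comprehension in plain Python. It does not call TensorFlow or DeepFaceLab; the first weight name is inferred from the node name in the error, the second weight and the helper `optimizer_state_bytes` are hypothetical, and float32 (4 bytes per element) is assumed.

```python
from math import prod

# Weight names mapped to shapes. The first entry is inferred from the node name
# in the error message; the second is a hypothetical example.
weights = {
    "inter_B/upscale1/conv1/weight:0": (3, 3, 512, 2048),
    "hypothetical/dense/weight:0": (256, 1024),
}

# Same naming rule as the line quoted in the traceback:
# f'vs_{v.name}'.replace(':', '_')
state_names = {name: f"vs_{name}".replace(":", "_") for name in weights}


def optimizer_state_bytes(weight_shapes, slots_per_weight=1, bytes_per_element=4):
    """Bytes needed for state tensors that each match a weight's shape.

    slots_per_weight is the number of state tensors kept per weight
    (the traceback shows at least one, `vs`).
    """
    total_elements = sum(prod(shape) for shape in weight_shapes.values())
    return total_elements * slots_per_weight * bytes_per_element


for name, state_name in state_names.items():
    print(f"{name} -> {state_name}")
print(f"state for one slot: {optimizer_state_bytes(weights) / 2**20:.1f} MiB")
```

Because each state slot is as large as the weights it tracks, optimizer state adds at least one more full copy of the model's trainable parameters to whatever device holds it.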

============ Model Summary ============
== Model name: new_SAEHD ==
== Current iteration: 564 ==
==---------- Model Options ----------==
== resolution: 320 ==
== face_type: f ==
== models_opt_on_gpu: True ==
== archi: liae-ud ==
== ae_dims: 256 ==
== e_dims: 64 ==
== d_dims: 64 ==
== d_mask_dims: 22 ==
== masked_training: True ==
== eyes_mouth_prio: False ==
== uniform_yaw: False ==
== blur_out_mask: False ==
== adabelief: True ==
== lr_dropout: n ==
== random_warp: True ==
== random_hsv_power: 0.0 ==
== true_face_power: 0.0 ==
== face_style_power: 0.0 ==
== bg_style_power: 0.0 ==
== ct_mode: none ==
== clipgrad: False ==
== pretrain: False ==
== autobackup_hour: 0 ==
== write_preview_history: False ==
== target_iter: 3000000000 ==
== random_src_flip: False ==
== random_dst_flip: True ==
== batch_size: 4 ==
== gan_power: 0.0 ==
== gan_patch_size: 40 ==
== gan_dims: 16 ==
==----------- Running On ------------==
== Using device: CPU ==

alan9455 • Feb 03 '22 03:02

I have the same error; here is my "Post".

Pips01 • Feb 15 '22 18:02

I'm not a pro, but maybe try using the standard values during training?

Pips01 • Feb 15 '22 18:02

Or reinstall DFL.

Pips01 • Feb 15 '22 18:02

I found a solution: you have to use your CPU for the first few iterations, then save and switch to the GPU.

Julia222222 • Feb 15 '22 21:02

Did you ever find the answer? If so, would you mind sharing it and closing this issue?

joolstorrentecalo • Jun 08 '23 23:06