Deep-Live-Cam Live works, single image works, video conversion doesn't

CUDA 11.8 is installed:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

cudnn 9.2.0 Python 3.10.10 Specs: RTX2060S - 8GB; i5 9600K; 16GB RAM; sat on SSD; Windows 10 64-bit Single image works; Live works well (most impressed); but video conversion doesn't. I;'ve set it to the default of all switches off, except audio (though I have tested it with other combinations), and it creates a temp folder, and subfolder, and extracts the png images, but I get a batch of errors starting with this:

[DLC.CORE] Creating temp resources...
[DLC.CORE] Extracting frames...
[DLC.FACE-SWAPPER] Progressing...
Processing:   0%| | 0/102 [00:00<?, ?frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=60, max_[ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_stream_handle.cc ; line=50 ; expr=cublasCreate(&cublas_handle_);

with several repeated similar errors and finally ending with this:

Processing:  24%|▏| 24/102 [00:03<00:09,  7.94frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads2024-08-17 11:19:16.2982638 [E:onnxruntime:, inference_session.cc:1645 onnxruntime::InferenceSession::Initialize::<lambda_eb486adf513608dcd45c034ea7ffb8e8>::operator ()] Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_execution_provider.cc ; line=164 ; expr=cublasCreate(&cublas_handle_);


[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_execution_provider.cc ; line=164 ; expr=cublasCreate(&cublas_handle_);


Processing:  25%|▏| 25/102 [00:03<00:11,  6.43frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads
Exception in Tkinter callback
Traceback (most recent call last):
  File "T:\Deep-Live-Cam-cuda\python\lib\tkinter\__init__.py", line 1921, in __call__
    return self.func(*args)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\customtkinter\windows\widgets\ctk_button.py", line 554, in _clicked
    self._command()
  File "T:\Deep-Live-Cam-cuda\modules\ui.py", line 95, in <lambda>
    start_button = ctk.CTkButton(root, text='Start', cursor='hand2', command=lambda: select_output_path(start))
  File "T:\Deep-Live-Cam-cuda\modules\ui.py", line 192, in select_output_path
    start()
  File "T:\Deep-Live-Cam-cuda\modules\core.py", line 200, in start
    frame_processor.process_video(modules.globals.source_path, temp_frame_paths)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\face_swapper.py", line 86, in process_video
    modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\core.py", line 72, in process_video
    multi_process_frame(source_path, frame_paths, process_frames, progress)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\core.py", line 64, in multi_process_frame
    future.result()
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\_base.py", line 458, in result
    return self.__get_result()
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "T:\Deep-Live-Cam-cuda\python\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "T:\Deep-Live-Cam-cuda\modules\processors\frame\face_swapper.py", line 65, in process_frames
    source_face = get_one_face(cv2.imread(source_path))
  File "T:\Deep-Live-Cam-cuda\modules\face_analyser.py", line 20, in get_one_face
    face = get_face_analyser().get(frame)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\app\face_analysis.py", line 75, in get
    model.get(img, face)
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\model_zoo\arcface_onnx.py", line 67, in get
    face.embedding = self.get_feat(aimg).flatten()
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\insightface\model_zoo\arcface_onnx.py", line 84, in get_feat
    net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
  File "T:\Deep-Live-Cam-cuda\python\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 217, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:121 onnxruntime::CudaCall D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_call.cc:114 onnxruntime::CudaCall CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=DESKTOP-IH921FK ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\cuda_stream_handle.cc ; line=50 ; expr=cublasCreate(&cublas_handle_);

This above was with a 3 second 1280x720 MP4, but I also created a half second, 20 frame, 360x240 MP4 (using ffmpeg) which failed after 50% - I also tried a 50 frame video which got as far as 73% before failing. Any idea why this is happening?

Aug 17 '24 10:08 theSplund

Similar problem for me. I reinstalled everything today and everything works with pictures.

With videos, however, the GUI hangs and shows no progress. In Powershell, however, everything runs and at 100% it aborts without comment and the GUI closes. The preview works.

Ryzen 3700X 32GB DDR4 RTX 3080 TI

Aug 20 '24 11:08 Pascal2708

So for me I had to debug it a bit.. and lucky I was able to find it.

in the repo, go to modules > utilties.py where you will find def create_video method, I had to change this a bit to figure out why the video was not getting created, so likely chances are that your ffmpeg doesn't have the necessary encoder installed (for me it was that).

in terminal run : ffmpeg -encoders | grep This will show what encoders are installed, for this repo the encoders are :

(default) libx264 <-- default encoder
libx265, libvpx-vp9 <-- these are also set as 'choices'.. not too sure on the code logic on how it selects it.

I didn't have the default encoder in my ffmpeg (i used pip install ffmpeg), so I had to install it another way: try : (macOS) brew install ffmpeg || (conda) conda install -c conda-forge ffmpeg

Sep 01 '24 15:09 D2OKAY

I guess that as you've used the terms 'terminal' and 'grep' that this is Linux orientated? I couldn't get it to run in Linux, yet, only Win10, but I did check 'ffmpeg -encoders' in a command window (grep wasn't allowed) and these were amongst the reported: V....D libx264 libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264) and V....D libx265 libx265 H.265 / HEVC (codec hevc) and V....D libvpx-vp9 libvpx VP9 (codec vp9) so that might not be the fix. The fact that the application appears to successfully convert the movie into .png files but then fails part way through the process of replacing faces in the .pngs leads me to suspect that a codec issue isn't at the heart of it (TBH I suspect a memory issue, but it's a guess). Thanks anyway

Sep 01 '24 22:09 theSplund

Well, I got bored with this and decided to save some SSD space, and so I moved it to an HDD. A week or so later I tried it again, and it works with movies! Very strange. Closing it

Sep 25 '24 22:09 theSplund

I'm facing the same issue. On the stable release. Arch Linux, cuda 11.8.0-1, python 3.10. Live works, single image works, but processing a video does not.

This is the output I'm getting when calling

python run.py --execution-provider cuda --source face.jpg --target ~/Videos/vid.mp4 --output ~/Desktop/out.mp4 --keep-audio --keep-fps

Show log

Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
find model: /home/lin/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
find model: /home/lin/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
find model: /home/lin/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
find model: /home/lin/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
find model: /home/lin/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
[DLC.CORE] Processing...
[DLC.CORE] Creating temp resources...
[DLC.CORE] Extracting frames...
[DLC.FACE-SWAPPER] Progressing...
Processing:   0%|                                                                                                      | 0/153 [00:00<?, ?frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_tuning_enable': '0', 'device_id': '0', 'has_user_compute_stream': '0', 'gpu_mem_limit': '18446744073709551615', 'gpu_external_alloc': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'cudnn_conv1d_pad_to_nc1d': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'do_copy_in_default_stream': '1', 'enable_cuda_graph': '0', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0'}}
inswapper-shape: [1, 3, 128, 128]
2024-10-29 03:22:54.624239011 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.628876184 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.634167991 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.636532203 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

Processing:   1%|▌                                                                                             | 1/153 [00:01<04:25,  1.75s/frame, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16][ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.640195315 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.644583342 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

2024-10-29 03:22:54.649134524 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_62' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 104865792


2024-10-29 03:22:54.798665932 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


2024-10-29 03:22:54.851053085 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


2024-10-29 03:22:54.858262199 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


2024-10-29 03:22:55.222706892 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


[ONNXRuntimeError] : 1 : FAIL : /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=51 ; expr=cublasCreate(&cublas_handle_); 


Processing:   5%|████▉                                                                                         | 8/153 [00:02<00:34,  4.19frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16][ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


2024-10-29 03:22:55.388265396 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_46' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 18884864

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_46' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 18884864


# ... and so on ...

Processing:  80%|█████████████████████████████████████████████████████████████████████████▉                  | 123/153 [00:08<00:01, 18.32frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.275439003 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.294210553 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.308650151 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fc1' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


2024-10-29 03:19:19.349116559 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Gemm node. Name:'fullyconnected0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Gemm node. Name:'fullyconnected0' Status Message: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=168 ; expr=cublasCreate(&cublas_handle_); 


Processing:  82%|███████████████████████████████████████████████████████████████████████████▊                | 126/153 [00:08<00:01, 19.09frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.414915318 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.450319818 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.535050928 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

Processing:  84%|█████████████████████████████████████████████████████████████████████████████▌              | 129/153 [00:09<00:01, 18.18frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]2024-10-29 03:19:19.553050835 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.622111305 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.641810931 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

2024-10-29 03:19:19.642229023 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

Processing:  86%|███████████████████████████████████████████████████████████████████████████████▎            | 132/153 [00:09<00:01, 20.29frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16][ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'Conv_107' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 136723103642368

Processing:  87%|███████████████████████████████████████████████████████████████████████████████▉            | 133/153 [00:09<00:01, 14.56frame/s, execution_providers=['CUDAExecutionProvider'], execution_threads=8, max_memory=16]
Traceback (most recent call last):
  File "/home/lin/Desktop/Deep-Live-Cam/run.py", line 6, in <module>
    core.run()
  File "/home/lin/Desktop/Deep-Live-Cam/modules/core.py", line 252, in run
    start()
  File "/home/lin/Desktop/Deep-Live-Cam/modules/core.py", line 209, in start
    frame_processor.process_video(modules.globals.source_path, temp_frame_paths)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/face_swapper.py", line 174, in process_video
    modules.processors.frame.core.process_video(source_path, temp_frame_paths, process_frames)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/core.py", line 73, in process_video
    multi_process_frame(source_path, frame_paths, process_frames, progress)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/core.py", line 65, in multi_process_frame
    future.result()
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/lin/Desktop/Deep-Live-Cam/modules/processors/frame/face_swapper.py", line 133, in process_frames
    source_face = get_one_face(cv2.imread(source_path))
  File "/home/lin/Desktop/Deep-Live-Cam/modules/face_analyser.py", line 28, in get_one_face
    face = get_face_analyser().get(frame)
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/app/face_analysis.py", line 59, in get
    bboxes, kpss = self.det_model.detect(img,
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 224, in detect
    scores_list, bboxes_list, kpss_list = self.forward(det_img, self.det_thresh)
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/insightface/model_zoo/retinaface.py", line 152, in forward
    net_outs = self.session.run(self.output_names, {self.input_name : blob})
  File "/home/lin/.pyenv/versions/3.10.15/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudnnStatus_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; GPU=0 ; hostname=Vanillarch ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_stream_handle.cc ; line=53 ; expr=cudnnCreate(&cudnn_handle_);

So I'd like to ask for this issue to be reopened :)

Oct 29 '24 02:10 Lxtharia

Apparently it works when setting the execution threads to 4

Oct 29 '24 19:10 Lxtharia

Deep-Live-Cam Deep-Live-Cam copied to clipboard

Live works, single image works, video conversion doesn't

Deep-Live-Cam
Deep-Live-Cam copied to clipboard