CUDA_ERROR_OUT_OF_MEMORY after enough Stop Replay Buffer > Start Replay Buffer

Open ipaqmaster opened this issue 6 months ago • 9 comments

Operating System Info

Other

Other OS

Archlinux - kernel 6.12.28-1-lts

OBS Studio Version

31.0.3

OBS Studio Version (Other)

OBS Studio - 31.0.3

OBS Studio Log URL

2025-05-12 21-19-28.txt

OBS Studio Crash Log URL

NA

Expected Behavior

Starting the Replay Buffer should succeed.

Current Behavior

Stopping and then starting the Replay Buffer repeatedly eventually produces the error in the attached image on the next Start. It should not.

Image

cuda_surface_init: CUDA call "cu->cuArray3DCreate(&nvsurf->tex, &desc)" failed with CUDA_ERROR_OUT_OF_MEMORY (2): out of memory [OK]

This happens during CS2 gameplay after a good 10-20 Stop > Start cycles of the Replay Buffer.

Steps to Reproduce

  1. Play CS2
  2. Stop and Re-Start the Replay Buffer @ round start during gameplay
  3. Eventually OBS throws cuda_surface_init: CUDA call "cu->cuArray3DCreate(&nvsurf->tex, &desc)" failed with CUDA_ERROR_OUT_OF_MEMORY (2): out of memory, and must be restarted.

Is this a traditional memory leak scenario?

Anything else we should know?

No response

ipaqmaster avatar May 12 '25 09:05 ipaqmaster

On the 2080 Ti, with a 2560x1440 CS2 window being captured, this happens on exactly the 29th Start Replay Buffer press and requires restarting the application.

ipaqmaster avatar May 12 '25 09:05 ipaqmaster

Watching nvidia-smi (watch -d -n1 nvidia-smi) as I mash Start Replay Buffer and Stop Replay Buffer, I can clearly see that the application (obs) is not freeing all of its video memory when the replay buffer stops. Each Start > Stop cycle raises the overall minimum memory usage by 100-200MB.
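For anyone who wants numbers rather than eyeballing watch, here is a minimal Python sketch that polls nvidia-smi and prints the change in used memory between samples. It relies on the `--query-gpu=memory.used --format=csv,noheader,nounits` query of nvidia-smi; the helper names (`parse_used_mib`, `read_used_mib`, `watch`) are made up for this example, and it assumes the first GPU listed is the one OBS is using.

```python
import subprocess
import time


def parse_used_mib(csv_text: str) -> list[int]:
    """Parse nvidia-smi CSV output: one used-memory value in MiB per GPU line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def read_used_mib() -> list[int]:
    """Ask nvidia-smi for the used memory of every GPU, in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_used_mib(out)


def watch(interval_s: float = 1.0, samples: int = 60) -> None:
    """Print used memory and the change since the previous sample."""
    previous = None
    for _ in range(samples):
        used = read_used_mib()[0]  # assumption: GPU 0 is the one OBS runs on
        delta = 0 if previous is None else used - previous
        print(f"{time.strftime('%H:%M:%S')} used={used} MiB delta={delta:+d} MiB")
        previous = used
        time.sleep(interval_s)
```

Calling `watch()` while toggling the replay buffer should show whether the value after each Stop settles higher than after the previous Stop. Note that this is the total for the whole GPU, so other applications (the game, the compositor) are included in the figure.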

ipaqmaster avatar May 12 '25 09:05 ipaqmaster

@ipaqmaster you should include a log if you want this to be actionable.

Can repro on Ubuntu 24.04, on master obs. I can similarly see the vram usage increasing in nvtop till it maxes out, at which point the encoder start error happens. https://obsproject.com/logs/aR8UMAkPuhP6485X

Two additional notes after testing:

  • the amount of memory "not released" seems directly tied to the output resolution, I can make it happen after just three start/stops when setting it to 4096x4096
  • it happens with streams and recordings too
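For a sense of scale, here is a back-of-the-envelope sketch of how the size of a single frame surface grows with resolution. It assumes 8-bit NV12 surfaces (1.5 bytes per pixel); nothing in this thread confirms which format is used or how many surfaces the encoder allocates, and `nv12_surface_bytes` is a hypothetical helper name.

```python
def nv12_surface_bytes(width: int, height: int) -> int:
    """Bytes for one 8-bit NV12 frame: full-resolution luma plus half-size interleaved chroma."""
    return width * height * 3 // 2


for w, h in [(2560, 1440), (4096, 4096)]:
    mib = nv12_surface_bytes(w, h) / 2**20
    print(f"{w}x{h}: {mib:.1f} MiB per surface")
```

Under that assumption a 4096x4096 surface is about 4.5 times the size of a 2560x1440 one, which would be consistent with larger output resolutions running out of memory after fewer start/stops.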

Penwy avatar May 12 '25 11:05 Penwy

I am seeing the same growth when my OBS preview is fully-black. It happened much sooner when Counter-Strike 2 was on display - content to fill the framebuffer instead of blackness.

But ultimately after a good 50-100 toggles of the framebuffer I got it again:

cuda_surface_init: CUDA call "cu->cuArray3DCreate(&nvsurf->tex, &desc)" failed with CUDA_ERROR_OUT_OF_MEMORY (2): out of memory

ipaqmaster avatar May 12 '25 11:05 ipaqmaster

Here is my log from this launch with its CUDA_ERROR_OUT_OF_MEMORY experience around the 9473MiB mark

Note that I have a named Xcomposite window capture Source for each of CS2 and TF2 configured in my Scene here.

On a black screen (none of the Xcomposite sources' windows are running) it took approximately 126 toggles via hotkey to throw the CUDA_ERROR_OUT_OF_MEMORY error.

2025-05-12 21-19-28.txt

ipaqmaster avatar May 12 '25 11:05 ipaqmaster

I've also experienced this error in OBS on a Linux system where I am running a long-running OBS stream. It's not related to CS2 gameplay for me, but occurred for me after leaving two OBS instances running a 24/7 stream for a period of 2-3 days.

Due to the nature of my setup, this is a Proxmox VM using GPU passthrough. I assumed that might somehow be causing my issues, but I'm glad to hear it most likely isn't, since someone else is having a similar problem. I've been having this issue for a while, but because the VM is headless, at first I was just force rebooting it and not investigating any further. Now I will be monitoring the issue more closely, and if I find anything noteworthy that might help with this investigation, I will update here.

My stream is just an audio capture and a single window capture, but I am running it on two OBS instances going to two different streaming services. After running for several days, I saw OBS in a disconnected state, apparently trying to reconnect the stream. Upon ending the stream and trying to reconnect to the streaming platform manually, I first got an error that said:

NVENC Error: Too many concurrent sessions. Try closing other recording software that might be using NVENC such as NVIDIA ShadowPlay or Windows Game DVR.

Then, after a second attempt to restart, I get the same error as the original post:

cuda_surface_init: CUDA call "cu->cuArray3DCreate(&nvsurf->tex,&desc)" failed with CUDA_ERROR_OUT_OF_MEMORY (2): out of memory

I was running the NVIDIA 550 drivers at the time (I'm not sure of the kernel version, as I discovered this thread after the fact and didn't bother checking).

I have now upgraded the NVIDIA drivers to 570.133.07. I'm on kernel 6.8.0-59-generic, on an Ubuntu-based system, and also on OBS 31.0.3.

I've rebooted the system cleanly after performing these updates and restarted the streams. I will leave it running with this software/driver/kernel combination and see if it remains stable or crashes again in a few days. I'm also leaving nvidia-smi open so that next time I experience a crash, maybe I can collect more info to see whether something is leaking GPU memory. I've archived a screenshot of my current nvidia-smi state for comparison in case I uncover anything useful at my next crash.

ufgkirk avatar May 14 '25 20:05 ufgkirk

~~When you get this error, would it be possible to run nvidia-smi encodersessions? The OOM error might also be caused by the limited number of encoder sessions on GeForce. It would be good to see whether there are old encoder sessions that fail to be closed properly or whether it is really out of video memory (as reported with plain nvidia-smi).~~ Forget what I said. This would only explain a failure of the encode API, not of the CUDA API.

theHamsta avatar May 14 '25 20:05 theHamsta

So far, no more crashes on my end after the driver update. It's been running since I posted 3 days ago. I'll continue to monitor. It has been stable for longer than this before, but my previous crash occurred sooner than this.

ufgkirk avatar May 18 '25 01:05 ufgkirk

I experienced the crash again today. It seems to me that one of my two OBS instances must have experienced some kind of memory leak. One of them was using 2.6GB of video memory, and continued to do so even after clicking Stop Stream.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1650 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   52C    P0             29W /  100W |    3617MiB /   4096MiB |     36%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1142      G   /usr/lib/xorg/Xorg                      148MiB |
|    0   N/A  N/A            1481      G   /usr/bin/ksmserver                        2MiB |
|    0   N/A  N/A            1483      G   /usr/bin/kded5                            2MiB |
|    0   N/A  N/A            1484      G   /usr/bin/kwin_x11                        63MiB |
|    0   N/A  N/A            1539      G   ...it-kde-authentication-agent-1          2MiB |
|    0   N/A  N/A            1541      G   ...ibexec/xdg-desktop-portal-kde          2MiB |
|    0   N/A  N/A            1654      G   /usr/bin/nextcloud                        7MiB |
|    0   N/A  N/A            1705      G   /usr/bin/kaccess                          2MiB |
|    0   N/A  N/A            1717      G   ...-gnu/libexec/DiscoverNotifier          2MiB |
|    0   N/A  N/A            2328      G   ...4-linux-gnu/libexec/kf5/kiod5          2MiB |
|    0   N/A  N/A            2369      G   /usr/bin/kwalletd5                        2MiB |
|    0   N/A  N/A           20752      G   /usr/bin/konsole                          2MiB |
|    0   N/A  N/A         2506386      G   ...linux-gnu/libexec/baloorunner          2MiB |
|    0   N/A  N/A         2794601      G   .../6159/usr/lib/firefox/firefox          8MiB |
|    0   N/A  N/A         2798318      G   /usr/bin/vlc                              2MiB |
|    0   N/A  N/A         2798667    C+G   obs                                     699MiB |
|    0   N/A  N/A         2798716      G   obs                                    2609MiB |
|    0   N/A  N/A         2798891      G   projectM-pulseaudio                      16MiB |
|    0   N/A  N/A         4163465      G   /usr/bin/plasmashell                     13MiB |
+-----------------------------------------------------------------------------------------+

I was able to exit that OBS instance cleanly (File -> Exit), and launch it again, and start the stream again. Now my video memory usage for both instances appears more reasonable, although the OBS instance I didn't restart is using nearly twice as much video memory.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1650 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   52C    P0             30W /  100W |    1538MiB /   4096MiB |     33%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1142      G   /usr/lib/xorg/Xorg                      184MiB |
|    0   N/A  N/A            1481      G   /usr/bin/ksmserver                        2MiB |
|    0   N/A  N/A            1483      G   /usr/bin/kded5                            2MiB |
|    0   N/A  N/A            1484      G   /usr/bin/kwin_x11                        79MiB |
|    0   N/A  N/A            1539      G   ...it-kde-authentication-agent-1          2MiB |
|    0   N/A  N/A            1541      G   ...ibexec/xdg-desktop-portal-kde          2MiB |
|    0   N/A  N/A            1654      G   /usr/bin/nextcloud                        7MiB |
|    0   N/A  N/A            1705      G   /usr/bin/kaccess                          2MiB |
|    0   N/A  N/A            1717      G   ...-gnu/libexec/DiscoverNotifier          2MiB |
|    0   N/A  N/A            2328      G   ...4-linux-gnu/libexec/kf5/kiod5          2MiB |
|    0   N/A  N/A            2369      G   /usr/bin/kwalletd5                        2MiB |
|    0   N/A  N/A           20752      G   /usr/bin/konsole                          2MiB |
|    0   N/A  N/A           53570    C+G   rustdesk                                132MiB |
|    0   N/A  N/A           54466      G   /app/share/rustdesk/rustdesk             12MiB |
|    0   N/A  N/A           55138    C+G   obs                                     338MiB |
|    0   N/A  N/A         2506386      G   ...linux-gnu/libexec/baloorunner          2MiB |
|    0   N/A  N/A         2798318      G   /usr/bin/vlc                              2MiB |
|    0   N/A  N/A         2798667    C+G   obs                                     699MiB |
|    0   N/A  N/A         2798891      G   projectM-pulseaudio                      16MiB |
|    0   N/A  N/A         4163465      G   /usr/bin/plasmashell                     18MiB |
+-----------------------------------------------------------------------------------------+

Must be some kind of video memory leak?
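To compare per-process usage between snapshots without reading the tables by eye, a small parser can total the GPU Memory column by process name. This is a minimal sketch that assumes the row layout of the Processes table (GPU, GI, CI, PID, Type, process name, usage in MiB); `memory_by_process` and the regular expression are illustrative and not part of any NVIDIA tool. The sample rows are copied from the first snapshot.

```python
import re
from collections import defaultdict

# One process row: | GPU  GI  CI  PID  Type  Process name  <n>MiB |
ROW = re.compile(r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+(\S+)\s+(.+?)\s+(\d+)MiB\s+\|$")


def memory_by_process(table: str) -> dict[str, int]:
    """Sum the GPU memory (MiB) of all rows that share a process name."""
    totals: defaultdict[str, int] = defaultdict(int)
    for line in table.splitlines():
        match = ROW.match(line.strip())
        if match:  # header, separator and border lines do not match
            totals[match.group(4)] += int(match.group(5))
    return dict(totals)


sample = """\
|    0   N/A  N/A         2798667    C+G   obs                                     699MiB |
|    0   N/A  N/A         2798716      G   obs                                    2609MiB |
|    0   N/A  N/A         2798891      G   projectM-pulseaudio                      16MiB |
"""
print(memory_by_process(sample))
```

For these three rows the two obs entries add up to 3308 MiB. Running the same function on a snapshot taken after Stop Stream would show whether that total drops back.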

ufgkirk avatar May 20 '25 19:05 ufgkirk

Still experiencing this

OBS Studio - 31.1.2

ipaqmaster avatar Aug 13 '25 12:08 ipaqmaster