obs-studio
UHD output performance to UltraStudio 4K Mini is unusable on macOS Intel, perfect on M1
Operating System Info
macOS 12
Other OS
No response
OBS Studio Version
Other
OBS Studio Version (Other)
27
OBS Studio Log URL
https://obsproject.com/logs/9PmYBd-le9uIxrSR
OBS Studio Crash Log URL
No response
Expected Behavior
Smooth rendering of key/fill playback on output
Current Behavior
Stuttered output that drops in performance quickly. Set to 29.97 it's playing maybe 20 fps but quickly drops to 1 fps and lower.
Steps to Reproduce
- macOS 12.4 Intel Mac Pro, OBS 27.2.4
- Install a Thunderbolt Blackmagic UltraStudio 4K Mini
- Set OBS to 3840x2160 29.97 canvas and output, enable RGB in advanced settings
- enable keyed output for UltraStudio. Play something that has alpha channel, and motion
- Performance starts slow, and gets worse. Totally unusable.
Anything else we should know?
- Exact same setup (Intel Mac Pro) but using Decklink 8K card instead of UltraStudio works perfectly fine
- Exact same setup (UltraStudio 4K Mini) but using M1 MacBook Air instead of Mac Pro works perfectly fine
Just to point out the inevitable: Given that reproducing this issue requires access to pretty expensive hardware, it might take some time before someone with the exact same setups available might be able to tackle this issue - hence there might be quite some inactivity before this might be fixed.
Understood. I could potentially allow remote access to my system for testing if someone knows what to look for.
I believe @Fenrirthviti has both an UltraStudio 4K Mini and an Intel Mac (but no M1 machine).
I do have the UltraStudio, but as mentioned, I only have access to an Intel mac at present.
That is where the problem is; on Intel. It works fine on M1.
I'm testing the other issue in the next week or so, so I'll run through this one at the same time.
Can confirm this issue also exists in some form in Windows with the Ultrastudio 4K Mini. 1080p60 Decklink output works perfectly, but 2160p60 results in a garbled picture. Rolled all the way back to OBS 29.0.2 and the issue is still present.
Approaching a year since this issue was introduced. Still no movement?
Apologies, I thought I had provided an update but got pulled away on other issues and forgot about this.
I wasn't able to replicate this issue, but I also don't have enough understanding on what is actually expected to happen, or how to investigate further myself. I am either doing something wrong in the test that I don't understand, or am not experiencing this issue myself.
Are you saying you are able to use the Decklink Output at 2160p60 with version 29.1.3? Version 28.1.2 technically "works" for me, albeit with an insane amount of encoding lag even though neither CPU nor GPU is under full load. Version 29.1.3 outputs a corrupted image through the Decklink Output at 2160p60, although the encoding lag is no longer a problem. 1080p60 works fine. 2160p60 Decklink Output seems to have been broken after version 28. Every Decklink Output mode up to 2kp60 DCI works; 2160p60 and 4kp60 DCI output a corrupted image.
Here's the image with the Decklink Output setting at 1080p60:
https://i.imgur.com/NeZEAwD.jpg
Here's the same image with the Decklink Output setting at 2160p60:
https://i.imgur.com/5w8xhkZ.jpg
I've tested again, and I can't get 2160p60 to work in any version of OBS, it always shows a garbled output.
1080p60 appears to work ok.
My input source is a PS5, just for reference.
Try rolling back to OBS 28.1.2. You should get a proper image, but also get crippling encoding lag.
Now that we've confirmed this is a real issue, how do we get the necessary attention to get it fixed?
As with anything, someone with the requisite experience and knowledge, time, and desire to work on it. Unfortunately, that is not me, as I don't have any experience with the DeckLink SDK or development in this space of the program.
Apparently @DDRBoxman is the author of this part of OBS Studio. Tagging him here for some visibility.
2160p60 Decklink Output with an Ultrastudio 4K Mini is still broken in OBS 30 Beta 1. Tagging @jpark37
I have an Intel Mac Pro but not a 4K studio mini so I can’t test this setup.
I've also repro'd this on a Windows PC. @Fenrirthviti has repro'd it as well as seen in the messages here.
I also have an Ultra Studio 4K Mini and cannot get it to output a 2160p60 signal. The display on the device shows a distorted image. 1080p60 works as mentioned before. I get the same result with a MacBook Pro M2 Pro, a Mac Studio M1 Max or a Lenovo Thinkpad X1 Carbon. The OBS versions I tried are 29.1 and 28.1. What can I do to help solve this issue?
P.S. I also tried a different software (Mimolive) with a similar result, so this might be connected to the BMD driver itself. Unfortunately I have tested a few driver versions but never had any success and since my Ultra Studio is rather new I have never seen a working combination that I can revert to.
Same issue here! I don't understand how it can work on other programs (like pro tools, premiere, Final Cut, etc.) and not on others. I can only guess OBS and other broken apps use an older BMD SDK version that did not support 2160p30 and above.
@polyh3dron seems totally weird to have it work on v28.1.2. Others including me had no luck with it.
Sadly OBS 30.1.0 does not include a fix. I'm pretty convinced the culprit is the ffmpeg library, which is used to output to decklink devices. Maybe different flags are needed to switch to quad link SDI (12G), or even a new binary?
Okay so this has turned into two issues, I'm making a new issue for the distorted image: https://github.com/obsproject/obs-studio/issues/10380
Performance issues should stay on this existing thread.
We're hitting a performance bottleneck here when we download the texture after the GPU scale which shows up with larger resolutions.
https://github.com/obsproject/obs-studio/blob/21f1c155ef33f176c4065868a6edc7951708ee49/UI/frontend-plugins/decklink-output-ui/decklink-ui-main.cpp#L420
I'm going to try to see if I can get some sort of texture ping pong setup working here so we aren't blocking on this call
We're already doing that, not sure why but the texture download off the GPU seems to be blocking when it should be async 🤔
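For readers unfamiliar with the "ping pong" idea discussed in the two comments above, here is a minimal sketch of the scheduling only. It is not the OBS implementation: StagingRing and submit are hypothetical names, and each slot merely records a frame number instead of holding a GPU staging surface. The point it illustrates is that the surface mapped for readback is always the oldest one, so its copy has had the most time to finish.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a ring of GPU staging surfaces. Each slot only
// records which frame was copied into it; no graphics API is involved.
class StagingRing {
public:
    explicit StagingRing(std::size_t slots) : slots_(slots)
    {
        assert(slots >= 2); // one slot being written, at least one readable
    }

    // Queue a copy of `frame` into the current slot, then return the frame
    // held by the oldest slot (the one whose copy was queued longest ago).
    // Returns nothing until every slot has been filled once.
    std::optional<std::uint64_t> submit(std::uint64_t frame)
    {
        slots_[write_] = frame;
        write_ = (write_ + 1) % slots_.size();
        if (filled_ < slots_.size())
            ++filled_;
        if (filled_ < slots_.size())
            return std::nullopt;
        // The oldest slot is the one that will be overwritten next.
        return slots_[write_];
    }

private:
    std::vector<std::uint64_t> slots_;
    std::size_t write_ = 0;
    std::size_t filled_ = 0;
};
```

With two slots the output lags the render by one frame, with three slots by two. That added latency is the price of not waiting on the most recent copy.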
I may have run into a similar situation on Linux, though in my case the performance bottleneck is the memcpy after the texture is mapped. The memcpy takes 3 ms, which seems rather slow on a PCIe 3.0 GPU for a 1920x1080 image. I'm not that fluent in OpenGL, but either this way of downloading textures is inefficient, or there are outstanding GPU commands that have to complete first.
If there is no way to speed up this blocking copy, this should happen asynchronously. But besides: Is it really necessary to download the texture from GPU here? Isn't this already done by libobs somewhere else in order to feed normal video output plugins?
Okay, I did some more experiments. So it doesn't appear to be an issue with outstanding GPU commands or anything. If I only copy half the data, the copy will be twice as fast. So my conclusion is that for whatever reason reading from the mapped texture memory is just slow. Perhaps I'll try to use a normal glGetTexImage to see if it's faster.
Edit: Okay, I've tried glGetTexImage. But it is only marginally faster (~0.5 ms improvement). So I still have no idea for a good solution.
Edit 2: Okay, interestingly Windows doesn't appear to have this issue. Or at least reading after ID3D11DeviceContext::Map is about 3 to 4 times as fast as reading after glMapBuffer (on the same hardware). Not sure what different behavior we are running into there.
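To put the 3 ms figure from the comment above in perspective, here is a rough back-of-the-envelope sketch. It assumes a tightly packed 4-bytes-per-pixel format, which the comment does not state, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one frame, assuming a tightly packed format (no row padding).
constexpr std::uint64_t frame_bytes(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t bytes_per_pixel)
{
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// Effective copy rate in GB/s (10^9 bytes per second) for a copy that took
// `milliseconds` to move `bytes`.
inline double copy_rate_gb_per_s(std::uint64_t bytes, double milliseconds)
{
    assert(milliseconds > 0.0);
    return static_cast<double>(bytes) / (milliseconds * 1e-3) / 1e9;
}

// Time available per frame, in milliseconds, at a given frame rate.
inline double frame_budget_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

Under that assumption a 1920x1080 frame is 8,294,400 bytes, so 3 ms corresponds to roughly 2.8 GB/s. A 3840x2160 frame is four times larger (33,177,600 bytes) and would take about 12 ms at the same rate, against a budget of about 33.4 ms per frame at 29.97 fps. This is only an estimate of the copy cost, not a measurement from the setups in this thread.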
There are quite a few copies involved:
- First a frame from the video cache is copied into the output_frame
- Then the rendered texture is copied into a stage texture (with the expectation that the GPU will do that copy immediately, which will cause trouble at some point)
- Then the texture data is downloaded into a buffer (either a PIXEL_PACK_BUFFER in OpenGL or as a D3D11_MAPPED_SUBRESOURCE in Direct3D)
- In theory this mapping (and thus download of texture data) should not block because a separate stage texture was created which should neither be sampled from in any shader or written to by any draw calls, though maybe it is blocked until the copy operation has actually taken place (see above - might be API specific)
- And finally the texture data is copied into the output_frame again
Because the pointers (and associated memory) holding the texture data are created by the graphics APIs, they are owners of that data, and once gs_stagesurface_unmap is called, the APIs can and will deallocate the memory at some point. So that final copy is required and cannot be avoided, but the first copy (of the cached frame data into the new frame) is unnecessary as the data is supposed to be fully replaced by the staged texture data anyway.
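The final copy described above could look roughly like the sketch below. It is an illustration under assumptions, not the code in the decklink output plugin: copy_mapped_rows is a hypothetical helper, and the mapped pointer and line sizes are assumed to come from whatever the graphics API returned when the stage surface was mapped. The copy has to finish before the surface is unmapped, because the API owns the mapped memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `height` rows from a mapped staging buffer into a caller-owned frame.
// The two buffers may use different row strides (line sizes), so only the
// bytes common to both strides are copied per row.
inline void copy_mapped_rows(std::uint8_t *dst, std::size_t dst_linesize,
                             const std::uint8_t *src, std::size_t src_linesize,
                             std::size_t height)
{
    assert(dst != nullptr && src != nullptr);
    if (dst_linesize == src_linesize) {
        // Same stride: one contiguous copy covers the whole frame.
        std::memcpy(dst, src, dst_linesize * height);
        return;
    }
    const std::size_t row_bytes = std::min(dst_linesize, src_linesize);
    for (std::size_t y = 0; y < height; ++y)
        std::memcpy(dst + y * dst_linesize, src + y * src_linesize, row_bytes);
}
```

Since this copy overwrites every row of the destination, pre-filling the destination from the cached frame beforehand contributes nothing to the result, which is the redundancy pointed out above.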