libobs: Add Source Profiling
Description
This PR adds libobs plumbing for profiling the performance of individual sources.
- [x] OpenGL (works on NVIDIA and AMD on Windows; does not work on macOS because Apple doesn't implement the GL command; untested on Linux)
- [x] Figure out what to do about views/mixes
  - Just collect all of them as samples for the current frame. It's a bit confusing, but there's no good way around it.
A GUI is being worked on separately. A build with the previous WIP GUI can be downloaded here: https://github.com/derrod/obs-studio/actions/runs/7045389326
Motivation and Context
Diagnosing performance issues with OBS can be difficult. In many cases users get told to duplicate their current scene collection and delete stuff until things improve. That's obviously not ideal.
This pull request introduces an API for determining the performance characteristics of individual sources to find outliers and problematic ones, e.g. a complex shader filter that takes too long to render.
Additionally, it can help spot bottlenecks with asynchronous sources, e.g. when a capture card or video does not reach its full frame rate due to resource limitations.
A UI will follow later once the libobs interface has been reviewed and finalised.
How Has This Been Tested?
Rendered some sources, observed the numbers changing (Windows 11, AMD RDNA2, DX11 and OpenGL).
Types of changes
- New feature (non-breaking change which adds functionality)
Checklist:
- [x] My code has been run through clang-format.
- [x] I have read the contributing document.
- [x] My code is not on the master branch.
- [x] The code has been tested.
- [x] All commit messages are properly formatted and commits squashed where appropriate.
- [x] I have included updates to all appropriate documentation.
Would it be possible to have percentage columns for both CPU and GPU? I believe this is what most users would understand.
Not really, because they wouldn't add up to 100. For example, a scene's render time also includes the render time of every item it contains. GPU timings aren't directly cumulative either, as far as I understand. And since this is meant for power users, it's probably fine for it to be more technical/accurate.
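To illustrate why the percentages would not sum to 100, here is a minimal sketch in Python. The `Node` type, the source names, and the timings are all made up for illustration and are not part of the libobs API in this PR. It only models the nesting described above, where a scene's measured time includes the time of everything rendered inside it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a profiled source; not a libobs type."""
    name: str
    self_ms: float  # time spent in this source's own render code
    children: list["Node"] = field(default_factory=list)


def inclusive_ms(node: Node) -> float:
    """Time a profiler wrapping the render call would measure:
    the node's own work plus everything rendered inside it."""
    return node.self_ms + sum(inclusive_ms(c) for c in node.children)


def flatten(node: Node) -> list[Node]:
    """Every node in the tree, parent first."""
    out = [node]
    for child in node.children:
        out.extend(flatten(child))
    return out


# Made-up numbers: a scene holding a webcam and a nested scene.
scene = Node("Main Scene", 0.5, [
    Node("Webcam", 1.0),
    Node("Nested Scene", 0.5, [Node("Game Capture", 4.0)]),
])

frame_ms = inclusive_ms(scene)
rows = {n.name: inclusive_ms(n) for n in flatten(scene)}
percent = {name: 100.0 * ms / frame_ms for name, ms in rows.items()}

for name, pct in percent.items():
    print(f"{name:14s} {rows[name]:5.2f} ms  {pct:6.2f} %")
print(f"sum of percentages: {sum(percent.values()):.2f}")
```

In this toy tree the main scene alone already accounts for 100% of the frame, and the nested scene is counted again on its own row, so a percentage column would total well over 100.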
Could there be an option to hide values that read 0.00? Some task manager alternatives, like Bitsum Process Lasso, do this. It's much easier to read since there's no visual noise in the form of zeroes.
We're back after an unscheduled interruption due to me messing up a rebase somehow.
To move things along I removed the UI part of this PR for now so we can focus on reviewing the backend changes that allow for per-source profiling.
I just want to add that this is very much needed. As a heavy user of OBS scenes, plugins, transitions and filters, I would love to know which ones I could remove for a meaningful resource saving, so I can rethink my configuration. Sometimes removing one plugin is worth far more than removing five or so others, or than reconfiguring some scenes.
Even if we only get this in logs (text or something) for the first release, it is worth merging.
Totally agree, and yes please! I've had rogue shaders and filters that were eating 80% of my GPU, but I thought everything I was adding after that was causing the issues (because it didn't lag until those were added). This would be extremely useful!
Question: Would it be possible to expose these metrics right now, via something like obs-websocket, or obslua, or at least via "dump what we've gathered at shutdown time"? That would make this change both useful and more testable, without giving it a dependency on figuring out the proper UI for it (since UI changes take much longer and in this case are probably harder to do than the actual profiling itself).
This seems like a very much needed feature, and it doesn't seem like there should be any hard requirement on having a UI for it finished (or even designed), for it to be merged and be extremely useful. Plus, if we can merge this and make the data actually accessible, it's virtually guaranteed that someone will come along and write some tools for making use of that data, even before there's an official UI.
Maybe that's a middle ground between "don't have it at all" and "waiting on a full-blown UI"?