obs-studio obs-ffmpeg: Release encode texture early

Description

Under heavy graphics-thread load, acquiring the graphics lock can take a significant amount of time. This change releases the OpenGL texture right after rendering, which avoids taking the lock a second time after the frame has been sent to FFmpeg. In a near-encoder-overload scenario this improves the median, 99th-percentile, and maximum (100th-percentile) encode times, and it modestly raises the ceiling before encoder overload in my test scene.

Motivation and Context

Random dropped frames are no fun.

Master:
min=0 ms, median=4.29 ms, max=33.072 ms, 99th percentile=8.877 ms
min=0 ms, median=4.438 ms, max=77.157 ms, 99th percentile=9.853 ms
min=0 ms, median=4.527 ms, max=57.292 ms, 99th percentile=9.282 ms

This commit:
min=0.97 ms, median=3.009 ms, max=13.215 ms, 99th percentile=5.899 ms
min=1.181 ms, median=2.91 ms, max=9.854 ms, 99th percentile=5.56 ms
min=0.461 ms, median=3.013 ms, max=10.693 ms, 99th percentile=5.871 ms
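For anyone wanting to reproduce figures of this shape, here is a minimal sketch of how min, median, max, and 99th percentile could be derived from a list of per-frame encode times. It is not the code used to produce the numbers above: `encode_stats` and `summarize` are hypothetical names, and the nearest-rank percentile definition is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Summary of per-frame encode times, in milliseconds. */
struct encode_stats {
	double min, median, max, p99;
};

static int cmp_double(const void *a, const void *b)
{
	double x = *(const double *)a, y = *(const double *)b;
	return (x > y) - (x < y);
}

/* Sorts `samples` in place and summarizes them; n must be > 0.
 * p99 uses the nearest-rank definition: rank = ceil(0.99 * n). */
static struct encode_stats summarize(double *samples, size_t n)
{
	struct encode_stats s;
	size_t rank;

	assert(n > 0);
	qsort(samples, n, sizeof(*samples), cmp_double);

	s.min = samples[0];
	s.max = samples[n - 1];
	s.median = (n % 2) ? samples[n / 2]
			   : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;

	rank = (99 * n + 99) / 100; /* integer form of ceil(0.99 * n) */
	s.p99 = samples[rank - 1];
	return s;
}
```

With only a handful of samples the nearest-rank 99th percentile coincides with the maximum, so the two only separate once there are more than 100 samples.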

How Has This Been Tested?

Compared recordings before and after. This seems semantically correct, as it's what the current software encoder does, though a review from @nowrep or anyone who actually knows the semantics of FFmpeg would be appreciated.

Types of changes

Performance enhancement (non-breaking change which improves efficiency)

Checklist:

[x] My code has been run through clang-format.
[x] I have read the contributing document.
[x] My code is not on the master branch.
[x] The code has been tested.
[x] All commit messages are properly formatted and commits squashed where appropriate.
[x] I have included updates to all appropriate documentation.

Apr 21 '24 16:04 kkartaltepe

This breaks when the encoder isn't done with the previously submitted frame (e.g. when it needs to change the encode order with B-frames).

If the issue is about taking the graphics thread lock twice, then it could be changed to do both the texture import and the copy in one lock. Another option would be to cache the textures per VASurface (ffmpeg reuses VA surfaces from a pool).

Apr 21 '24 17:04 nowrep

> If the issue is about taking the graphics thread lock twice, then it could be changed to do both the texture import and copy in one lock

~~These are in one lock~~ That lock is so fast it's almost never really preempted; it's releasing the texture once we have submitted the frame to ffmpeg that requires the lock again. We also don't want to hold the graphics lock while sending the frame to ffmpeg, since that would stall rendering.

> Another option would be to cache the textures for VASurface (ffmpeg reuses VA surfaces from pool).

This is what I do for QSV. If you have an example of enumerating the ffmpeg frame pool, I can do that too.

Apr 21 '24 17:04 kkartaltepe

> These are in one lock. So fast it's almost never really preempted; it's releasing the texture once we have submitted the frame to ffmpeg that requires the lock again. We also don't want to hold the graphics lock while sending the frame to ffmpeg, since that will stall rendering.

All this could be done in that one required lock that does the copy before encode.

> This is what I do for QSV. If you have an example of enumerating the ffmpeg frame pool, I can do that too.

It still needs to create a new AVFrame every frame, but it would only do vaExportSurfaceHandle + texture import once for every VASurfaceID (frame->data[3]).
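A rough sketch of that caching idea, assuming a small fixed-size table keyed by surface ID. Everything here is a stand-in: `surface_id` replaces VASurfaceID, `mock_import_surface` replaces the real vaExportSurfaceHandle + texture import, and `CACHE_SLOTS` is a guessed upper bound on the pool size, not something taken from ffmpeg.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 32 /* assumed upper bound on the surface pool size */

typedef unsigned int surface_id; /* stands in for VASurfaceID */

struct tex_cache_entry {
	int used;
	surface_id id;
	int texture; /* placeholder for an imported texture handle */
};

struct tex_cache {
	struct tex_cache_entry slots[CACHE_SLOTS];
	int imports; /* number of expensive export+import calls performed */
};

/* Hypothetical stand-in for vaExportSurfaceHandle + texture import. */
static int mock_import_surface(struct tex_cache *c, surface_id id)
{
	c->imports++;
	return (int)id + 1000;
}

/* Returns the cached texture for `id`, importing it on first use.
 * Returns -1 if every slot is already taken by another surface. */
static int tex_cache_get(struct tex_cache *c, surface_id id)
{
	struct tex_cache_entry *free_slot = NULL;

	assert(c != NULL);
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct tex_cache_entry *e = &c->slots[i];
		if (e->used && e->id == id)
			return e->texture;
		if (!e->used && !free_slot)
			free_slot = e;
	}
	if (!free_slot)
		return -1;

	free_slot->used = 1;
	free_slot->id = id;
	free_slot->texture = mock_import_surface(c, id);
	return free_slot->texture;
}
```

Since the pool hands back the same surfaces over and over, the import count should settle at the number of distinct surfaces rather than growing with the frame count.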

Apr 21 '24 17:04 nowrep

> All this could be done in that one required lock that does the copy before encode.

Sorry, now I see what you are saying. This actually performs even better.
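To make the ordering concrete, here is a small mock comparing the two shapes discussed in this thread. None of these functions are the real OBS or ffmpeg calls; the `mock_*` names are hypothetical and only count lock acquisitions and check that the frame is never sent while the lock is held.

```c
#include <assert.h>

/* Hypothetical stand-ins for the graphics lock, texture, and encoder. */
static int lock_depth;        /* 1 while the mock graphics lock is held */
static int lock_acquisitions; /* how many times the lock was taken */
static int live_textures;     /* imported textures not yet destroyed */

static void mock_enter_graphics(void)
{
	assert(lock_depth == 0);
	lock_depth = 1;
	lock_acquisitions++;
}

static void mock_leave_graphics(void)
{
	assert(lock_depth == 1);
	lock_depth = 0;
}

static void mock_import_texture(void)
{
	assert(lock_depth == 1);
	live_textures++;
}

static void mock_copy_texture(void)
{
	assert(lock_depth == 1 && live_textures > 0);
}

static void mock_destroy_texture(void)
{
	assert(lock_depth == 1 && live_textures > 0);
	live_textures--;
}

static void mock_send_frame(void)
{
	/* Sending must happen outside the lock so rendering is not stalled. */
	assert(lock_depth == 0);
}

/* Before: the texture outlives the send, so the lock is taken twice. */
static void encode_two_locks(void)
{
	mock_enter_graphics();
	mock_import_texture();
	mock_copy_texture();
	mock_leave_graphics();

	mock_send_frame();

	mock_enter_graphics();
	mock_destroy_texture();
	mock_leave_graphics();
}

/* After: import, copy, and release all happen in the one required lock. */
static void encode_one_lock(void)
{
	mock_enter_graphics();
	mock_import_texture();
	mock_copy_texture();
	mock_destroy_texture();
	mock_leave_graphics();

	mock_send_frame();
}
```

In the mock, the second variant takes the lock once per frame instead of twice, which is the acquisition that gets expensive under graphics-thread pressure.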

Apr 21 '24 18:04 kkartaltepe

Looks good now, thanks.

Apr 21 '24 19:04 nowrep

Thanks for the review

Apr 21 '24 20:04 kkartaltepe
