
avfilter/vf_overlay_videotoolbox: add fast code path for bgra overlay

Open gnattu opened this issue 1 month ago • 0 comments

The previous implementation had to convert both the main and overlay frames to BGRA textures and then convert the result back to YUV. This round trip is bandwidth-heavy.

Add a faster shader for the case where the overlay is in BGRA format; it calculates the YUV values directly in the shader. This eliminates the need to convert the main frame and avoids an extra copy of the overlay frame, more than doubling throughput when overlaying 10-bit 1080p HEVC inputs on an M1 Max (190 fps -> 407 fps).

The RGB-to-YUV formula is currently hard-coded to a premultiplied BT.709 matrix.
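For readers unfamiliar with the math, here is a minimal CPU-side sketch in C of what such a per-pixel calculation could look like. It is not the Metal shader from this patch: the type and function names (`PremulRGBA`, `YUV`, `overlay_bgra_on_yuv`) are hypothetical, and it assumes normalized full-range values with chroma centered at 0.5, whereas the real shader may use a different range or scaling. Only the BT.709 luma coefficients (0.2126, 0.7152, 0.0722) come from the standard itself.

```c
/* Hypothetical types for illustration only; not taken from the patch. */
typedef struct { float r, g, b, a; } PremulRGBA; /* premultiplied alpha, 0..1 */
typedef struct { float y, u, v; } YUV;           /* assumed full range, chroma centered at 0.5 */

/* BT.709 luma coefficients for red and blue; green is derived from them. */
static const float KR = 0.2126f;
static const float KB = 0.0722f;

/*
 * Blend one premultiplied overlay pixel onto one YUV main pixel.
 * The overlay is converted to YUV on the fly, so the main frame
 * never has to leave YUV space.
 */
static YUV overlay_bgra_on_yuv(YUV main_px, PremulRGBA ov)
{
    const float kg = 1.0f - KR - KB;

    /* Because r, g, b are premultiplied, these are alpha-scaled Y, Cb, Cr. */
    float y  = KR * ov.r + kg * ov.g + KB * ov.b;
    float cb = (ov.b - y) / (2.0f * (1.0f - KB));
    float cr = (ov.r - y) / (2.0f * (1.0f - KR));

    /* Standard premultiplied "over" operator: out = src + (1 - a) * dst.
     * The 0.5 chroma offset is scaled by alpha to stay premultiplied. */
    float inv_a = 1.0f - ov.a;
    YUV out;
    out.y = y + inv_a * main_px.y;
    out.u = cb + 0.5f * ov.a + inv_a * main_px.u;
    out.v = cr + 0.5f * ov.a + inv_a * main_px.v;
    return out;
}
```

With these assumptions, a fully transparent overlay pixel leaves the main pixel unchanged, and an opaque white overlay pixel yields luma close to 1 with neutral chroma.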

Changes

  • add faster code path for bgra overlay

Issues

gnattu, Jul 04 '24 19:07