[Windows+WMF] Unity audio output causes decode stalls when using segmented media formats

Open TapGhoul opened this issue 8 months ago • 1 comments

Unity version

2022.3.22f1

Unity editor platform

Windows

AVPro Video edition

Trial

AVPro Video version

2.8.5

Device hardware

Ryzen 9 5950X, NVIDIA GeForce RTX 3080 Ti, 32GB RAM

Which Windows version are you using?

11

Graphics API

D3D 11

Video API

Media Foundation

Audio output

Unity

Any other Media Player component configuration required to reproduce the issue.

None

Which output component(s) are you using?

Audio Output

Any other component configuration required to reproduce the issue.

None

The issue

When playing back fMP4-based HLS or MPEG-DASH, the audio buffer underruns every time a new segment is ready to play. After a lot of digging, it seems that when using the Unity audio output mode, WMF does not decode the next segment's audio (AAC in my case) until it is time to present a frame from that segment. This stalls the audio in native code, resulting in an underrun, and thus audio hitching, in Unity.

This does not occur when using the native output mode - it seems like WMF is correctly decoding ahead of time. For whatever reason though, WMF is not decoding audio in a segment it is not currently presenting when using the Unity output mode.

Some way to declare a custom decode buffer size/duration for WMF (separate from the download buffer), however that is specified in its APIs, would likely help here. In reality, though, anything that causes WMF to start decoding the next segment's audio before it is time to present it would work.
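To illustrate the behaviour described above, here is a toy model in Python. It is only a sketch under stated assumptions, not how WMF or AVPro Video is implemented: `decode_latency_ms` (time before a segment's first samples are available), `decode_ahead_ms` (how early decoding of a segment starts) and the 20 ms audio callback interval are all hypothetical values chosen for illustration.

```python
def simulate(segment_ms=2000, segments=3, callback_ms=20,
             decode_ahead_ms=0, decode_latency_ms=60):
    """Toy model: count audio callbacks that find no decoded audio.

    decode_ahead_ms:   how long before a segment's first presentation time
                       the decoder starts on it (0 = only once it is due).
    decode_latency_ms: hypothetical delay before the segment's first
                       samples become available after decoding starts.
    """
    underruns = 0
    total_ms = segment_ms * segments
    for now in range(0, total_ms, callback_ms):
        seg = now // segment_ms
        seg_start = seg * segment_ms
        # Time at which this segment's audio is first available.
        ready_at = seg_start - decode_ahead_ms + decode_latency_ms
        # The first segment is assumed to be pre-rolled before playback.
        if seg > 0 and now < ready_at:
            underruns += 1
    return underruns


if __name__ == "__main__":
    print(simulate())                     # decode on demand: 6 underruns
    print(simulate(decode_ahead_ms=500))  # decode ahead: 0 underruns
```

In this model, decoding on demand produces a short run of empty callbacks at every segment boundary, while any decode-ahead at least as long as the decode latency removes them. Shorter segments mean more boundaries and therefore more underruns over the same playback time.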

Note that although this report is against the v2.8.5 trial, the issue also occurs in v2.8.5 full (seen in VRChat) and in the v3.2.4 trial version.

This may also reproduce on other platforms, but the only platform I have personally tested here (thus far) is Windows. WMF-specific settings such as "use low latency", "hardware decoding" and "use audio delay" have no impact on this, whether enabled or disabled.

Media information

https://bitdash-a.akamaihd.net/content/MI201109210084_1/m3u8s-fmp4/f08e80da-bf1d-4e3d-8899-f0f6155f6efa.m3u8 is a good demo. Any other fragmented MP4 source, such as any fMP4-encoded HLS or MPEG-DASH stream, works too.

The source does not need to be live; it just needs to be fMP4. Formats encoded with shorter segment sizes, segments encoded as independent (closed GOP + split on keyframe), or HLS playlists marked with #EXT-X-DISCONTINUITY for every segment make this issue far more prominent, and may make it easier to reproduce.
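For anyone building a local repro, a minimal Python sketch that writes an fMP4 media playlist with short segments and a discontinuity marker before every segment after the first might look like this. The segment and init file names are hypothetical placeholders; the media files themselves would have to be produced separately by a packager.

```python
import math


def build_playlist(segment_count, segment_seconds=2.0,
                   init_uri="init.mp4", discontinuity=True):
    """Build an HLS VOD media playlist for fMP4 segments.

    File names (init.mp4, segment_00000.m4s, ...) are placeholders.
    """
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:7",
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_seconds)}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
        f'#EXT-X-MAP:URI="{init_uri}"',
    ]
    for i in range(segment_count):
        # Mark a discontinuity before every segment except the first.
        if discontinuity and i > 0:
            lines.append("#EXT-X-DISCONTINUITY")
        lines.append(f"#EXTINF:{segment_seconds:.3f},")
        lines.append(f"segment_{i:05d}.m4s")
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_playlist(3, segment_seconds=1.0))
```

Lowering `segment_seconds` increases the number of segment boundaries per minute of playback, which should make the hitching more frequent.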

Log output

N/A - there is no log info here, as there is no error. You would need to manually instrument AudioOutputManager::RequestAudio() to see when AudioOutputManager::GrabAudio() returns false.
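If that instrumentation were added and the media time of each failed grab recorded, a small helper along these lines could check whether the failures cluster just after segment boundaries. This is only a sketch: the list of timestamps, the uniform segment duration, media time starting at 0, and the 0.25 s window are all assumptions, and nothing here is part of AVPro Video.

```python
def underruns_near_boundaries(underrun_times_s, segment_seconds, window_s=0.25):
    """Count underruns that occur within window_s after a segment boundary.

    underrun_times_s: media times (seconds) at which a grab failed,
                      as recorded by hypothetical manual instrumentation.
    """
    near = 0
    for t in underrun_times_s:
        offset = t % segment_seconds  # time since the most recent boundary
        # Ignore the first segment, which has no preceding boundary.
        if t >= segment_seconds and offset <= window_s:
            near += 1
    return near


if __name__ == "__main__":
    times = [2.02, 4.05, 5.0]
    print(underruns_near_boundaries(times, segment_seconds=2.0))
```

If nearly all recorded failures fall inside the window, that would support the theory that the stall is tied to the start of each new segment.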

TapGhoul · Mar 15 '25 02:03