Pipewire ends up in xrun loop with IPC4

Open ujfalusi opened this issue 1 year ago • 13 comments

It has been reported [1] [2] and confirmed [3] that on MTL+ some applications can break audio rendering because PipeWire ends up in a constant xrun loop. PipeWire can even decide that an xrun happened on an already running stream which otherwise would not have any problems. The issue can be reproduced easily by forcing PW to open multiple (all) ALSA PCM devices at once:

Start a web browser and play a YouTube video or play audio (video is better as it will show the video stuttering), then use one of the following:

A) WebEx
- Start the WebEx standalone application, then open Settings -> Audio
- Click on Test for "Ringers and alerts" (make sure that All Devices is selected), let it run and then press Stop
- The audio (and video) is going to stutter

B) pavucontrol
- Start pavucontrol
- If audio remains OK, close it and wait for PW to close the extra PCM device (use [4] for monitoring)
- Repeat until the audio starts stuttering

From the kernel log it can be seen that when audio breaks, all PCMs are rapidly stopped, prepared and started (a standard xrun recovery sequence), but PW ends up in a loop.

The reason might be that PW is not aware of how SOF audio works (which is kind of similar to USB audio): SOF uses the DSP to reduce system power consumption, so the host side DMA (the hw_ptr) does not run continuously, it 'jumps' in bursts. With IPC4 the SOF stack reports the 'delay', which user space can use to gain more insight into the progress of the audio.

[1] https://github.com/thesofproject/sof/issues/9695#issuecomment-2569033847
[2] https://github.com/thesofproject/sof/issues/9695#issuecomment-2569629604
[3] https://github.com/thesofproject/sof/issues/9695#issuecomment-2575494520
[4] watch 'for pcm in /proc/asound/card*/pcm*; do echo ${pcm}; cat ${pcm}/sub0/status; done'
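For anyone who wants to log the output of [4] instead of watching it, here is a minimal Python sketch that parses the text of one `status` file into a dict. The function names and the sample values are made up for illustration, and the parser only assumes the usual `key : value` layout of the file (or the single word `closed`); pointer wrap-around is ignored.

```python
# Sketch: parse the text of /proc/asound/cardX/pcmY/sub0/status.
# SAMPLE_STATUS holds invented values, not output captured from the issue.

SAMPLE_STATUS = """state: RUNNING
owner_pid   : 2054
trigger_time: 191.851583000
tstamp      : 0.000000000
delay       : 1104
avail       : 7088
avail_max   : 7088
-----
hw_ptr      : 528
appl_ptr    : 1632
"""


def parse_pcm_status(text):
    """Return the status fields as a dict; integer fields become ints."""
    text = text.strip()
    if text == "closed":
        return {"state": "closed"}
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # separator lines such as "-----" carry no field
        value = value.strip()
        fields[key.strip()] = int(value) if value.isdigit() else value
    return fields


def frames_queued(fields):
    """Playback frames written by the application but not yet consumed."""
    return fields["appl_ptr"] - fields["hw_ptr"]


if __name__ == "__main__":
    status = parse_pcm_status(SAMPLE_STATUS)
    print(status["state"], status["hw_ptr"], frames_queued(status))
```

Sampling `frames_queued()` over time is one way to see how close the sw and hw pointers get before the stuttering starts.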

Cc: @lgirdwood, @kv2019i, @ranj063, @bardliao, @lvanderree, @carlinigraphy

Jan 08 '25 07:01 ujfalusi

One short term workaround is to switch from pipewire to pulseaudio.

Jan 08 '25 07:01 ujfalusi

@lvanderree, @carlinigraphy, can you try this (thanks to @kv2019i for digging this out!): Create ~/.config/wireplumber/wireplumber.conf.d/50-alsa-config.conf with the following content:

monitor.alsa.rules = [
  {
    # Matches all SOF alsa sinks
    matches = [
      {
        api.alsa.card.name = "~sof-*"
      },
      {
        node.name = "~alsa_output.*"
      }
    ]
    actions = {
      update-props = {
        api.alsa.headroom = 1024
      }
    }
  }
]

This will force PW to add space between the hw and sw pointers. As I have tried to explain, the hw pointer jumps in the case of SOF due to the use of the DSP, so it is possible that it jumps over the sw pointer, which is an xrun situation.

Jan 08 '25 07:01 ujfalusi

Filed a pipewire issue for this at https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4489 . It would seem that with default settings, PipeWire tries to restart streaming with a very small amount of data (128 frames when the period size is set to 1024), which seems problematic with SOF devices that may transfer multiple milliseconds worth of audio in bursts. Let's keep this ticket open as well until we have confirmed the root cause.

Jan 08 '25 13:01 kv2019i

@ujfalusi, I've created the config file, and will report back after some testing.

Jan 08 '25 15:01 carlinigraphy

Oh - that's about 2.7ms at 48kHz (128 frames) for the xrun restart. The DMA could grab 2ms of this to fill the DSP buffer on trigger(start), leaving 32 frames (or about 0.7ms) in the driver buffer, which may not be expected by pipewire (pipewire may expect 1ms or less to be consumed at restart).
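The arithmetic above can be checked with a short Python sketch. It assumes a 48 kHz stream and a 2 ms initial DMA burst, as in the comment; the helper names are made up for illustration.

```python
# Sketch: reproduce the restart arithmetic, assuming 48 kHz and a 2 ms burst.

RATE_HZ = 48_000


def frames_to_ms(frames, rate_hz=RATE_HZ):
    """Duration of a number of frames in milliseconds."""
    return frames * 1000 / rate_hz


def ms_to_frames(ms, rate_hz=RATE_HZ):
    """Number of frames that cover the given duration."""
    return round(ms * rate_hz / 1000)


RESTART_FILL = 128                 # frames PipeWire queues at restart
BURST = ms_to_frames(2)            # frames the DMA may take at trigger(start)
LEFT = RESTART_FILL - BURST        # frames remaining in the driver buffer

if __name__ == "__main__":
    print(f"restart fill: {RESTART_FILL} frames = {frames_to_ms(RESTART_FILL):.2f} ms")
    print(f"initial burst: {BURST} frames")
    print(f"left over: {LEFT} frames = {frames_to_ms(LEFT):.2f} ms")
```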

Jan 08 '25 16:01 lgirdwood

Hard to say if the config file change above "fixed" things. I've been messing around with BT devices and pulseaudio settings for the last few hours without a crash though.

Jan 08 '25 19:01 carlinigraphy

@kv2019i, @lgirdwood, I have gathered some numbers without the headroom set to 1024 and it is interesting and might just give some hints. Using WebEx method in both cases.

IPC3

Opening audio tab starts the PCM0 playback and capture
[ 64.664105] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 64.666715] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 144
[ 64.669375] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 288
[ 64.672044] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 384
[ 64.674717] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 528

Pressing the Test starts other PCMs

Pressing Stop causes an xrun and the PCM0 playback is restarted:
[ 69.767491] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 0)
[ 69.769697] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 69.771356] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 96
[ 69.771383] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 96
[ 69.772806] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 144
[ 69.775514] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 288
[ 69.778160] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 384
[ 69.780806] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: platform pointer: 528

IPC4

Opening audio tab starts the PCM0 playback and capture
[ 191.851583] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 191.854677] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 240
[ 191.854704] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 240
[ 191.856162] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 288
[ 191.858836] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 432
[ 191.861509] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 528

Pressing the Test starts other PCMs

Pressing Stop causes an xrun and the PCM0 playback is restarted and the xrun loop starts:
[ 199.882117] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 0)
[ 199.906534] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 199.909007] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 192
[ 199.909013] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 0)
[ 199.933611] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 199.936412] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 192
[ 199.936417] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 0)
[ 199.967739] snd_sof:sof_pcm_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: Entry: trigger (cmd: 1)
[ 199.970411] snd_sof:sof_pcm_pointer: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: pcm_ops pointer: 192

The DMA jumps in both cases, but with IPC3 the hw_ptr is at 96 on the first pointer call, while with IPC4 it has already moved further, to 192. If PW only placed 128 samples, then IPC4 is in xrun: 192 frames is 4ms worth of samples at 48kHz.

If I apply the headroom change to 1024 then these xruns are not triggered anymore in the first place, so no xrun loop is happening. I guess when WebEx is stopped then PW tries to do rewind or something which collides with the batched DMA if the sw and hw pointers are kept too close?

But this explains why IPC3 appears to work OK and IPC4 ends up in xrun.
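To compare such logs without reading them line by line, here is a rough Python sketch that groups the `sof_pcm_trigger` / `sof_pcm_pointer` debug lines into runs and flags the stop/start pattern seen with IPC4. The regular expressions only assume the line format shown above, cmd 1 is treated as start and cmd 0 as stop, and the "stopped after at most one pointer read" rule is an ad hoc heuristic of this sketch, not something SOF or PipeWire defines.

```python
import re

# The excerpts are taken from the restart portions of the logs in this comment.
_PFX = "sof-audio-pci-intel-tgl 0000:00:1f.3: pcm0 (HDA Analog), dir 0: "
IPC3_LOG = "\n".join([
    "[ 69.767491] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 0)",
    "[ 69.769697] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 1)",
    "[ 69.771356] snd_sof:sof_pcm_pointer: " + _PFX + "platform pointer: 96",
    "[ 69.772806] snd_sof:sof_pcm_pointer: " + _PFX + "platform pointer: 144",
])
IPC4_LOG = "\n".join([
    "[ 199.882117] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 0)",
    "[ 199.906534] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 1)",
    "[ 199.909007] snd_sof:sof_pcm_pointer: " + _PFX + "pcm_ops pointer: 192",
    "[ 199.909013] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 0)",
    "[ 199.933611] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 1)",
    "[ 199.936412] snd_sof:sof_pcm_pointer: " + _PFX + "pcm_ops pointer: 192",
    "[ 199.936417] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 0)",
    "[ 199.967739] snd_sof:sof_pcm_trigger: " + _PFX + "Entry: trigger (cmd: 1)",
    "[ 199.970411] snd_sof:sof_pcm_pointer: " + _PFX + "pcm_ops pointer: 192",
])

TRIGGER_RE = re.compile(r"\[\s*(\d+\.\d+)\].*sof_pcm_trigger:.*\(cmd: (\d+)\)")
POINTER_RE = re.compile(r"\[\s*(\d+\.\d+)\].*sof_pcm_pointer:.*pointer: (\d+)")


def summarize_runs(log_text):
    """One dict per trigger start: first pointer seen, reads, stopped flag."""
    runs, current = [], None
    for line in log_text.splitlines():
        trig = TRIGGER_RE.search(line)
        if trig:
            if int(trig.group(2)) == 1:      # start
                current = {"start": float(trig.group(1)), "first_pointer": None,
                           "pointer_reads": 0, "stopped": False}
                runs.append(current)
            elif current is not None:        # stop of a run we saw starting
                current["stopped"] = True
                current = None
            continue
        ptr = POINTER_RE.search(line)
        if ptr and current is not None:
            if current["first_pointer"] is None:
                current["first_pointer"] = int(ptr.group(2))
            current["pointer_reads"] += 1
    return runs


def looks_like_xrun_loop(runs, max_reads=1, min_cycles=2):
    """Heuristic: several runs were stopped right after their first pointer read."""
    short = [r for r in runs if r["stopped"] and r["pointer_reads"] <= max_reads]
    return len(short) >= min_cycles


if __name__ == "__main__":
    for name, log in (("IPC3", IPC3_LOG), ("IPC4", IPC4_LOG)):
        runs = summarize_runs(log)
        print(name, [r["first_pointer"] for r in runs], looks_like_xrun_loop(runs))
```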

Jan 09 '25 08:01 ujfalusi

I did some initial tests with this config, after restoring PW PulseAudio (on PipeWire 1.2.7) and performed the tests via WebEx, and everything looks stable this time! I am currently not connected to my HDMI speakers, nor have I made calls via WebEx, so once I have done that I will come back with an update, but it looks promising.

Jan 09 '25 09:01 lvanderree

@kv2019i and indeed, if I set the headroom to 240 (5ms), I don't see xruns with IPC4.

Jan 09 '25 09:01 ujfalusi

Update: setting headroom to 1ms (48) also works since it will prevent the initial xrun which would be triggering the flood of xruns.
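For trying other headroom values, the following Python sketch converts a headroom in milliseconds to frames and renders a rule with the same structure as the config posted earlier in this thread. It assumes a 48 kHz stream; `headroom_frames` and `render_rule` are hypothetical helper names, not part of WirePlumber.

```python
# Sketch: render the WirePlumber rule from this thread for a headroom in ms.
# Assumes 48 kHz; the rule structure is copied from the config posted above.

RULE_TEMPLATE = """monitor.alsa.rules = [
  {{
    matches = [
      {{
        api.alsa.card.name = "~sof-*"
      }},
      {{
        node.name = "~alsa_output.*"
      }}
    ]
    actions = {{
      update-props = {{
        api.alsa.headroom = {frames}
      }}
    }}
  }}
]
"""


def headroom_frames(ms, rate_hz=48_000):
    """Headroom in frames for a duration given in milliseconds."""
    return round(ms * rate_hz / 1000)


def render_rule(ms, rate_hz=48_000):
    """Text for 50-alsa-config.conf with the requested headroom."""
    return RULE_TEMPLATE.format(frames=headroom_frames(ms, rate_hz))


if __name__ == "__main__":
    print(render_rule(1))
```

The output can be saved as ~/.config/wireplumber/wireplumber.conf.d/50-alsa-config.conf; WirePlumber most likely needs a restart before it takes effect.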

Jan 10 '25 06:01 ujfalusi

@udaymb @sathya-nujella fyi

Jan 10 '25 15:01 lgirdwood

If the conf does not work, the Lua script might (~/.config/wireplumber/main.lua.d/51-sof.lua):

rule = {
  matches = {
    {
      { "node.name", "matches", "alsa_output.*" },
      { "api.alsa.card.name", "matches", "sof-*" },
    },
  },
  apply_properties = {
      ["api.alsa.headroom"]             = 1024
  },
}
table.insert(alsa_monitor.rules,rule)

Aug 20 '25 07:08 ujfalusi

Upstream merge request: https://gitlab.freedesktop.org/pipewire/wireplumber/-/merge_requests/740

Sep 18 '25 05:09 ujfalusi