alsa-lib icon indicating copy to clipboard operation
alsa-lib copied to clipboard

Strange trigger_htstamp after snd_pcm_drop

Open z-s-e opened this issue 1 year ago • 20 comments

I have conceptually the following situation (actual code can be found here):

// sound is currently playing
snd_pcm_status(pcm, pcm_status);
snd_pcm_status_get_htstamp(pcm_status, &ts1);
snd_pcm_drop(pcm);
snd_pcm_status(pcm, pcm_status);
snd_pcm_status_get_trigger_htstamp(status, &ts2);

On my hardware I get a ts2 timestamp that is earlier than ts1. From my understanding that should be impossible, as ts2 should contain the timestamp of when the drop happened, which should obviously be after ts1. Is that a bug or do I misunderstand the meaning of those timestamps?

z-s-e avatar Mar 07 '24 13:03 z-s-e

It looks like a bug. Could you show the plugin chain (using snd_pcm_dump or so) ? Or are you using direct hw:X device?

perexg avatar Mar 07 '24 13:03 perexg

I'm using front:CARD=Generic_1,DEV=0

snd_pcm_dump gives:

Soft volume PCM
Control: PCM Playback Volume
min_dB: -51
max_dB: 0
resolution: 256
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S32_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate 

z-s-e avatar Mar 07 '24 14:03 z-s-e

There should be more information (lines after Slave:).

perexg avatar Mar 07 '24 14:03 perexg

ah yeah, sorry.

Soft volume PCM
Control: PCM Playback Volume
min_dB: -51
max_dB: 0
resolution: 256
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S32_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 32
  buffer_size  : 96000
  period_size  : 48000
  period_time  : 1000000
  tstamp_mode  : ENABLE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 48000
  period_event : 0
  start_threshold  : 9223372036854775807
  stop_threshold   : 96000
  silence_threshold: 0
  silence_size : 0
  boundary     : 6755399441055744000
Slave: Hardware PCM card 1 'HD-Audio Generic' device 0 subdevice 0
Its setup is:
  stream       : PLAYBACK
  access       : MMAP_INTERLEAVED
  format       : S32_LE
  subformat    : STD
  channels     : 2
  rate         : 48000
  exact rate   : 48000 (48000/1)
  msbits       : 32
  buffer_size  : 96000
  period_size  : 48000
  period_time  : 1000000
  tstamp_mode  : ENABLE
  tstamp_type  : MONOTONIC
  period_step  : 1
  avail_min    : 48000
  period_event : 0
  start_threshold  : 9223372036854775807
  stop_threshold   : 96000
  silence_threshold: 0
  silence_size : 0
  boundary     : 6755399441055744000
  appl_ptr     : 96000
  hw_ptr       : 47178

z-s-e avatar Mar 07 '24 14:03 z-s-e

So it's basically equivalent to hw:X device so it appears like a kernel issue, but the kernel changes the trigger tstamp only if the state changes. It may be possible that the stream was already stopped for a reason (underrun or so). Could you check the stream state for the first snd_pcm_status call?

perexg avatar Mar 07 '24 14:03 perexg

The code is called directly from a switch statement that checks for the state here, so this code gets only called when the last update is in SND_PCM_STATE_RUNNING but I also just re-checked it by verifying the value of snd_pcm_status_get_state right before the drop.

z-s-e avatar Mar 07 '24 14:03 z-s-e

Is trigger timestamp identical before and after drop?

perexg avatar Mar 07 '24 14:03 perexg

Good call, indeed they are the same (I assume you meant after the drop+update_status).

z-s-e avatar Mar 07 '24 14:03 z-s-e

Well anyway, thanks for clarifying this is a kernel bug.

I can create a bug over at bugzilla.kernel.org if you want, but otherwise I guess I close this then as it is not a alsa-lib bug.

z-s-e avatar Mar 08 '24 13:03 z-s-e

I'd like to keep this open until resolved.

perexg avatar Mar 11 '24 12:03 perexg

Some more info: This happens on my integrated audio (ALC287 Analog) as well as when outputting to hdmi. I've also tested a usb interface, there it does not happen.

z-s-e avatar Mar 19 '24 09:03 z-s-e

@perexg I do not have a kernel dev setup yet to test/verify, but looking at the code I might have an idea of what the problem is: It seems the hda_controller sets the trigger_tstamp_latched boolean at stream start, but it never gets reset. From my understanding the following may fix this:

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 4057f9f10aee..7ddc4dc9829d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -1196,6 +1196,7 @@ static void snd_pcm_trigger_tstamp(struct snd_pcm_substream *substream)
        if (runtime->trigger_master == substream) {
                if (!runtime->trigger_tstamp_latched)
                        snd_pcm_gettime(runtime, &runtime->trigger_tstamp);
+               runtime->trigger_tstamp_latched = false;
        } else {
                snd_pcm_trigger_tstamp(runtime->trigger_master);
                runtime->trigger_tstamp = runtime->trigger_master->runtime->trigger_tstamp;

Edit: Or alternatively, if the intention of that flag is to never get reset and make the driver be always responsible for the timestamps, this:

diff --git a/sound/pci/hda/hda_controller.c b/sound/pci/hda/hda_controller.c
index 5d86e5a9c814..2e7e3f73af3d 100644
--- a/sound/pci/hda/hda_controller.c
+++ b/sound/pci/hda/hda_controller.c
@@ -277,6 +277,8 @@ static int azx_pcm_trigger(struct snd_pcm_substream *substream, int cmd)
        snd_hdac_stream_sync_trigger(hstr, false, sbits, sync_reg);
        if (start)
                snd_hdac_stream_timecounter_init(hstr, sbits);
+       else
+               snd_pcm_gettime(hstr->substream->runtime, &hstr->substream->runtime->trigger_tstamp);
        spin_unlock(&bus->reg_lock);
        return 0;
 }

z-s-e avatar Jul 23 '24 10:07 z-s-e

I can now confirm that my idea was correct. Attached is a reproducer, it can be verified under qemu with an ich9-intel-hda device.

Actually at first my fix didn't seem to work, but it turned out that the default alsa device contained dmix here, which does not call the SNDRV_PCM_IOCTL_DROP of the underlying device when dropping, so with dmix the timestamps are still broken even with my patch, just as a heads-up.

I had one more idea for an alternate fix; just remove this trigger_tstamp_latched stuff. The hda_controller is the only driver using that, and while I understand the timestamp under the spinlock there can potentially be more accurate, I am not so sure it is worth it. In qemu I measured around 300 nanoseconds difference between the timestamp taken in the driver vs the timestamp in snd_pcm_trigger_tstamp, so ~~unless one gets unlucky and there is some preemption in-between~~ it should be negligible (EDIT 2: after reading the code some more, there cannot be any preemption as the stream lock is held, so that bus->reg_lock spinlock seems entirely unrelated to the timestamp).

Anyway, I'd like to get started contributing to the kernel, so this seems like a good opportunity. Could you tell me which fix you prefer, so I can prepare a patch for the ML?

z-s-e avatar Aug 09 '24 21:08 z-s-e

I'll just go ahead with proposing the removal. That appears to me to be the most robust solution, and personally I am always in favor of removing unneeded complexity.

z-s-e avatar Aug 12 '24 13:08 z-s-e

Latched timestamps were added in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2b79d7a6bf341ebeffb85fc2b3577d0cfa810c49 . Just fix the trigger callback in hda_controller.c.

perexg avatar Aug 12 '24 13:08 perexg

Yes, I've seen that commit already. Just adding that to the hda_controller might not fix it for other uses of snd_hdac_stream_timecounter_init (which is the place that raises that flag, and grepping shows it is also called by sound/soc/intel/skylake/skl-pcm.c), so that seems like a more risky fix that potentially needs to be kept in mind for all hda drivers, that is why I rather would remove it, especially since it does not seem to provide any real value.

z-s-e avatar Aug 12 '24 13:08 z-s-e

Please, sent a proposed commit with a good description to [email protected] mailing list with Cc to all relevant maintainers of affected drivers (including sound/soc/intel tree). Here is not a right place to discuss this removal.

perexg avatar Aug 12 '24 13:08 perexg

Yep, thanks.

z-s-e avatar Aug 12 '24 13:08 z-s-e

BTW: Another way may be to add a new start argument to snd_hdac_stream_timecounter_init and do just timestamp when this argument is false. This will catch all hda_controller code uses.

perexg avatar Aug 12 '24 13:08 perexg

Right. Personally I still prefer the removal, but if the consensus on the ML will be to not remove this after all, I'll change it to the agreed solution then.

z-s-e avatar Aug 12 '24 13:08 z-s-e

Resolved in https://github.com/tiwai/sound/commit/df5215618fbe425875336d3a2d31bd599ae8c401 .

perexg avatar Oct 02 '24 14:10 perexg