linux icon indicating copy to clipboard operation
linux copied to clipboard

5.10.63 causes 1000s of lines of errors relating to hdmi-audio-codec hdmi-audio-codec.2.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19

Open graysky2 opened this issue 3 years ago • 47 comments

Describe the bug

  1. When kodi starts, dmesg is polluted with thousands of lines like this:
hdmi-audio-codec hdmi-audio-codec.2.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
  1. If I start my RPi4 with the attached TV off, the RPi4 does not detect that an external monitor is connected and the only way to enable HDMI output is to reboot the RPi4 with the TV on.

Downgrading to 5.10.59-1 fixes both issues. There are only 5 lines in dmesg displaying that error. That kernel is based on this commit: https://github.com/raspberrypi/linux/commit/22201d4868f66aa7773d5c4bad7bab1795ee6757

To reproduce

  • Install 5.10.63-1 on Arch ARM aarch64 (https://github.com/raspberrypi/linux/commit/597cd63b48c6dc3746827504f191744c120fbb4d)
  • Start kodi

Expected behaviour

  1. dmesg should not be spammed with 1000s of lines
  2. HDMI detection should work regardless of TV on/off state

Actual behaviour

  1. dmesg is spammed with 1000s of lines.
  2. HDMI detection is broken relative to 5.10.59

System Copy and paste the results of the raspinfo command in to this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:

  • Which model of Raspberry Pi? RPi4B Rev 1.4
  • Which OS and version? Arch ARM aarch64
  • Which firmware version (vcgencmd version)?
Aug 31 2021 14:52:47 
Copyright (c) 2012 Broadcom
version 67615e950e1e28b92dfae6303cf7a8b879a8908f (clean) (release) (start)
  • Which kernel version (uname -a)? Linux treetop 5.10.63-1-ARCH #1 SMP PREEMPT

Logs If applicable, add the relevant output from dmesg or similar.

Additional context Add any other relevant context for the problem.

graysky2 avatar Sep 11 '21 01:09 graysky2

I submitted a RFC patch last year, but it wasn't accepted because the issue must be solved differently: https://www.spinics.net/lists/dri-devel/msg284535.html

More recent discussion: https://www.spinics.net/lists/alsa-devel/msg127150.html

lategoodbye avatar Sep 11 '21 08:09 lategoodbye

I find it odd that under the older kernel, I do not see this problem. To me something changed between those two kernel commits. I verified this on a separate RPi4 connected to a different monitor.

  • Booting into the kernel built off this commit is fine: https://github.com/raspberrypi/linux/commit/22201d4868f66aa7773d5c4bad7bab1795ee6757
  • Booting into the kernel build off this commit is not: https://github.com/raspberrypi/linux/commit/597cd63b48c6dc3746827504f191744c120fbb4d

I can run a bisect but before I do that work/time I wanted to get some feedback. Thanks.

graysky2 avatar Sep 11 '21 11:09 graysky2

A bisect would be useful. https://github.com/raspberrypi/linux/pull/4538 would be my first guess (although I'm only seeing the usual couple of dmesg entries with a kernel with that in).

popcornmix avatar Sep 11 '21 12:09 popcornmix

I will git it a shot but I am having an issue getting the bisect to track the rpi/5.10.y branch. It seems to be using another branch.

% git status
On branch rpi-5.10.y
Your branch is up to date with 'origin/rpi-5.10.y'.

nothing to commit, working tree clean

% git bisect start
% git bisect bad 597cd63b48c6dc3746827504f191744c120fbb4d
% git bisect good 22201d4868f66aa7773d5c4bad7bab1795ee6757
Bisecting: 174 revisions left to test after this (roughly 8 steps)
[8f499a90e7eecafd44e8206a3ab586d024930485] mmc: mmci: stm32: Check when the voltage switch procedure should be done

When I built this commit the boot process froze. Usually I see 4x raspberry pi logos when booting, but this kernel was 4 tux logos. Am I some how jumping branches?

graysky2 avatar Sep 11 '21 12:09 graysky2

Yes, the upstream merge commits (e.g. Linux 5.10.60 do complicate things). You can just do a git bisect --skip for a commit upstream. Or manually check the commits after each Linux 5.10.60 type commit and then bisect between them.

Someone with more knowledge may know a better way.

popcornmix avatar Sep 11 '21 13:09 popcornmix

Google has failed me. Running git bisect skip seems to be stepping back one at a time but after 5 itterations, I am still unable to find the hash git bisect wants me to build in the rpi-5.10.y tree. Must be a better way to do this staying on the rpi-5.10.y branch.

graysky2 avatar Sep 11 '21 13:09 graysky2

My workaround has been to write a script to revert all of the commits in question, cherry-pick the originals, and bisect through the cherry-picks.

pelwell avatar Sep 11 '21 13:09 pelwell

@pelwell - Do you have some code you can share for this case?

good: https://github.com/raspberrypi/linux/commit/22201d4868f66aa7773d5c4bad7bab1795ee6757 bad: https://github.com/raspberrypi/linux/commit/597cd63b48c6dc3746827504f191744c120fbb4d

Alternatively, how can I just get an ordered list of the all commits between those two points? I could manually do the bisect with simple checkouts and manually notes.

EDIT: no, this doesn't work either.

% git rev-list 2201981d674d..HEAD|wc -l                        
9119

There are not 9,119 commits between the last-good and HEAD staying on the branch.

graysky2 avatar Sep 11 '21 14:09 graysky2

I updated the first post in this Issue just now to detail further breakage. Note that reverting the PR @popcornmix called out (assuming I did it correctly, see these two patches and provide feedback), did not fix the issue.

0001-Revert-drm-vc4-hdmi-Actually-check-for-the-connector.patch.txt 0002-Revert-drm-probe-helper-Create-a-HPD-IRQ-event-helpe.patch.txt

graysky2 avatar Sep 11 '21 15:09 graysky2

There are a few commits in the range that don't want to revert without intervention, but fortunately I think we can ignore them.

Try this:

$ glo --no-merges 22201d4..597cd63 | cut -d' ' -f1 | grep -v -E "(2566c1d|8849a8c)" | xargs -n 1 git revert --no-edit
$ glo --no-merges 22201d4..597cd63 | cut -d' ' -f1 | grep -v -E "(2566c1d|8849a8c)" | tac | xargs -n 1 git cherry-pick

There are 344 commits that have been reverted and reapplied (git branch -vv should show us to be 688 commits ahead of upstream) , so we can:

$ git bisect start
$ git bisect good HEAD~344
$ git bisect bad HEAD

pelwell avatar Sep 11 '21 16:09 pelwell

glo? Is that a wrapper around git?

popcornmix avatar Sep 11 '21 16:09 popcornmix

Sorry, yes -- it's muscle memory for me. The full version is:

$ git log --oneline --no-merges 22201d4..597cd63 | cut -d' ' -f1 | grep -v -E "(2566c1d|8849a8c)" | xargs -n 1 git revert --no-edit
$ git log --oneline --no-merges 22201d4..597cd63 | cut -d' ' -f1 | grep -v -E "(2566c1d|8849a8c)" | tac | xargs -n 1 git cherry-pick

pelwell avatar Sep 11 '21 16:09 pelwell

https://github.com/graysky2/linux-1/tree/try Trying now...

graysky2 avatar Sep 11 '21 16:09 graysky2

79aec5148d54db32f92da3232325411d5d3a8916 is the first bad commit
commit 79aec5148d54db32f92da3232325411d5d3a8916
Author: Maxime Ripard <[email protected]>
Date:   Tue May 11 17:05:11 2021 +0200

    drm/vc4: hdmi: Remove the DDC probing for status detection

Log

% git bisect log
git bisect start
# good: [b4b4599db2e70ac200201de62cc65aeb1e21e0ec] Revert "iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels"
git bisect good b4b4599db2e70ac200201de62cc65aeb1e21e0ec
# bad: [9c080eb267f8a42e11c5def69575d7362503c860] gpio-fsm: Clamp the delay time to zero
git bisect bad 9c080eb267f8a42e11c5def69575d7362503c860
# bad: [fa2d25b0314a9880b6c340d7fb6abaf348167bc3] ovs: clear skb->tstamp in forwarding path
git bisect bad fa2d25b0314a9880b6c340d7fb6abaf348167bc3
# good: [b23bec822620daa5667cc5b421e75e5ab6cc4246] powerpc/smp: Fix OOPS in topology_init()
git bisect good b23bec822620daa5667cc5b421e75e5ab6cc4246
# bad: [a994014c7b08bdc5d00ca643451f66b5c80ee7c8] dmaengine: of-dma: router_xlate to return -EPROBE_DEFER if controller is not yet available
git bisect bad a994014c7b08bdc5d00ca643451f66b5c80ee7c8
# good: [ce311708d967b4fb51e7780b2deea8448901d2f5] drm/probe-helper: Create a HPD IRQ event helper for a single connector
git bisect good ce311708d967b4fb51e7780b2deea8448901d2f5
# bad: [a0504af09349b0bfcbe386cb88e39e9426dc0578] media: zr364xx: fix memory leaks in probe()
git bisect bad a0504af09349b0bfcbe386cb88e39e9426dc0578
# bad: [2dd6c17c378d951bf9c864d978744fc75534ab57] ath9k: Clear key cache explicitly on disabling hardware
git bisect bad 2dd6c17c378d951bf9c864d978744fc75534ab57
# bad: [79aec5148d54db32f92da3232325411d5d3a8916] drm/vc4: hdmi: Remove the DDC probing for status detection
git bisect bad 79aec5148d54db32f92da3232325411d5d3a8916
# good: [14db98aca2e26b56b42d05785b9b2567f54d1c86] drm/vc4: hdmi: Actually check for the connector status in hotplug
git bisect good 14db98aca2e26b56b42d05785b9b2567f54d1c86
# first bad commit: [79aec5148d54db32f92da3232325411d5d3a8916] drm/vc4: hdmi: Remove the DDC probing for status detection

graysky2 avatar Sep 11 '21 17:09 graysky2

That commit from my branch maps to https://github.com/raspberrypi/linux/commit/17a554471341cabeb583feb02f04eb785d7d9bf8

0001-Revert-drm-vc4-hdmi-Remove-the-DDC-probing-for-statu.patch

When I reverted it from HEAD, the bug is resolved:

  1. No more spamming dmseg (only 5 entries).
  2. TV is detected even though it is off when booting.
% uname -a
Linux treetop 5.10.63-3-ARCH #1 SMP PREEMPT Sat Sep 11 13:31:02 EDT 2021 aarch64 GNU/Linux
% dmesg | tail
[  +0.000026] random: 7 urandom warning(s) missed due to ratelimiting
[  +0.027801] Console: switching to colour frame buffer device 240x67
[  +0.020980] vc4-drm gpu: [drm] fb0: vc4drmfb frame buffer device
[  +3.515289] bcmgenet fd580000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[  +0.000078] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[  +3.593794] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
[  +0.007797] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
[  +0.004113] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
[  +0.001735] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19
[  +0.001877] hdmi-audio-codec hdmi-audio-codec.3.auto: ASoC: error at snd_soc_dai_startup on i2s-hifi: -19

graysky2 avatar Sep 11 '21 17:09 graysky2

Ping @mripard

popcornmix avatar Sep 11 '21 18:09 popcornmix

Do you have a prebuilt image (or some way to generate one) I can grab to reproduce this bug ?

mripard avatar Sep 14 '21 10:09 mripard

I don't... it's just Arch ARM + kodi-rpi package. If it is helpful for you, I could tar it up and you could just extract that tarball to prepared SD card. Let me know.

graysky2 avatar Sep 14 '21 10:09 graysky2

@graysky2 I'm not seeing this issue with LibreELEC running kodi on 5.10.63. So there may be something unique to the specific arch versions of kodi/kernel, some settings you've selected within kodi or your display/edid. Do you have the ability to upload the whole sdcard (e.g. dd and gzip)? Otherwise instructions starting from a clean Arch install that are confirmed to show the problem would be useful.

I assume you have one hdmi display connected? And to HDMI0 (next to power)? Are you using 4kp60?

popcornmix avatar Sep 14 '21 10:09 popcornmix

I am happy to spin up a fresh instance and create an image. I never used dd to do it but will google it.

Yes, one HDMI connected to HDMI0 but not 4kp60. The monitor is 1080p HDTV.

graysky2 avatar Sep 14 '21 11:09 graysky2

As I test, I took the uSD card from the RPi4 connected to the HDTV on which I see this bug and moved it to another RPi4 which is connected to a different monitor, all other settings were the same, I do not see the bug.

When I put the uSD back into the original RPi4 connected to the HDTV and boot, I see the bug.

This is consistent with @popcornmix experience and hypothesis that it is my-setup-specific. Very annoying.

graysky2 avatar Sep 14 '21 15:09 graysky2

If it's display specific I'd expect it's related to edid or hotplug behaviour. Can you use the edid from good display on bad (and/or vice versa) and see if issue follows the edid? (e.g. using drm.edid_firmware in cmdline.txt).

popcornmix avatar Sep 14 '21 15:09 popcornmix

Or capture the EDID and post it here for analysis and testing with.

6by9 avatar Sep 14 '21 15:09 6by9

Yes and Yes. Just need some time.

graysky2 avatar Sep 14 '21 15:09 graysky2

One thing worth testing too would be to double-check the cables and connectors. Before that patch was merged, it was relying on fetching the EDID as the primary mean to detect whether the display was connected or not. If this doesn't work anymore, it hints at the fact that the hotplug detect line stays (or is read) low for some reason

mripard avatar Sep 14 '21 15:09 mripard

@mripard - Connections are tight. Is there a test to verify that fetching was successful?

graysky2 avatar Sep 14 '21 15:09 graysky2

I'd try to swap things around :)

You have the setup A with a Pi4, cable and TV that doesn't work, and a setup B with a Pi4, cable and monitor that works.

I'd try to test Pi4 B with the cable B on the display A. If it still doesn't work, it's an issue related to the TV. If it works, try the Pi4 B with the cable A on the display A. If it doesn't work, the issue is the cable. If it works, the issue is the Pi4 A.

mripard avatar Sep 14 '21 15:09 mripard

I extracted the EDID from the monitor that did not exhibit the bug and placed it on the setup exhibiting the bug. Booting with it fixes the bug.

Here is the EDID from the "good" monitor:

# base64 /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid       
AP///////wBBDI/BFAYAABkdAQOAPCJ4KmehpVVNoicOUFS/7wDRwLMAlQCBgIFAgcABAQEBTdAA
oPBwPoAwIDUAVVAhAAAao2YAoPBwH4AwIDUAVVAhAAAaAAAA/ABQSEwgMjc2RThWCiAgAAAA/QAX
UB6gPAAKICAgICAgAQsCAzPxTJAEAx8TARJdXl9gYSMJBweDAQAAbQMMABAAOHggAGABAgNn2F3E
AXiAA+MPAAxWXgCgoKApUDAgNQBVUCEAAB4COoAYcTgtQFgsRQBVUCEAAB4BHQByUdAeIG4oVQBV
UCEAAB5NbICgcHA+gDAgOgBVUCEAABoAAAAATg==

Here is the EDID extracted from the HDTV causing the bug:

# base64 /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid
AP///////wAebQEAAQEBAQEZAQOAoFp4Cu6Ro1RMmSYPUFShCAAxQEVAYUBxQIGAAQEBAQEBAjqA
GHE4LUBYLEUAQIRjAAAeZiFQsFEAGzBAcDYAQIRjAAAeAAAA/QA6Ph5TEAAKICAgICAgAAAA/ABM
RyBUVgogICAgICAgAeYCAxnxSJAiIAUEAwIBIwlXB2cDDAAgAIAeAjqAGHE4LUBYLAQFQIRjAAAe
AR2AGHEcFiBYLCUAQIRjAACeAR0AclHQHiBuKFUAQIRjAAAejArQiiDgLRAQPpYAQIRjAAAYAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA+A==

EDIT: I should mention that booting with the EDID "good" from the other monitor renders the TV not outputing anything. It indicates "invalid mode."

graysky2 avatar Sep 14 '21 16:09 graysky2

@mripard @popcornmix - I think the experiment swapping the EDIDs was a bit of a red herring. Turns out that if I simply extract the EDID from the HDTV and load it via /boot/cmdline.txt, I am able to run a kernel based on https://github.com/raspberrypi/linux/commit/3cae59f7aeffaf19313022eac4a92cb6cf3081ee without reverting the commit I found from the bisect. Could that mean something is wrong with EDID detection somehow?

For the record, on the RPi4 setup that originally triggered this bug:

# cp /sys/devices/platform/gpu/drm/card1/card1-HDMI-A-1/edid /lib/firmware/edid.dat
# cat /boot/cmdline.txt 
root=/dev/mmcblk0p2 rw rootwait console=serial0,115200 console=tty1 selinux=0 plymouth.enable=0 smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 kgdboc=serial0,115200 drm.edid_firmware=edid.dat video=HDMI-A-1:D

graysky2 avatar Sep 14 '21 17:09 graysky2

I think the "D" at end of video= forces hotplug, which I think you are saying fixes the issue? (I'm a bit confused by your final comment that suggests the pi4 that triggered the bug has the edid= and video= bits on cmdline. Or are you saying it no longer has the bug with those added?) Try without the edid= part.

popcornmix avatar Sep 14 '21 19:09 popcornmix