linux icon indicating copy to clipboard operation
linux copied to clipboard

provide interface to query loaded firmware and topology to userspace via debugfs

Open kv2019i opened this issue 2 years ago • 21 comments

As the logic has got more complicated to select the SOF firmware and topology filenames to load, it is getting more and more difficult to write robust test-cases that need to match test-content with the topology and firmware loaded.

To make test case development easier and more robust, the kernel should expose the firmware and topologies that have been loaded via debugfs or sysfs.

Example discussion in sof-test: https://github.com/thesofproject/sof-test/pull/956#discussion_r970169343

@marc-hb EDIT: this "example" discussion is actually a MUST READ FIRST. It's about deciding what logging solution to use based on the firmware type.

kv2019i avatar Sep 19 '22 10:09 kv2019i

Thanks @kv2019i for initiating. This helps in case of Chrome too as we have multiple diffferent path for FW and TPLG as below

eg1 : intel/sof/community/ eg2 : intel/sof/<board_name>/

We have multiple topology for same board name too:

eg1: intel/sof/board1/board1_2mic.tplg eg2: intel/sof/board1/board1_4mic.tplg

So i was thinking we can either (a) We get FW and TPLG path if we enable debug logs, the same print can be made dev_info - this might be easier but prone to extra log ( Also during long duration automation, the log might have been overwritten

Better solution is to provide as debugfs

So the FW and TPLG path gets recorded ( we can pick the path when its about to load) to ensure we have all the suffix covered.

sathyap-chrome avatar Sep 19 '22 10:09 sathyap-chrome

@sathyap-chrome the Chrome case is inextricable, since you have overlays the name of the files can be ambiguous and point to different topologies or firmware files. You will need a MD5 signature to identify the files I am afraid....

plbossart avatar Sep 19 '22 12:09 plbossart

@plbossart The place where we request the final FW or TPLG to load has all the path information including the overlay names to be added. I can do some experiments and share more insights tomorrow.

sathyap-chrome avatar Sep 19 '22 14:09 sathyap-chrome

Hopefully short digression: the https://www.kernel.org/doc/html/latest/driver-api/firmware/request_firmware.html API should really provide a list of all firmware names (not just audio FW and tplg) that were requested and successfully loaded somewhere in /sys/. A checksum would be ideal. Unlike the audio (or other) driver it could also report about /lib/firmware/updates/ etc.

This seems like a very basic security requirement to me (even more basic that "Tainted!") but what do I know. Right now it does not even log the names by default.

I typically add a hack like this:

--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -552,7 +552,7 @@ fw_get_filesystem_firmware(struct device *device, struct fw_priv *fw_priv,
                size = rc;
                rc = 0;
 
-               dev_dbg(device, "Loading firmware from %s\n", path);
+               dev_warn(device, "MARC Loading firmware from %s\n", path); 
                if (decompress) {
                        dev_dbg(device, "f/w decompressing %s\n",
                                fw_priv->fw_name);
@@ -868,6 +868,10 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
                fw = NULL;
        }
 
+       dev_warn(device, "MARC %s firmware=%s, ret=%d\n",
+                __func__, name, ret);
+
        *firmware_p = fw;
        return ret;
 }

marc-hb avatar Sep 23 '22 04:09 marc-hb

this enhancement request will be parked until we have more clarity on the ask.

plbossart avatar Sep 28 '22 13:09 plbossart

@plbossart Isn't the sof-test need quite clear -> https://github.com/thesofproject/sof-test/pull/956#discussion_r970169343

kv2019i avatar Sep 28 '22 13:09 kv2019i

no it's not @kv2019i. I have no context and no desire to read pages of mtrace-related threads.

plbossart avatar Sep 28 '22 13:09 plbossart

Let me retry then @plbossart . Maybe we need to also deepdive into the generic request_firmware() and why the information is not logged.

kv2019i avatar Sep 28 '22 16:09 kv2019i

The kernel simply needs to tell userspace whether it loaded sof-tgl.ri or community/sof-tgl.ri or cavs-whatever.dsp. Same for the topology file. That's all, no need to read any sof-test pull request.

Anything still unclear?

marc-hb avatar Sep 28 '22 16:09 marc-hb

Maybe we need to also deepdive into the generic request_firmware() and why the information is not logged.

No, parsing logs is slow and unreliable when switching ipc_type or using some overrides (sof-test already does it in some places).

marc-hb avatar Sep 28 '22 17:09 marc-hb

The kernel simply needs to tell userspace whether it loaded sof-tgl.ri or community/sof-tgl.ri or cavs-whatever.dsp. Same for the topology file. That's all, no need to read any sof-test pull request.

Anything still unclear?

@marc-hb dmesg already has this information doesnt it?

ranj063 avatar Sep 28 '22 17:09 ranj063

dmesg is a ring buffer in non-persistent RAM. So it wraps around and loses any information very quickly. dmesg is convenient in interactive use but totally useless in scripts (it's also missing userspace logs, timestamps,... I digress). Let's forget dmesg. Rephrasing your question:

@marc-hb journalctl -b already has this information doesnt it?

Yes and as I just mentioned above (we posted at almost the same time) we already have code in some sof-test places that tries to parse journalctl. However this parsing has bugs and limitations (e.g: switching ipc_type, not failing after unloading the driver, dependency on the log level, lack of test developers and shell expertise, ...), hence this request.

marc-hb avatar Sep 28 '22 17:09 marc-hb

However this parsing has bugs and limitations (e.g: switching ipc_type, not failing after unloading the driver, dependency on the log level, lack of test developers and shell expertise, ...), hence this request.

I am not aware of any bugs or request to fix the existing solution. Seriously we've provided this firmware name forever and there's been no ask for any change for a very long time, and now we need a TDB interface to fix everything. Allow me to be reasonably suspicious on this one. We have to prioritize and that feels like a low low priority to me.

plbossart avatar Sep 29 '22 08:09 plbossart

Here's the latest example of how brittle and hard to maintain is the current solution:

  • https://github.com/thesofproject/sof-test/pull/973

There are many others in the sof-test git log and there will be more.

The brittleness of parsing debug logs seems pretty obvious to me (while 973 does something different, I can't even remember why. Good luck)

Seriously we've provided this firmware name forever

I'm lost sorry: who provided what?

and there's been no ask for any change for a very long time,

IPC4 / dsp_basefw.bin is the straw that breaks the camel's back.

marc-hb avatar Oct 20 '22 17:10 marc-hb

there's no dependency on options @marc-hb, the logs tell you exactly what was done.

490a625b017758 (Pierre-Louis Bossart 2020-01-07 10:08:40 -0600  43)             dev_dbg(sdev->dev, "request_firmware %s successful\n",
490a625b017758 (Pierre-Louis Bossart 2020-01-07 10:08:40 -0600  44)                     fw_filename);

plbossart avatar Oct 20 '22 18:10 plbossart

Chiming in from #3819. A checksum or full path of the active fw/tplg in /sys would definitively help us. Partners working with us were confused by whether or not the udev rules we use to override the paths are working. cat /sys/module/snd_sof_*/parameters/* does help, but the path is relative. And if the path is not overridden, it says null, in that case we need to dive back to the code to see what the defaults are, which is a task not everyone working with the firmware files is capable of.

afq984 avatar Oct 21 '22 02:10 afq984

Partners working with us were confused by whether or not the udev rules we use to override the paths are working.

BTW https://superuser.com/questions/677106/how-to-check-if-a-udev-rule-fired/ (again: logs)

marc-hb avatar Oct 21 '22 16:10 marc-hb

@kv2019i @marc-hb @aiChaoSONG we've got

with SOF: [    3.394615] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Loaded firmware library: ADSPFW, version: 2.0.0.1

with cavs_fw [ 3.432765] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Loaded firmware library: ADSPFW, version: 10.29.0.5399

in the journalctl to make the decision between whether it is SOF or not. We can even writ this base FW library version to the fw_version debugfs entry for sof-test to use. Would that be enough?

Im not sure why the SOF library version is 2.0.0.1. It should really be 4.0.0.* but nevertheless the cavs version isnt going to change. It always seems to being with major 10.

ranj063 avatar Oct 22 '22 00:10 ranj063

[version numbers] in the journalctl to make the decision between whether it is SOF or not.

If we can have some firm promise, "user-ABI-like" commitment that these version numbers can be relied upon then sure, why not use them.

Im not sure why the SOF library version is 2.0.0.1.

Off to a great start! :-) Sorry, could not resist.

We can even writ this base FW library version to the fw_version debugfs entry for sof-test to use. Would that be enough?

debugfs will always have my preference compared to a slow, buggy and brittle logs parser. However what sof-test or similar need the most right now is a tiny bit of thought and design and some final decisions. The decision process in sof-test today is a patchwork of inconsistent hacks piled next to or on top of each other and sof-test#973 looks like the straw breaking this camel's back.

cc: @dbaluta , @andyross

marc-hb avatar Oct 22 '22 00:10 marc-hb

Summarizing a short private chat "stolen" from @plbossart's busy schedule: while he didn't have the time to really understand the need (yet?), he has no strong opposition to some careful debugfs additions on principle.

marc-hb avatar Oct 26 '22 22:10 marc-hb

/sys/module/snd_sof_*/parameters/* ... And if the path is not overridden, it says null,..

@plbossart just told we could change that. I will look into it.

marc-hb avatar Nov 03 '22 22:11 marc-hb

I'd still like to do (some of) this and I think it's more needed than ever (see all backlinks above) but I'm soon going away for a long vacation until Sept 2023. Unassigning myself.

marc-hb avatar Jul 17 '23 15:07 marc-hb

fw_profile is brand new and still a bit buggy (e.g. https://github.com/thesofproject/sof-test/pull/1178) bit it's there and it works.

marc-hb avatar Apr 23 '24 22:04 marc-hb