[BUG][LLEXT] Logging is impossible during module unloading
Describe the bug
When a pipeline is freed, all its components are freed too. If any of them is implemented by an LLEXT module and this is its last instance, the memory, where the module was stored in SRAM, will be unmapped. If the module performs any logging in its .reset() or .free() methods, the firmware will fail. We don't know exactly how it fails in such cases, there's no exception dump, but the GLB_DELETE_PIPELINE IPC will time out.
A likely reason is Zephyr logging subsystem accessing some of the data, registered using the LOG_MODULE_REGISTER() macro, from its low priority delayed processing thread. Supposedly modifying the logging system to only access that data during inline processing wouldn't be simple. Possible other (partial) solutions:
- Add comments to all those functions (simple but not very effective)
- Add a flag that effectively disables logging from the moment
.reset()is entered (probably simple, but only partially effective, because that would affect only SOF logging functions liketr_info(), Zephyr-native logging usingLOG_INF()and others would remain unaffected.
To Reproduce
Insert a logging call in an LLEXT module's .free() method and create and free a pipeline with it.
Reproduction Rate 100%.
Expected behavior At least the firmware shouldn't crash. In the best case logging should still work, even if in some reduced form.
Impact Currently the impact is low. It's trivial to remove all logging from those functions when converting them to modules. But it is inevitable that at some point this will be forgotten and somebody will try to print some logs in those functions, even if only during development or debugging.
Environment Current SOF "main"
Shouldn't Zephyr help with that?
I have a hunch that this might be fixed by https://github.com/zephyrproject-rtos/zephyr/pull/74568.
The current Zephyr does not correctly handle merged section relocations properly, and the logging subsys defines symbols in specific sections that are merged with .data/.rodata. They might therefore be linked at a wrong address, and result in this kind of issues.
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.