zephyr icon indicating copy to clipboard operation
zephyr copied to clipboard

zbus: Add Zephyr message bus

Open rodrigopex opened this issue 2 years ago • 20 comments

Pull request for adding zbus to Zephyr.

The Zephyr message bus - Zbus is a lightweight and flexible message bus enabling a simple way for threads to talk to one another. Threads can broadcast messages to all interested observers using zbus. Many-to-many communication is possible. The bus implements message-passing and publish/subscribe communication paradigms that enable threads to communicate synchronously or asynchronously through shared memory. The communication through zbus channel-based, where threads publish and read to and from using messages. Additionally, threads can observe channels and receive notifications when the channels are modified.

Zbus presentation slides

Zbus documentation

Features table

Characteristic zbus
Made for Increase code quality by enabling reuse and increasing testability
Metaphor Messaging bus
Message definition time Compile-time
Message definition approach Decentralized, developers can define channels in different files
Message allocation style Static (compile-time). It is possible to use dynamically allocated messages. See dyn_channel sample
Message persistency Persistent, it still exists after processed
Message distribution pattern Hybrid: Message-passing (listeners) and Publish/subscribe (subscribers)
Subscription time Compile-time (it can be disabled in execution time by masking the subscriber) and runtime registration is possible
Message transmission style Direct transmission when using callbacks style. Two-factor for asynchronous transmission where, first, the event dispatcher sends the id of the changed channel to the subscriber, and second the subscriber decides if it reads the content or not. The transmission order is defined by the position of the subscriber on the subscribing list
Observer execution styles Synchronous for listeners (by callback), asynchronous for subscribers (by queue)
Event dispatcher There is no centralized event dispatcher. The dispatcher logic is executed by the publisher thread context. It enables the execution in different priority contexts and avoids preemption of the centralized approach
Implementation approach Static memory, mutexes, sys_slists, and queues
Use code generation (tools/scripts) No, only macros
Extension mechanism (you can add your functionality) Yes (see uart_bridge and remote_mock samples)
Maturity Feature-complete

This pull request aims to add zbus (message bus/event manager) discussed at RFC #45910 to Zephyr. With the implemented tests, it was possible to reach 90%+ of line and 80%+ of branch coverage. All the APIs were designed to be like the Zephyr's. For example, the channel publish is almost the same as a message queue put function. There is a claim/finish like the ring buffer and so on. I guess zbus would help several developers to create even more reusable and decoupled code. The pull request commits are separated into feature code, tests, and samples. I have made individual commits to help reviewers.

Base feature code commit:

  • zbus: zbus: Add message bus subsystem to Zephyr

Zbus tests:

  • tests: zbus: Add zbus unittests
  • tests: zbus: Add zbus integration tests
  • tests: zbus: Add zbus user_data usage tests
  • tests: zbus: Add zbus dynamic channels tests
  • tests: zbus: Add zbus runtime observer registration tests

Zbus samples:

  • samples: zbus: Add the hello world sample to zbus
  • samples: zbus: Add work queue sample
  • samples: zbus: Add dynamic channels sample
  • samples: zbus: Add the UART bridge sample
  • samples: zbus: Add the remote mock sample
  • samples: zbus: Add runtime observer registration sample
  • samples: zbus: Add benchmark sample

Activities during the PR:

  • [x] Tests coverage (results: Line 93%, Branch 81%);
  • [x] Compliance issues, there are some, but I guess they are not related to the submission;
  • [x] General documentation;
  • [x] Samples documentation.

rodrigopex avatar Jul 31 '22 23:07 rodrigopex

@gmarull

Thanks for your contribution, I've done a first pass review.

Thank you for the comments.

My main concern is on how channels/messages are defined. Could you take a look at iterable sections? I think they would simplify things and would allow for a cleaner solution.

I will take a look at that. This is, in fact, a tricky part of the code.

rodrigopex avatar Aug 02 '22 14:08 rodrigopex

I have made some heavy changes to the way of generating channels. Now it is using Iterable Sections. The code is cleaner and smaller. I really good improvement. Thanks a lot, @gmarull, for the advice. There is a lot of work to do. I will address that in the next few days.

rodrigopex avatar Aug 03 '22 01:08 rodrigopex

@rodrigopex please make sure CI passes and that commit structure is ok (for example, https://github.com/zephyrproject-rtos/zephyr/pull/48509/commits/9adf191c0da7329a2a88ae6bbd77ea8bf0889b4b misses proper commit description)

gmarull avatar Aug 03 '22 09:08 gmarull

Sorry for that. I will do. Could you take a look at the new way of defining channels?

rodrigopex avatar Aug 03 '22 14:08 rodrigopex

@gmarull The issues I am facing with CI now seem to be something other them my submission. Checkpatch and clang-format are fighting each other. The warnings below I could not remove:

-:455: WARNING:SUSPECT_CODE_INDENT: suspect code indent for conditional statements (8, 8)
#455: FILE: include/zephyr/zbus/zbus.h:163:
+	FOR_EACH_NONEMPTY_TERM(_ZBUS_OBS_EXTERN, (;), _observers)                                 \
+	STRUCT_SECTION_ITERABLE(zbus_channel, _name) = {                                           \

-:457: WARNING:LONG_LINE: line length of 110 exceeds 100 columns
#457: FILE: include/zephyr/zbus/zbus.h:165:
+		.name = STRINGIFY(_name), .flag = {false, _on_changed, _read_only},                          \

- total: 0 errors, 2 warnings, 3718 lines checked

The building errors seem related to iterable sections. I could not replicate those locally.

: && ccache /opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/xtensa-intel_apl_adsp_zephyr-elf-gcc   zephyr/CMakeFiles/zephyr_pre0.dir/misc/empty_file.c.obj -o zephyr/zephyr_pre0.elf  -fuse-ld=bfd  -Wl,-T  zephyr/linker_zephyr_pre0.cmd  -Wl,-Map=/__w/zephyr/zephyr/twister-out/intel_adsp_cavs15/samples/subsys/zbus/hello_world/sample.zbus.hello_world/zephyr/zephyr_pre0.map  -Wl,--whole-archive  app/libapp.a  zephyr/libzephyr.a  zephyr/arch/common/libarch__common.a  zephyr/arch/arch/xtensa/core/libarch__xtensa__core.a  zephyr/arch/arch/xtensa/core/startup/libarch__xtensa__core__startup.a  zephyr/lib/libc/minimal/liblib__libc__minimal.a  zephyr/lib/posix/liblib__posix.a  zephyr/soc/xtensa/intel_adsp/common/libintel_adsp_common.a  zephyr/subsys/zbus/libsubsys__zbus.a  zephyr/drivers/interrupt_controller/libdrivers__interrupt_controller.a  zephyr/drivers/timer/libdrivers__timer.a  modules/xtensa/libmodules_xtensa_hal.a  -Wl,--no-whole-archive  zephyr/kernel/libkernel.a  zephyr/CMakeFiles/offsets.dir/./arch/xtensa/core/offsets/offsets.c.obj  -L"/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0"  -L/__w/zephyr/zephyr/twister-out/intel_adsp_cavs15/samples/subsys/zbus/hello_world/sample.zbus.hello_world/zephyr  -lgcc  zephyr/arch/common/libisr_tables.a  zephyr/soc/xtensa/intel_adsp/common/libintel_adsp_common.a  -no-pie  -Wl,--gc-sections  -Wl,--build-id=none  -Wl,--sort-common=descending  -Wl,--sort-section=alignment  -Wl,-u,_OffsetAbsSyms  -Wl,-u,_ConfigAbsSyms  -nostdlib  -static  -Wl,-X  -Wl,-N  -Wl,--orphan-handling=warn  -Wl,--fatal-warnings && cd /__w/zephyr/zephyr/twister-out/intel_adsp_cavs15/samples/subsys/zbus/hello_world/sample.zbus.hello_world/zephyr && /usr/local/bin/cmake -E echo
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: warning: orphan section `._zbus_channel.static.acc_data' from `app/libapp.a(channels.c.obj)' being placed in section `._zbus_channel.static.acc_data'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: warning: orphan section `._zbus_channel.static.version' from `app/libapp.a(channels.c.obj)' being placed in section `._zbus_channel.static.version'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: warning: orphan section `._zbus_observer.static.my_subscriber' from `app/libapp.a(main.c.obj)' being placed in section `._zbus_observer.static.my_subscriber'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: warning: orphan section `._zbus_observer.static.my_listener' from `app/libapp.a(main.c.obj)' being placed in section `._zbus_observer.static.my_listener'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: zephyr/subsys/zbus/libsubsys__zbus.a(zbus.c.obj):(.literal.zbus_iterate_over_channels+0x0): undefined reference to `_zbus_channel_list_start'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: zephyr/subsys/zbus/libsubsys__zbus.a(zbus.c.obj):(.literal.zbus_iterate_over_channels+0x4): undefined reference to `_zbus_channel_list_end'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: zephyr/subsys/zbus/libsubsys__zbus.a(zbus.c.obj):(.literal.zbus_iterate_over_observers+0x0): undefined reference to `_zbus_observer_list_start'
/opt/toolchains/zephyr-sdk-0.14.2/xtensa-intel_apl_adsp_zephyr-elf/bin/../lib/gcc/xtensa-intel_apl_adsp_zephyr-elf/10.3.0/../../../../xtensa-intel_apl_adsp_zephyr-elf/bin/ld.bfd: zephyr/subsys/zbus/libsubsys__zbus.a(zbus.c.obj):(.literal.zbus_iterate_over_observers+0x4): undefined reference to `_zbus_observer_list_end'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

The documentation is not ready yet and needs a lot of revision. I am working on that this week.

The code I guess it is in a stable version. Please take a look at that.

rodrigopex avatar Aug 08 '22 11:08 rodrigopex

Some nits but nothing really blocking

carlocaione avatar Aug 22 '22 08:08 carlocaione

@carlocaione

Some nits but nothing really blocking

The last push resolved all the comments.

rodrigopex avatar Aug 22 '22 13:08 rodrigopex

@pdgendt thank you for your review. ~~Soon, I will push changes related to the timing~~. And I am waiting for your thoughts about the early return situation.

Timing changed. Please take a look at that.

rodrigopex avatar Aug 24 '22 16:08 rodrigopex

@rodrigopex I just realized that you can also use CONFIG_ASSERT_TEST=y to do this.

yperess avatar Aug 29 '22 19:08 yperess

@yperess

@rodrigopex I just realized that you can also use CONFIG_ASSERT_TEST=y to do this.

I was trying to figure out how would I implement that using CONFIG_ASSERT_TEST=y but I could not think in a good way without interfering in other parts of the system. I want to suppress only the asserts inside the zbus code. I guess the current code is an acceptable solution. What do you think?

rodrigopex avatar Aug 29 '22 21:08 rodrigopex

@yperess

@rodrigopex I just realized that you can also use CONFIG_ASSERT_TEST=y to do this.

I was trying to figure out how would I implement that using CONFIG_ASSERT_TEST=y but I could not think in a good way without interfering in other parts of the system. I want to suppress only the asserts inside the zbus code. I guess the current code is an acceptable solution. What do you think?

Yes, what you have is perfectly acceptable. I just wanted to make sure you were aware of the config. You could still do as you wanted with CONFIG_ASSERT_TEST, but it would probably look odd (you'd basically need to check the file name in the assert_post_action and call kpanic if it was outside of your subsystem. Regardless, what you have works well for what you wanted to test :+1:

yperess avatar Aug 30 '22 04:08 yperess

@alevkoy I think you might be able to use this for your vbus design

yperess avatar Aug 30 '22 04:08 yperess

@yperess

Yes, what you have is perfectly acceptable. I just wanted to make sure you were aware of the config. You could still do as you wanted with CONFIG_ASSERT_TEST, but it would probably look odd (you'd basically need to check the file name in the assert_post_action and call kpanic if it was outside of your subsystem. Regardless, what you have works well for what you wanted to test :+1:

The issue with the current implementation is that it only does not blocks the rest of the execution. But it does tell nothing regarding the failure. I want to capture the failure in a zassert. Thanks for the comments. I will replace all zassert_true.

rodrigopex avatar Aug 30 '22 10:08 rodrigopex

Hey, did a pass on the sample code, added few comments. Some of those applies to other samples as well but I ran out of time so I'm sending what I have, spotted few others unnecessary automatic variable initializations that I think should be dropped. Could you make a pass on all sample files if you agree? It would also be nice to have coherent style for stuff like CMake files, prj.conf option grouping, headers etc... bit nitpicky but I think it helps when reading the various samples code.

I will take a look and revisit all of them. Thanks for the comments.

rodrigopex avatar Sep 01 '22 23:09 rodrigopex

Hey, did a pass on the sample code, added few comments. Some of those applies to other samples as well but I ran out of time so I'm sending what I have, spotted few others unnecessary automatic variable initializations that I think should be dropped. Could you make a pass on all sample files if you agree?

@fabiobaltieri I tried to solve the points you said in all samples and tests as well. I have added some auxiliary functions to access channels and observers' struct fields like zbus_cha_msg, zbus_chan_user_data, zbus_obs_name, etc.

rodrigopex avatar Sep 03 '22 13:09 rodrigopex

After installing the SDK version 0.15.0, I cannot generate the documentation or coverage info. Now, it is hard to check the code before pushing it.

rodrigopex avatar Sep 03 '22 13:09 rodrigopex

Added benchmark sample.

rodrigopex avatar Sep 05 '22 19:09 rodrigopex

@fabiobaltieri

Thanks for the followup, few more comments on the samples, more or less nitpicky. :-)

Would hold on adding more code for now, it's a big enough PR to review as it is.

No more code, I promess :)

rodrigopex avatar Sep 06 '22 23:09 rodrigopex

Hi, community. My baby son was born last Thursday. I will get back to work on this PR as soon as possible (next week, I guess). For now, I am taking care of my wife and son. Thanks.

rodrigopex avatar Sep 13 '22 18:09 rodrigopex

Got back to work! :muscle:

rodrigopex avatar Sep 20 '22 12:09 rodrigopex

Hey just spotted few unnecessary initialization to 0 for global variables, happy enough besides that, looks pretty polished to me. Good job!

Thanks a lot for your contributions @fabiobaltieri!

rodrigopex avatar Sep 27 '22 14:09 rodrigopex

I have reviewed all the documentation and changed the images to keep consistency. @anangl

rodrigopex avatar Oct 06 '22 12:10 rodrigopex

Hi all, as I did not submit the review, many messages were on hold.

rodrigopex avatar Oct 12 '22 16:10 rodrigopex

Major changes from the last push:

  1. Removed support for read-only channels;
  2. I did change a little the way of using user_data. I removed the data allocation from zbus code side and added an argument in the ZBUS_CHANNEL_DEFINE for adding the user_data pointer.

@anangl @desowin

rodrigopex avatar Oct 19 '22 16:10 rodrigopex

@anangl @desowin Is there anything I can do? This PR is waiting for your change request acceptance. Thanks.

rodrigopex avatar Nov 03 '22 11:11 rodrigopex

@anangl @desowin Is there anything I can do? This PR is waiting for your change request acceptance. Thanks.

Can you check on the failing CI run? It breaks on a couple of zbus specific tests.

fabiobaltieri avatar Nov 04 '22 09:11 fabiobaltieri

Can you check on the failing CI run? It breaks on a couple of zbus specific tests.

@fabiobaltieri, for some unknown reason, the same sample running on another RISCV boards is breaking at the hifive_unleashed. I have added that to the platform_exclude in the sample file.

The other CI issue is related to the _ZBUS_ASSERT, which has already been discussed before and is acceptable. I have submitted another revision. Let's see if it passes now.

rodrigopex avatar Nov 04 '22 19:11 rodrigopex

Thank you very much to everyone involved who gave suggestions and discussed the feature and pitfalls of the solution. I am happy to have that merged. Now let's search for bugs and fix them if they happen. Regards.

rodrigopex avatar Nov 15 '22 00:11 rodrigopex

@rodrigopex How did you create those insanely pretty SVGs? :eyes:

M1cha avatar Nov 03 '23 17:11 M1cha

@rodrigopex How did you create those insanely pretty SVGs? 👀

Thanks @M1cha. I used Figma. The assets and images are public. You can use that by follwing the link. Feel free to change. Regards.

rodrigopex avatar Nov 03 '23 17:11 rodrigopex