ieee802154/submac: change submac fsm logic
Convert submac.c state machine to prevent race conditions
Overview
This PR converts the state machine in the submac.c module to follow the same design pattern as implemented in DSMEFSM.h.
Problem
The current state machine implementation is susceptible to race conditions and error states, particularly when events are scheduled unnecessarily by other threads during the bh_request/process procedure.
Solution
The new state machine design:
- Prevents race conditions by implementing a more robust event handling mechanism
- Enables multi-state processing within a single cycle, improving efficiency
- Eliminates unnecessary event scheduling that could lead to inconsistent states
- Provides better thread safety for the
bh_request/processworkflow
Benefits
- More stable and reliable FSM operation
- Reduced potential for error states
- Better performance through multi-state cycle processing
- Improved thread safety and concurrent access handling
Technical Changes
- Removed
bh_request/bh_processmechanism - Implemented direct FSM state transitions
- Added multi-state cycle processing capability
- Adjust other parts according to the new logic
- Follows proven design patterns from
DSMEFSM.hfor consistency
Testing procedure
- GNRC Ping Test with 2 nRF52840 Devices
cd RIOT/examples/networking/gnrc/gnrc_networking
ping fe80::b8f0:7f3f:bb50:3c2a -s 1024 -i 300 -c 10000
2025-07-02 18:40:52,528 # --- fe80::b8f0:7f3f:bb50:3c2a PING statistics ---
2025-07-02 18:40:52,534 # 10000 packets transmitted, 9867 packets received, 1% packet loss
2025-07-02 18:40:52,538 # round-trip min/avg/max = 126.999/168.469/301.789 ms
- Submac Test with 2 nRF52840 Devices
2025-07-02 19:05:23,984 # txtsnd FE:B2:91:30:A6:FA:7C:B3 64
> 2025-07-02 19:05:23,991 # Tx complete
2025-07-02 19:05:23,990 # DATA
2025-07-02 19:05:23,995 # Dest. PAN: 0x0023, Dest. addr.: fe:b2:91:30:a6:fa:7c:b3
2025-07-02 19:05:24,000 # Src. PAN: 0x0023, Src. addr.: ba:f0:7f:3f:bb:50:34:2a
2025-07-02 19:05:24,005 # Security: 0, Frame pend.: 0, ACK req.: 1, PAN comp.: 1
2025-07-02 19:05:24,007 # Version: 1, Seq.: 1
2025-07-02 19:05:24,014 # 00000000 4C 6F 72 65 6D 20 69 70 73 75 6D 20 64 6F 6C 6F
2025-07-02 19:05:24,022 # 00000010 72 20 73 69 74 20 61 6D 65 74 2C 20 63 6F 6E 73
2025-07-02 19:05:24,029 # 00000020 65 63 74 65 74 75 72 20 61 64 69 70 69 73 63 69
2025-07-02 19:05:24,036 # 00000030 6E 67 20 65 6C 69 74 2E 20 45 74 69 61 6D 20 6F
2025-07-02 19:05:24,045 # txt (64 chars): Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam o
2025-07-02 19:05:24,045 # RSSI: 93, LQI: 48
Issues/PRs references
Murdock results
:heavy_check_mark: PASSED
8b09692a2a70b961ed6a4bd10d6d7e686dd1e8cf ieee802154/submac: change submac fsm logic,
| Success | Failures | Total | Runtime |
|---|---|---|---|
| 10930 | 0 | 10931 | 11m:59s |
Artifacts
This needs a rebase
@jia200x you can restart failed Murdock build jobs in the CI as well:
Either click on the small arrow on the right and select "Restart"...
...or click on the circular arrow on the upper right side in the job page:
You have to be logged in there though, but you can log in with Github :)
I'm not sure why the test build keeps timing out in CI, but when I tried building it locally with make -C tests/net/socket_zep, I encountered the following error:
RIOT/cpu/native/socket_zep/socket_zep.c:244:29: error: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (3 chars into 2 available) [-Werror=unterminated-string-initialization]
244 | .hdr.preamble = "EX",
| ^~~~
Replacing "EX" with {'E', 'X'} fixes the issue for me and the build completes successfully
On my machine the tests/socker_zep fails with a SegFault with and without llvm alike
Build Logs
buechse@skyleaf:~/RIOTstuff/riot-vanillaice/RIOT$ make BOARD=native64 TOOLCHAIN=llvm -C tests/net/socket_zep all test
make: Entering directory '/home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep'
Building application "tests_socket_zep" for "native64" with CPU "native".
"make" -C /home/buechse/RIOTstuff/riot-vanillaice/RIOT/boards
...
"make" -C /home/buechse/RIOTstuff/riot-vanillaice/RIOT/sys/ztimer
/usr/bin/ld: warning: /home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf has a LOAD segment with RWX permissions
text data bss dec hex filename
59420 1640 59904 120964 1d884 /home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf
/home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf --eui64=00:5a:45:50:0a:00:30:38 -z [127.0.0.1]:44221,[127.0.0.1]:59437 tap0
RIOT native interrupts/signals initialized.
TZ not set, setting UTC
RIOT native64 board initialized.
RIOT native hardware initialization complete.
main(): This is RIOT! (Version: 2026.01-devel-62-g0c8717-pr/link_layer/ieee802154/submac)
Socket ZEP device driver test
Initializing socket ZEP with (local: [127.0.0.1]:44221, remote: [127.0.0.1]:59437)
(Hwaddrs: 3038, 005a45500a003038)
Send 'Hello\0World\0'
Segmentation fault (core dumped)
make[1]: *** [/home/buechse/RIOTstuff/riot-vanillaice/RIOT/Makefile.include:877: cleanterm] Error 139
buechse@skyleaf:~/RIOTstuff/riot-vanillaice/RIOT$ make BOARD=native64 -C tests/net/socket_zep all test
make: Entering directory '/home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep'
Building application "tests_socket_zep" for "native64" with CPU "native".
"make" -C /home/buechse/RIOTstuff/riot-vanillaice/RIOT/boards
...
"make" -C /home/buechse/RIOTstuff/riot-vanillaice/RIOT/sys/ztimer
/usr/bin/ld: warning: /home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf has a LOAD segment with RWX permissions
text data bss dec hex filename
64203 1640 60168 126011 1ec3b /home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf
/home/buechse/RIOTstuff/riot-vanillaice/RIOT/tests/net/socket_zep/bin/native64/tests_socket_zep.elf --eui64=00:5a:45:50:0a:00:30:38 -z [127.0.0.1]:39843,[127.0.0.1]:46441 tap0
RIOT native interrupts/signals initialized.
TZ not set, setting UTC
RIOT native64 board initialized.
RIOT native hardware initialization complete.
main(): This is RIOT! (Version: 2026.01-devel-62-g0c8717-pr/link_layer/ieee802154/submac)
Socket ZEP device driver test
Initializing socket ZEP with (local: [127.0.0.1]:39843, remote: [127.0.0.1]:46441)
(Hwaddrs: 3038, 005a45500a003038)
Send 'Hello\0World\0'
Segmentation fault (core dumped)
make[1]: *** [/home/buechse/RIOTstuff/riot-vanillaice/RIOT/Makefile.include:877: cleanterm] Error 139
That would explain the timeout.
should be fixed now
Actually I discovered an issue while testing with the nrf52: The submac end up in an invalid state. This issue is introduced by PR #21533. In that PR, some assert statements were removed, which is why the problem didn't occur there. I assume this happens because the submac is polling for confirm_transmit and the radio calls the callback for tx_done and then also calls this function, so some race condition happens. I didn't trace the chain that much at that time. Of course, I could also remove the assert statements and ignore the invalid state, but this feels wrong. You can reproduce the issue by setting the interval of the ping commands slightly above the average. You can also try it in master by adding prints at submac.c:460. Then I got some prints but the invalid state is just ignored there