RIOT
RIOT copied to clipboard
cpu/nrf5x_common: fix uart_poweroff()
Contribution description
Previously, uart_poweroff() and uart_poweron() were no-ops. This replaces them with the logic to indeed power on and power off the UART device.
Testing procedure
- ideally, power consumption should be less after a call to
uart_poweroff()- beware: this is untested. however, at least a call to
uart_write()while powered off would get stuck if not for the special handling, while before the UART remained fully operational when "powered off"
- beware: this is untested. however, at least a call to
- regular operation should resume after again calling
uart_poweron()- this is tested
Issues/PRs references
None
Murdock results
:heavy_check_mark: PASSED
de99c64d0b66c676cd15fa787919d93ea01b2bab cpu/nrf5x_common: fix uart_poweroff()
| Success | Failures | Total | Runtime |
|---|---|---|---|
| 10083 | 0 | 10083 | 14m:30s |
Artifacts
bors merge
bors cancel
The build failure seemingly was caused by this PR :-/
Canceled.
Probably not worth backporting but I always like fixes, Is it still WIP?
To test this PR, I modified the tests/periph/uart test a little bit and added significantly longer delays so a slow human (me) can read off the multimeter in time. This is from the tests/periph/uart/main.c file, lines 203ff:
static void sleep_test(int num, uart_t uart)
{
printf("UARD_DEV(%i): test uart_poweron() and uart_poweroff() -> ", num);
uart_poweroff(uart);
//xtimer_usleep(POWEROFF_DELAY);
xtimer_msleep(4000);
uart_poweron(uart);
xtimer_msleep(4000);
puts("[OK]");
}
When the UART is powered off, the nRF52832 on my nRF52DK consumes 0.474mA and with the UART powered on it consumes 0.530mA. This is measured against Nordics recommendation when the board is powered via USB. They say that powering it via USB can introduce unwanted noise, but for this static test it should be okay. The measurements were taken on the "nRF current measurement" headers, so it does not measure the consumption of the JLink it's blinking LED. The multimeter is an HP 34401A, so it should be precise enough. In the sleep phases, the measurements are very stable as well.
IMO this PR is well worth backporting, especially since most of the changes regarding the register accesses are already done in the mainline code, so this PR should shrink to only a handful of lines changed.
OK, rebased. I don't have the hardware at hands, so this is yet untested. The looks reasonable after resolving the merge conflict, though.
I can't add that as a comment, but you can delete the ENABLE_ON and ENABLE_OFF macros from lines 44, 45, 52 and 53. They are now unused.
This is the expansion of the test code in tests/periph/uart:
static void sleep_test(int num, uart_t uart)
{
printf("UARD_DEV(%i): test uart_poweron() and uart_poweroff() -> ", num);
xtimer_msleep(5);
uart_poweroff(uart);
uint8_t data[8] = "asdf";
uart_write(uart, data, sizeof(data)); // this should not output any data but not hang either
//xtimer_usleep(POWEROFF_DELAY);
xtimer_msleep(4000);
uart_poweron(uart);
xtimer_msleep(4000);
puts("[OK]");
}
The additional 5ms sleep is there to make sure the data is sent before the UART goes to sleep. The uart_write function returns as expected and does not result in a deadlock, the current consumption is reduced as expected as well.
This looks good from my side, thank you for addressing this so quickly :)
I can't add that as a comment, but you can delete the ENABLE_ON and ENABLE_OFF macros from lines 44, 45, 52 and 53. They are now unused.
Indeed. But I wonder if those were actually more readable than what I did when mergeing :rofl:
I was so free to squash directly and I changed the code to instead make use of ENABLE_ON and the ENABLE_OFF macro, rather than re-inventing the wheel.
I agree that using ENABLE_ON and ENABLE_OFF is more elegant. I can test the latest version tomorrow, but I wouldn't expect any problems.
One more thing that might be worth looking at are Lines 140-143:
#ifndef UARTE_PRESENT
/* only the legacy non-EasyDMA UART needs to be powered on explicitly */
dev->POWER = 1;
#endif
The "dev->POWER = 1;" line is present in set_power() as well, so one of the two places is probably redundant.
Was the change of the bmp.py file supposed to smuggle into the PR?
Thx, good catch!
It appears that this breaks examples/default on the nrf52840dk when UART is used for stdio. Hard-wiring get_power() to return true unconditionally fixes the regression, but causes uart_write() to deadlock when the UART is powered off :cry:
That is odd, stdio_uart does not care about the return value of uart_write and uart_write is the only function that calls get_power. I don't have a nRF52840DK at home, so I'll have to try it out tomorrow.
Could it be that UARTE_PRESENT is not set anywhere in RIOT currently? You introduced it in #20102, but it's not defined anywhere: https://github.com/search?q=repo%3ARIOT-OS%2FRIOT%20UARTE_PRESENT&type=code (or maybe I'm blind).
But if that's the case, the registers for the nRF52 UARTE are not set corretly.
Edit: Sorry that this became such a can of worms, that was certainly not intended 👀
The define is in the vendor header files:
cpu/nrf53/include/vendor/nrf5340_application_peripherals.h
212:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52820_peripherals.h
202:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52832_peripherals.h
227:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52840_peripherals.h
237:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52811_peripherals.h
195:#define UARTE_PRESENT
cpu/nrf53/include/vendor/nrf5340_network_peripherals.h
159:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52805_peripherals.h
194:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52833_peripherals.h
239:#define UARTE_PRESENT
cpu/nrf52/include/vendor/nrf52810_peripherals.h
188:#define UARTE_PRESENT
cpu/nrf9160/include/vendor/nrf9160_peripherals.h
147:#define UARTE_PRESENT
But even without UARTE_PRESENT it should work, as the non Easy-DMA variant of the UART peripheral is a subset of the UARTE peripheral. Hence, old code for nRF51 is supposed to just work.
In fact, for the nRF52832 we previously did so, likely because it was just overlooked that this MCU does have the UARTE.
The crash I have is in the nrf802154 thread early on. Disabling the network device also fixes the crash. It looks pretty much like a memory corruption issue. I have the feeling that the real bug is present in master already and I just happen to be the lucky one to trigger it.
I don't understand why the Github search does not find these files, but hey, it would've been too easy I guess.
Maybe it is just stack related? When get_power is forced to return true, the compiler likely optimizes it away, resulting in one less function on the stack. The Makefile does not increase the stack size, but a lot of stuff is going on, so this is not unlikely. I had to increase the stack size for a some examples (the filesystem example for example) while tinkering with the nRF52 and not in every case the typical memory runout message appeared.
A stack overflow should actually be detected by the MPU stack guard. Unless of course that happened too early on.
According to the UFSR it is an INVSTATE fault some time between event_queues_init() and msg_init_queue() in gnrc_netif.c:1946 and the statement below.
Interestingly: If step with stepi, the hard fault is not triggered and the app works fine. Disabling IRQs during that call to event_queues_init() also fixes the issue.
I wonder if there is a race condition with the radio sending events to the event queue, but the event queue not yet fully set up.
I'll take another look tomorrow. I'm too tired now to investigate.
It probably won't help if I say "works for me" :eyes: The result is the same without the USEMODULE and it works with "BOARD=nrf52840dk make flash term" as well.
chris@ThinkPias:~/flashdev-riot/RIOT/examples/default$ USEMODULE+=stdio_uart BOARD=nrf52840dk make hexfile -j12
"make" -C /home/chris/flashdev-riot/RIOT/pkg/cmsis/
.......
chris@ThinkPias:~/flashdev-riot/RIOT/examples/default$ nrfjprog --recover && nrfjprog --chiperase --verify --program bin/nrf52840dk/default.hex && nrfjprog --pinresetenable && nrfjprog --pinreset && BOARD=nrf52dk make term
Recovering device. This operation might take 30s.
Erasing user code and UICR flash areas.
Writing image to disable ap protect.
[ #################### ] 0.205s | Erase file - Done erasing
[ #################### ] 0.303s | Program file - Done programming
[ #################### ] 0.309s | Verify file - Done verifying
Enabling pin reset.
Applying pin reset.
/home/chris/flashdev-riot/RIOT/dist/tools/pyterm/pyterm -p "/dev/ttyACM0" -b "115200"
Twisted not available, please install it if you want to use pyterm's JSON capabilities
2024-05-07 11:06:35,079 # Connect to serial port /dev/ttyACM0
Welcome to pyterm!
Type '/exit' to exit.
2024-05-07 11:06:36,085 # NETOPT_TX_END_IRQ not implemented by driver
2024-05-07 11:06:36,086 # main(): This is RIOT! (Version: 2024.07-devel-141-gde99c6-cpu/nrf5x_common/periph_uart)
2024-05-07 11:06:36,087 # Welcome to RIOT!
> reboot
2024-05-07 11:06:44,520 # reboot
2024-05-07 11:06:44,525 # NETOPT_TX_END_IRQ not implemented by driver
2024-05-07 11:06:44,533 # main(): This is RIOT! (Version: 2024.07-devel-141-gde99c6-cpu/nrf5x_common/periph_uart)
2024-05-07 11:06:44,534 # Welcome to RIOT!
>
I can't really say if the board is actually sending something, but it looks convincing:
2024-05-07 11:10:14,069 # NETOPT_TX_END_IRQ not implemented by driver
2024-05-07 11:10:14,070 # main(): This is RIOT! (Version: 2024.07-devel-141-gde99c6-cpu/nrf5x_common/periph_uart)
2024-05-07 11:10:14,070 # Welcome to RIOT!
> ps
2024-05-07 11:10:18,325 # ps
2024-05-07 11:10:18,334 # pid | name | state Q | pri | stack ( used) ( free) | base addr | current
2024-05-07 11:10:18,343 # - | isr_stack | - - | - | 512 ( 164) ( 348) | 0x20000000 | 0x200001c8
2024-05-07 11:10:18,352 # 1 | main | running Q | 7 | 1536 ( 692) ( 844) | 0x200002b0 | 0x20000794
2024-05-07 11:10:18,360 # 2 | pktdump | bl rx _ | 6 | 1472 ( 176) ( 1296) | 0x2000115c | 0x2000166c
2024-05-07 11:10:18,369 # 3 | nrf802154 | bl anyfl _ | 2 | 896 ( 308) ( 588) | 0x20000a88 | 0x20000d4c
2024-05-07 11:10:18,375 # | SUM | | | 4416 ( 1340) ( 3076)
> ifconfig
2024-05-07 11:10:29,875 # ifconfig
2024-05-07 11:10:29,881 # Iface 3 HWaddr: 75:74 Channel: 26 NID: 0x23 PHY: O-QPSK
2024-05-07 11:10:29,885 # Long HWaddr: 5E:07:8F:1E:EE:26:F5:74
2024-05-07 11:10:29,887 # State: IDLE
2024-05-07 11:10:29,892 # ACK_REQ L2-PDU:102 Source address length: 2
2024-05-07 11:10:29,895 # Link type: wireless
2024-05-07 11:10:29,896 #
> txtsnd 3 bcast "asdf"
2024-05-07 11:10:37,823 # txtsnd 3 bcast "asdf"
For some reason though the ble subsystem is not added for the nRF52840DK and I did not figure out why. This is what the log looks like for the nRF52DK:
2024-05-07 11:24:45,133 # main(): This is RIOT! (Version: 2024.07-devel-141-gde99c6-cpu/nrf5x_common/periph_uart)
2024-05-07 11:24:45,134 # Welcome to RIOT!
>help
2024-05-07 11:24:52,558 # help
2024-05-07 11:24:52,561 # Command Description
2024-05-07 11:24:52,565 # ---------------------------------------
2024-05-07 11:24:52,569 # ble Manage BLE connections for NimBLE
2024-05-07 11:24:52,573 # ifconfig Configure network interfaces
2024-05-07 11:24:52,578 # pm interact with layered PM subsystem
2024-05-07 11:24:52,583 # ps Prints information about running threads.
2024-05-07 11:24:52,587 # reboot Reboot the node
2024-05-07 11:24:52,592 # saul interact with sensors and actuators using SAUL
2024-05-07 11:24:52,599 # txtsnd Sends a custom string as is over the link layer
2024-05-07 11:24:52,603 # version Prints current RIOT_VERSION
But neither of these things have anything to do with what you're seeing I guess? The nRF52DK does not seem to receive anything I send with the nRF52840DK, but that might be because I don't know what I'm doing with the Default example :sweat_smile:
But neither of these things have anything to do with what you're seeing I guess?
No. I also fail to reproduce the issue on a different nrf52840-dk (one that has "nRF52840-Preview-DK" on the silkscreen where the affected board only has "nRF52840-DK"). Maybe I should check the errata.
The board I have here is a PCA10056 Revision 3.0.1 from 2023.6. The nRF52840 has the following text written on it:
N52840
QIAAF0
2232AQ
QIAAF0 means that the chip is Revision 3. We have some chips here with Revision CKAAD0, which is Revision 2. But they are not on a Devboard, but integrated in another project.
OK, here is what I have found so far:
- The issue will only trigger on some hardware, but reliably so there. It may apply to all nRF52840 MCUs of that revision, or be "silicon lottery" - I don't know
- It requires GCC 13.2.0 and newlib 4.3.020230120 and will only trigger on
examples/default. Specifically, even the same version of newlib rebuild will no longer trigger it, but specifically the revision 3 of that package in Alpine is affected - It is timing sensitive. Adding a few nops will "fix" the issue
- It is triggered within the
puts()implementation of newlib duringLOG_WARNING("NETOPT_TX_END_IRQ not implemented by driver\n") - In the hard fault handler the program counter that triggered the fault points somewhere into
__sf, which is__FILE __sf[3];- so it is data not machine code
I'm almost certain it is an issue with RIOT not implementing the locks newlib requires to correctly implement reentrant functions, but only protects malloc() and friends from data races. I guess that this needs to be addressed now.
In any case, I'm 99.999% certain that the issues are unrelated to this PR.