STM32CubeH7
STM32CubeH7 copied to clipboard
Ethernet: avoid descriptor leak with enabled timestamping
Describe the set-up
- The board (either ST RPN reference or your custom board): NUCLEO-H745ZI-Q
- IDE or at least the compiler and its version: arm-none-eabi-gcc (GNU Arm Embedded Toolchain 10-2020-q4-major) 10.2.1 20201103 (release)
Describe the bug
Ethernet HAL configures interrupts to be received upon filling every RX descriptor. When timestamping is enabled, application needs to process both normal and context descriptors, but there is a race condition in HAL_ETH_IsRxDataAvailable()
that might miss context descriptor, effectively leaking it because they won't be accounted for in HAL_ETH_BuildRxDescriptors()
.
How To Reproduce
- Indicate the global behavior of your application project
We're building a PTP master / slave implementation in Zephyr on STM32 that uses hardware timestamping. Majority of the implementation is represented by this commit: https://github.com/maka-autonomous-robotic-systems/zephyr/commit/c25c72f91996f69dcee737e405272edcf951f560
- The modules that you suspect to be the cause of the problem (Driver, BSP, MW ...)
drivers/src/stm32h7xx_hal_eth.c
- The use case that generates the problem
Receive timestamping is enabled, hardware is filling RX normal and context descriptors and issues interrupts. When driver code is awoken and runs HAL_ETH_IsRxDataAvailable()
, it might either see just normal RX descriptor available or both normal and context RX descriptors, even if both were supposed to be available. In former case, context RX descriptor is populated later, but since it's not accounted by HAL_ETH_IsRxDataAvailable()
it is never released back and is thus leaked.
- How we can reproduce the problem
One option is to use forked Zephyr code from this commit: https://github.com/maka-autonomous-robotic-systems/zephyr/commit/c25c72f91996f69dcee737e405272edcf951f560 and run gptp sample on STM32H7 board for a while. You'd want to revert this change (https://github.com/maka-autonomous-robotic-systems/zephyr/commit/1294620e92ef8ab89d03c7dc4978e69122c32e69) to switch back to unmodified STM32H7 HAL.
Another option is to make a custom example that enables timestamping similar to that commit, and observe descriptors leaked nondeterministically.
Additional context
PR with a proposed solution has been submitted to Zephyr Project's fork of STM32 HAL: https://github.com/zephyrproject-rtos/hal_stm32/pull/116
Please let me know if additional information would be helpful or if you have any feedback.
Hi @ALABSTM, any update on this issue?
We'd like to get at least an acknowledgement of the issue, so the fix proposed in zephyrproject-rtos/hal_stm32#116 can be merged.
Hi @alsrgv,
Please excuse this delayed reply. Our development teams said this issue is covered by the new ETH driver. This new ETH driver will be published soon. Unfortunately, we cannot share a date for the moment.
Thank you for your comprehension and thank you again for having reported this point.
With regards,
Hi @alsrgv,
I hope you are fine. We just made available a fully reworked ETH HAL driver via pull-request stm32h7xx_hal_driver#16. The point you raised should have been addressed. You can use this new driver for your applications. Please feel free to provide your feedback here.
Please note also that this new ETH HAL driver breaks the compatibility with the previous one. We are looking forward to reading your feedback.
With regards,
@ALABSTM, thanks for fixing this up! Do you know if anybody is working on porting Zephyr to this new API?
Hi @alsrgv,
The point has already been reported at the Zephyr OS project-level. Here is the link.
With regards,
Hi @alsrgv,
The new HAL ETH driver has been made available in version 1.10.0 of this firmware, which has been published since a while. Please allow me to close this issue. Thank you again for having reported this point.
With regards,