esp32s3: float calculation error or stack error
Hi, I was running PX4 on NuttX on an esp32s3 and found an error:
a float value becomes NaN after a simple multiplication.
And the console output:
This problem appears several minutes after booting. Moreover, some function calls trigger the same thing, turning a float variable into NaN.
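One way to narrow down where the NaN first appears is to wrap a suspect multiplication in a small check. The sketch below is only an illustration: `checked_mulf` is a hypothetical helper, not part of PX4 or NuttX. It relies on the IEEE 754 rule that the product of two finite floats can overflow to infinity but can never be NaN, so a NaN result from finite inputs points to corrupted register or stack state rather than to the arithmetic itself.

```c
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the raw bit pattern of a float for logging. */
static uint32_t float_bits(float f)
{
  uint32_t u;

  memcpy(&u, &f, sizeof(u));
  return u;
}

/* Hypothetical helper: multiply a and b, store the product in *out, and
 * return -1 if two finite inputs produced NaN, which valid IEEE 754
 * arithmetic never does. Returns 0 otherwise.
 */
static int checked_mulf(float a, float b, float *out)
{
  float r = a * b;

  *out = r;
  if (isfinite(a) && isfinite(b) && isnan(r))
    {
      printf("NaN from finite product: a=0x%08" PRIx32 " b=0x%08" PRIx32
             " r=0x%08" PRIx32 "\n",
             float_bits(a), float_bits(b), float_bits(r));
      return -1;
    }

  return 0;
}
```

Logging the raw bit patterns rather than the formatted values avoids depending on float formatting in printf, which may itself be affected if the FPU state is wrong.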
defconfig:
CONFIG_ALLOW_BSD_COMPONENTS=y
CONFIG_ALLOW_GPL_COMPONENTS=y
CONFIG_ALLOW_MIT_COMPONENTS=y
CONFIG_ALLOW_ECLIPSE_COMPONENTS=y
CONFIG_ALLOW_ICS_COMPONENTS=y
CONFIG_BASE_DEFCONFIG="-dirty"
CONFIG_INTELHEX_BINARY=y
CONFIG_ARCH_SETJMP_H=y
# CONFIG_NDEBUG is not set
CONFIG_STACK_COLORATION=y
CONFIG_CCACHE=y
CONFIG_ARCH_XTENSA=y
CONFIG_PWM_MULTICHAN=y
CONFIG_PWM_NCHANNELS=8
CONFIG_ARCH_CHIP_ESP32S3=y
CONFIG_ESP32S3_INSTRUCTION_CACHE_32KB=y
CONFIG_ESP32S3_DATA_CACHE_64KB=y
CONFIG_ESP32S3_DATA_CACHE_LINE_64B=y
CONFIG_ESP32S3_SPI2=y
CONFIG_ESP32S3_SPI3=y
CONFIG_ESP32S3_UART0=y
CONFIG_ESP32S3_UART1=y
CONFIG_ESP32S3_UART2=y
CONFIG_ESP32S3_WIFI=y
CONFIG_ESP32S3_I2C0=y
CONFIG_ESP32S3_I2C1=y
CONFIG_ESP32S3_LEDC=y
CONFIG_ESP32S3_USBSERIAL=y
CONFIG_ESP32S3_GPIO_IRQ=y
CONFIG_ESP32S3_SPI_SWCS=y
CONFIG_ESP32S3_SPI_UDCS=y
CONFIG_ESP32S3_SPI_DMA=y
CONFIG_ESP32S3_SPI_DMA_BUFSIZE=4092
CONFIG_ESP32S3_SPI_DMATHRESHOLD=4
CONFIG_ESP32S3_SPI2_CSPIN=1
CONFIG_ESP32S3_SPI2_CLKPIN=2
CONFIG_ESP32S3_SPI2_MOSIPIN=42
CONFIG_ESP32S3_SPI2_MISOPIN=41
CONFIG_ESP32S3_SPI3_CSPIN=35
CONFIG_ESP32S3_SPI3_CLKPIN=39
CONFIG_ESP32S3_SPI3_MOSIPIN=38
CONFIG_ESP32S3_SPI3_MISOPIN=36
CONFIG_ESP32S3_UART2_TXPIN=8
CONFIG_ESP32S3_UART2_RXPIN=3
CONFIG_ESP32S3_I2C0_SCLPIN=45
CONFIG_ESP32S3_I2C0_SDAPIN=48
CONFIG_ESP32S3_I2C1_SCLPIN=15
CONFIG_ESP32S3_I2C1_SDAPIN=16
CONFIG_ESP32S3_I2CTIMEOMS=10
CONFIG_WPA_WAPI_PSK=y
CONFIG_ESP32S3_WIFI_STATION_SOFTAP=y
CONFIG_ESP32S3_WIFI_STATIC_RXBUF_NUM=16
CONFIG_ESP32S3_WIFI_DYNAMIC_RXBUF_NUM=64
CONFIG_ESP32S3_WIFI_DYNAMIC_TXBUF_NUM=64
CONFIG_ESP32S3_WIFI_RXBA_AMPDU_WZ=16
CONFIG_ESP32S3_FLASH_MODE_QIO=y
CONFIG_ESP32S3_FLASH_FREQ_120M=y
CONFIG_ESP32S3_LEDC_TIM0=y
CONFIG_ESP32S3_LEDC_TIM0_CHANNELS=4
CONFIG_ESP32S3_LEDC_CHANNEL0_PIN=10
CONFIG_ESP32S3_LEDC_CHANNEL1_PIN=9
CONFIG_ESP32S3_LEDC_CHANNEL3_PIN=37
CONFIG_ESP32S3_LEDC_CHANNEL4_PIN=13
CONFIG_ESP32S3_LEDC_CHANNEL5_PIN=14
CONFIG_ESP32S3_LEDC_CHANNEL6_PIN=21
CONFIG_ESP32S3_LEDC_CHANNEL7_PIN=47
CONFIG_BOARD_LOOPSPERMSEC=16717
CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_RAM_START=0x20000000
CONFIG_RAM_SIZE=114688
CONFIG_ARCH_BOARD_CUSTOM_NAME="px4"
CONFIG_ARCH_BOARD_CUSTOM_DIR="../../../../boards/px4/esp32s3/nuttx-config"
CONFIG_ARCH_BOARD_COMMON=y
CONFIG_ESP32S3_SPEED_UP_ISR=y
CONFIG_BOARDCTL_RESET=y
CONFIG_USEC_PER_TICK=1000
CONFIG_START_YEAR=2011
CONFIG_START_MONTH=12
CONFIG_START_DAY=6
CONFIG_PREALLOC_TIMERS=4
CONFIG_SPINLOCK=y
CONFIG_INIT_STACKSIZE=8192
CONFIG_INIT_ENTRYPOINT="nsh_main"
CONFIG_TASK_NAME_SIZE=48
CONFIG_SCHED_WAITPID=y
CONFIG_PTHREAD_MUTEX_TYPES=y
CONFIG_SCHED_INSTRUMENTATION=y
CONFIG_SCHED_INSTRUMENTATION_SWITCH=y
CONFIG_NAME_MAX=48
CONFIG_SIG_DEFAULT=y
CONFIG_PREALLOC_MQ_MSGS=64
CONFIG_SCHED_HPWORK=y
CONFIG_SCHED_HPWORKSTACKSIZE=2048
CONFIG_SCHED_LPWORK=y
CONFIG_SCHED_LPWORKSTACKSIZE=2048
CONFIG_DEFAULT_TASK_STACKSIZE=4096
CONFIG_IDLETHREAD_STACKSIZE=3072
CONFIG_PTHREAD_STACK_MIN=2048
CONFIG_I2C_RESET=y
CONFIG_I2C_DRIVER=y
CONFIG_SPI_DRIVER=y
CONFIG_TIMER=y
CONFIG_DEV_GPIO=y
CONFIG_DEV_ZERO=y
CONFIG_DEV_ASCII=y
CONFIG_MTD=y
CONFIG_MTD_PARTITION=y
CONFIG_MTD_PARTITION_NAMES=y
CONFIG_MTD_BYTE_WRITE=y
CONFIG_MTD_CONFIG=y
CONFIG_MTD_RAMTRON=y
CONFIG_RAMTRON_SETSPEED=y
CONFIG_PIPES=y
CONFIG_DEV_PIPE_MAXSIZE=1024
CONFIG_DEV_PIPE_SIZE=70
CONFIG_SERIAL_NPOLLWAITERS=6
CONFIG_SERIAL_TERMIOS=y
CONFIG_UART0_RXBUFSIZE=128
CONFIG_UART0_TXBUFSIZE=128
CONFIG_UART1_RXBUFSIZE=128
CONFIG_UART1_TXBUFSIZE=128
CONFIG_UART2_RXBUFSIZE=128
CONFIG_UART2_TXBUFSIZE=128
CONFIG_DRIVERS_WIRELESS=y
CONFIG_DRIVERS_IEEE80211=y
CONFIG_SYSLOG_BUFFER=y
CONFIG_SYSLOG_BUFSIZE=256
CONFIG_SYSLOG_DEVPATH=""
CONFIG_DMA=y
CONFIG_NET_ETH_PKTSIZE=1514
CONFIG_NETDEV_LATEINIT=y
CONFIG_NETDEV_PHY_IOCTL=y
CONFIG_NETDEV_WIRELESS_IOCTL=y
CONFIG_NET_BINDTODEVICE=y
CONFIG_NET_TCP=y
CONFIG_NET_TCP_DELAYED_ACK=y
CONFIG_NET_TCP_WRITE_BUFFERS=y
CONFIG_NET_UDP=y
CONFIG_NET_BROADCAST=y
CONFIG_NET_UDP_WRITE_BUFFERS=y
CONFIG_NET_ICMP=y
CONFIG_NET_ICMP_SOCKET=y
CONFIG_FS_LARGEFILE=y
CONFIG_FS_FAT=y
CONFIG_FAT_COMPUTE_FSINFO=y
CONFIG_FS_FATTIME=y
CONFIG_FS_ROMFS=y
CONFIG_FS_CROMFS=y
CONFIG_FS_SMARTFS=y
CONFIG_FS_BINFS=y
CONFIG_FS_PROCFS=y
CONFIG_FS_PROCFS_REGISTER=y
CONFIG_MM_REGIONS=2
CONFIG_IOB_NBUFFERS=124
CONFIG_IOB_THROTTLE=24
CONFIG_WIRELESS=y
CONFIG_POSIX_SPAWN_DEFAULT_STACKSIZE=2048
CONFIG_TLS_NELEM=4
CONFIG_TLS_TASK_NELEM=4
CONFIG_NETDB_DNSCLIENT=y
CONFIG_BUILTIN=y
CONFIG_HAVE_CXX=y
CONFIG_HAVE_CXXINITIALIZE=y
CONFIG_BENCHMARK_COREMARK=y
CONFIG_EXAMPLES_DHCPD=y
CONFIG_NETUTILS_DHCPD=y
CONFIG_NETUTILS_DHCPD_STACKSIZE=2048
CONFIG_NETINIT_WAPI_SSID="MY_PX4"
CONFIG_NETINIT_WAPI_PASSPHRASE="12345678"
CONFIG_NSH_LINELEN=128
CONFIG_NSH_MAXARGUMENTS=15
CONFIG_NSH_NESTDEPTH=8
CONFIG_NSH_BUILTIN_APPS=y
# CONFIG_NSH_CMDOPT_HEXDUMP is not set
CONFIG_NSH_FILEIOSIZE=512
CONFIG_NSH_ROMFSETC=y
CONFIG_NSH_CROMFSETC=y
CONFIG_NSH_ROMFSSECTSIZE=128
CONFIG_NSH_ARCHINIT=y
CONFIG_SYSTEM_ARGTABLE3=y
CONFIG_SYSTEM_NSH=y
CONFIG_WIRELESS_WAPI=y
CONFIG_WIRELESS_WAPI_CMDTOOL=y
CONFIG_WIRELESS_WAPI_STACKSIZE=8192
Hardware: ESP32S3-WROOM-1 M0N16R8
NuttX version: 12.4, commit: 0f169f50c4b234abde12a6a0b028a8fe8f62f5aa
Full source code: https://1drv.ms/u/c/008ed313fdaa343c/EaXGLgJs_3VLpahnyyVtaL4BgF3pUIa_6f1XHX_ZxOb-Ow?e=1iSk8w
GCC toolchain: https://github.com/espressif/crosstool-NG/releases/tag/esp-13.2.0_20240530
Hi @w2016561536, thank you for reporting the issue. Is there some way to reproduce it by creating a simple test on NuttX mainline, without using all these IMU files from PX4? If you can isolate the issue, it will help us find the root cause. Pinging @tmedicci for awareness.
Well, it seems to be difficult to reproduce, but I may have found the problem. In https://github.com/apache/nuttx/blob/b09b429308b991ba455cad57b53e0abaa423bf53/arch/xtensa/src/common/xtensa_user_handler.S#L363C1-L363C23 we can see that this function is not correctly implemented. In FreeRTOS, the implementation is https://github.com/espressif/esp-idf/blob/cadf80e8751caffaf25207a12bb65e5b188683ae/components/freertos/FreeRTOS-Kernel/portable/xtensa/xtensa_vectors.S#L990. That function also has a related issue, https://github.com/espressif/esp-idf/issues/11690, which is very similar to this one.
@tmedicci Do you think this problem is caused by the FPU?
Hi @w2016561536, I am not aware of it. Maybe you could try to implement the workaround, and I can evaluate it using our internal CI.
@w2016561536 did you try to save the FP registers?
If it fixes the issue, we should include it in mainline, maybe wrapped in #ifdef CONFIG_ARCH_FPU.
Hey guys, I am the reporter of the original problem in ESP-IDF FreeRTOS. Yes, this is a silent data corruption and the current ESP-IDF's interrupt vector assembly file has a fix. Regardless of whether this specific issue is caused by the same bug (very likely), you should update the vectors to match upstream :)
@ProfFan could you point out the patch? We can apply the change, thanks.
I guess he meant this one: https://github.com/espressif/esp-idf/issues/11690
But the change looks FreeRTOS-specific: https://github.com/espressif/esp-idf/commit/b03c8912c73fa59061d97a2f5fd5acddcc3fa356#diff-db429b5abb80b87b6da1abb1ecd103c81fc2d982780bd8a3f1a23494b1749155R1152
Perhaps this bug needs Espressif staff to work on it.
@fdcavalcanti @eren-terzioglu @tmedicci please take a look ^
Whether the FPU is involved can be checked by disabling the FPU and seeing whether the issue still reproduces with the integer-emulated math libs.
I'm sorry, @xiaoxiang781216, the issue ID that https://github.com/apache/nuttx/pull/14481 solves is a different one. I have already fixed it.
@w2016561536 maybe we can work together to get PX4 working on ESP32, ESP32-S2 and ESP32-S3. @henrykotze is working on PX4 for ESP32 and I want to run NuttX on ESP32-S2 to run on this device:
https://aliexpress.com/item/1005006845550308.html
Good idea! But I think an FPU is necessary for this complex task, and the esp32-s2 doesn't have one. I have tried to port PX4 to the esp32s3 and uploaded it to https://github.com/w2016561536/PX4-Autopilot/tree/px4_esp32s3. Also, Guanglun has finished PX4 for esp32: https://github.com/guanglun/PX4-Autopilot/tree/single_core_esp32
This problem seems to be fixed with the new toolchain: https://github.com/espressif/crosstool-NG/releases/tag/esp-14.2.0_20241119
Sorry for my wrong diagnosis. After updating to release 12.9 (and on 12.4 too), the problem still exists. The new compiler does not solve it completely.
Maybe we could look at this commit: d835b4bd03a9de594ac6d66998c1e916e76ab86f. In this commit, the FPU registers are saved and restored when a context switch happens. I have tried to apply it on esp32s3, and the FPU seems to work correctly. But I do not know how to fully test the FPU, and I do not know whether esp32s3 and esp32 have the same FPU design.
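For the open question of how to test the FPU across context switches, one possible approach is a small stress test. The sketch below is only a starting point, not an official NuttX test: `fpu_stress_run`, `fpu_worker_main`, `fpu_chain` and the two constants are hypothetical names and values chosen for illustration. Several threads each repeat the same chain of float operations with a per-thread input and compare every result with the first one they computed. If FP registers are not saved and restored correctly when a thread is switched out mid-chain, a later result should differ from the reference or become NaN.

```c
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

#define FPU_TEST_THREADS    4       /* Assumed number of worker threads */
#define FPU_TEST_ITERATIONS 200000  /* Assumed loop count per thread */

struct fpu_worker
{
  float seed;       /* Per-thread input, different for every thread */
  uint32_t errors;  /* Results that differed from the reference */
};

/* Chain of multiply-adds that keeps intermediate values in FP registers.
 * The volatile read stops the compiler from folding the loop away.
 */
static float fpu_chain(volatile const float *seed)
{
  float acc = 0.0f;
  int k;

  for (k = 0; k < 32; k++)
    {
      acc = acc * 0.5f + *seed;
    }

  return acc;
}

static void *fpu_worker_main(void *arg)
{
  struct fpu_worker *w = (struct fpu_worker *)arg;
  volatile float seed = w->seed;
  float expected = 0.0f;
  uint32_t i;

  for (i = 0; i < FPU_TEST_ITERATIONS; i++)
    {
      float got = fpu_chain(&seed);

      if (i == 0)
        {
          expected = got;  /* The first pass is the reference */
        }
      else if (got != expected)
        {
          w->errors++;     /* NaN also compares unequal, so it is counted */
        }

      if ((i & 0x3ff) == 0)
        {
          sched_yield();   /* Encourage switches between the workers */
        }
    }

  return NULL;
}

/* Returns the total number of mismatches, or -1 if a thread could not
 * be created.
 */
int fpu_stress_run(void)
{
  pthread_t tid[FPU_TEST_THREADS];
  struct fpu_worker workers[FPU_TEST_THREADS];
  uint32_t total = 0;
  int started = 0;
  int i;

  for (i = 0; i < FPU_TEST_THREADS; i++)
    {
      workers[i].seed = 1.25f + (float)i;
      workers[i].errors = 0;
      if (pthread_create(&tid[i], NULL, fpu_worker_main, &workers[i]) != 0)
        {
          break;
        }

      started++;
    }

  for (i = 0; i < started; i++)
    {
      pthread_join(tid[i], NULL);
      total += workers[i].errors;
    }

  return started == FPU_TEST_THREADS ? (int)total : -1;
}
```

A return value of 0 means no corruption was observed, which is evidence but not proof that the save and restore path is correct. Running it alongside interrupt-heavy work such as Wi-Fi traffic, and for longer than the several minutes the original failure needed, would make a negative result more convincing.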