threadx icon indicating copy to clipboard operation
threadx copied to clipboard

vfp_enable breaks FileX on Cortex-R5

Open errikosmes opened this issue 1 year ago • 6 comments

Description

  • The vfp_enable function seems to break something in FileX on Cortex-R5.
    • fx_file_create with a RAM disk returns 0x89 (FX_SECTOR_INVALID)
      • Traced down to logical_sector in fx_directory_entry_read.c:236 calculation being wrong. FX struct corruption ?
        /* At this point, the directory data sector needs to be read.  */
        logical_sector =    ((ULONG)media_ptr -> fx_media_data_sector_start) +
            (((ULONG)cluster - FX_FAT_ENTRY_START) *
             ((ULONG)media_ptr -> fx_media_sectors_per_cluster)) +
            relative_sector;
  • The Cortex-R5 vfp implementation seems a little bit suspect
    • To activate vfp support tx_thread_schedule.S modifies a field in TX_THREAD with an offset of 144 but there is no dedicated vfp field.
    • There is a comment in TX_THREAD saying that "Nothing after this point is referenced by the target-specific assembly language. ". The 144 offset violates that
    • TX_THREAD fields can move depending on ifdefs and/or compiler padding. But the vfp context handling relies on a fixed offset to an undefined location.
#ifdef TX_ENABLE_VFP_SUPPORT
    .global tx_thread_vfp_enable
    .type  tx_thread_vfp_enable,function
tx_thread_vfp_enable:
    MRS     r2, CPSR                            @ Pickup the CPSR
#ifdef TX_ENABLE_FIQ_SUPPORT
    CPSID   if                                  @ Disable IRQ and FIQ interrupts
#else
    CPSID   i                                   @ Disable IRQ interrupts
#endif
    LDR     r0, =_tx_thread_current_ptr         @ Build current thread pointer address
    LDR     r1, [r0]                            @ Pickup current thread pointer
    CMP     r1, #0                              @ Check for NULL thread pointer
    BEQ     __tx_no_thread_to_enable            @ If NULL, skip VFP enable
    MOV     r0, #1                              @ Build enable value
    STR     r0, [r1, #144]                      @ Set the VFP enable flag (tx_thread_vfp_enable field in TX_THREAD)
__tx_no_thread_to_enable:
    MSR     CPSR_cxsf, r2                       @ Recover CPSR
    BX      LR                                  @ Return to caller

Environment

  • Cortex-R5 on a Xilinx UltraScale+ SoC
  • armr5-none-eabi-gcc 10.2.0, using -o0
  • 6.1.11 ThreadX
  • 6.1.8 FileX

To Reproduce

  • You can use this minimal example:
#include <stdio.h>

#include "fx_api.h"
#include "tx_api.h"
#include "tx_zynqmp.h"
#include "xparameters.h"

extern VOID _fx_ram_driver(FX_MEDIA *media_ptr);  // Define RAM device driver entry.

#define STACK_SIZE 10000
#define KERNEL_BYTE_POOL_SIZE 20000

// Memory used for the thread stacks. ARM-R5 stacks need to be aligned 64-bit
static uint8_t s_memoryPool[KERNEL_BYTE_POOL_SIZE] __attribute__((aligned(8)));

TX_BYTE_POOL g_kernelPool;
static TX_THREAD s_thread;

void thread_entry(uint32_t thread_input);

void tx_application_define(void *first_unused_memory) {
    char *threadStackPointer;
    threadStackPointer = (char *)first_unused_memory;  // Put first available memory address
                                                       // into a character pointer.
    uint32_t txStatus;

    // Create a memory pool for the thread stacks (kernel pool)
    txStatus = tx_byte_pool_create(&g_kernelPool, (char *)"kernel pool", s_memoryPool,
                                   KERNEL_BYTE_POOL_SIZE);
    if (txStatus != TX_SUCCESS) printf("line=%d, error\r\n", __LINE__);

    // Allocate a stack for the thread
    txStatus =
        tx_byte_allocate(&g_kernelPool, (void **)&threadStackPointer, STACK_SIZE, TX_NO_WAIT);
    if (txStatus != TX_SUCCESS) printf("line=%d, error\r\n", __LINE__);

    txStatus = tx_thread_create(
        &s_thread,              // Pointer to a thread control block
        (char *)"Test thread",  // Pointer to the name of the thread
        thread_entry,           // Thread function pointer
        0,                      // 32-bit value passed to the thread
        threadStackPointer,     // Stack start
        STACK_SIZE,             // Stack size
        1,                      // Thread priority
        1,                 // Thread preemption threshold **NEEDS TO BE THE SAME AS thread priority
        TX_NO_TIME_SLICE,  // Number of ticks allowed to run before executing thread with the same
                           // priority lvl
        TX_AUTO_START);    // Start immediately or wait
    if (txStatus != TX_SUCCESS) printf("line=%d, error\r\n", __LINE__);

    // Initialize software modules and peripherals
    fx_system_initialize();  // Init FileX filesystem
}

int main(void) {
    tx_zynqmp_gic_init();     // Initialize global interrupt controller
    tx_zynqmp_malloc_init();  // Use a mutex when calling malloc
    tx_kernel_enter();        // Enter the ThreadX kernel.
}

#define RAMDISK_SECTOR_SIZE 128
#define RAMDISK_NB_SECTORS 256

FX_MEDIA g_RamDisk;                                                     ///< RAM disk handler
static uint8_t s_ramDiskMem[RAMDISK_NB_SECTORS * RAMDISK_SECTOR_SIZE];  ///< RAM disk memory
static uint8_t s_ramDiskCache[RAMDISK_SECTOR_SIZE];                     ///< RAM disk cache

void thread_entry(uint32_t thread_input) {
    tx_thread_vfp_enable();
    uint32_t fxStatus = FX_SUCCESS;

    printf("Thread entry\r\n");

    /*
     * Initialize RAM disk
     */
    char *kermitRamDiskName = (char *)"RAM DISK";
    fxStatus = fx_media_format(&g_RamDisk,              ///< RAM disk struct
                               _fx_ram_driver,          ///< RAM disk driver entry
                               s_ramDiskMem,            ///< RAM disk memory pointer
                               s_ramDiskCache,          ///< Media cache buffer pointer
                               sizeof(s_ramDiskCache),  ///< Media cache buffer size
                               kermitRamDiskName,       ///< Volume Name
                               1,                       ///< Number of FATs
                               32,                      ///< Directory Entries
                               0,                       ///< Hidden sectors
                               RAMDISK_NB_SECTORS,      ///< Total sectors
                               RAMDISK_SECTOR_SIZE,     ///< Sector size
                               1,                       ///< Sectors per cluster
                               1,                       ///< Heads
                               1);                      ///< Sectors per track
    if (fxStatus != FX_SUCCESS) printf("line=%d, error=0x%lx\r\n", __LINE__, fxStatus);

    fxStatus = fx_media_open(&g_RamDisk, kermitRamDiskName, _fx_ram_driver, s_ramDiskMem,
                             s_ramDiskCache, sizeof(s_ramDiskCache));
    if (fxStatus != FX_SUCCESS) printf("line=%d, error=0x%lx\r\n", __LINE__, fxStatus);

    const char *file_name = "test_file";

    printf("vfp enabled\r\n");
    fxStatus = fx_file_create(&g_RamDisk, (char *)file_name);
    if (fxStatus == FX_SUCCESS) {
        printf("filex OK\r\n");
    }
    else {
        printf("filex error=0x%lx\r\n", fxStatus);
    }

    tx_thread_vfp_disable();
    printf("vfp disabled\r\n");
    fxStatus = fx_file_create(&g_RamDisk, (char *)file_name);
    if (fxStatus == FX_SUCCESS) {
        printf("filex OK\r\n");
    }
    else {
        printf("filex error=0x%lx\r\n", fxStatus);
    }
}
  • Activating vfp makes FileX fail, deactivating it makes it work again...

Impact

  • This bug, especially when looking at the assembly code implementation with the 144 offset puts into question the vfp implementation in general.
  • If it's the cause for FileX breaking what else could it break
  • What if TX_THREAD + 144 happens to be 0 at one point and vfp handling gets deactivated resulting in float reg corruption. Really hard to debug if it happens

errikosmes avatar Apr 26 '24 12:04 errikosmes

This seems to verify my concern with the R5 port and the TX_THREAD struct.

On the cortex A9 port, TX_THREAD_EXTENSION_2 is defined to be ULONG tx_thread_vfp_enable; Link. This allocates a flag to store the vfp enable. See tx_api.h

On the cortex R5 port, TX_THREAD_EXTENSION_2 is defined to nothing. So TX_THREAD + 144 points to the next thing in the struct which is tx_thread_filex_ptr, see here

So it makes sense that vfp_enable is going to break FileX since it'll write over the tx_thread_filex_ptr value.

You can try to update the line here on the cortex R5 to match the A9 port like this: #define TX_THREAD_EXTENSION_2 ULONG tx_thread_vfp_enable;

This should allocate space for that flag in the TX_STRUCT but not sure if it's going to throw off any other offsets.

austinawolf avatar May 30 '24 15:05 austinawolf

My apologies for the delayed response. I am cleaning our issue and PR backlog right now.

I will flag this issue for discussion with our group of committers.

fdesbiens avatar Feb 27 '25 15:02 fdesbiens

@eclipse-threadx/iot-threadx-committers and @eclipse-threadx/iot-threadx-contributors: Please provide your feedback.

fdesbiens avatar Mar 19 '25 18:03 fdesbiens

This is a valid problem. The tx_port.h file must be updated - similar to the Cortex-A9 tx_port.h file - to support floating point save/restore.

billlamiework avatar Apr 14 '25 20:04 billlamiework

This could impact additional ports. @billlamiework is to look deeper into this.

fdesbiens avatar Apr 15 '25 16:04 fdesbiens

Looks like we would need to review at least all the Cortex-R4 and Cortex-R5 ports. The GNU versions for Cortex-R4/R5 are missing the information in tx_port.h. The IAR ports look reasonable. However, all the ports should be reviewed, e.g., Arm5/6, GHS, GNU, and IAR. Also, the assembly code is assuming a 144 offset in the thread control block for the tx_thread_vfp_enable field. Hence, the FileX tx_thread_filex_ptr addition should be somewhere after tx_thread_vfp_enable.

billlamiework avatar Apr 22 '25 00:04 billlamiework