threadx icon indicating copy to clipboard operation
threadx copied to clipboard

Seams that task sheduler handles timeouts incorrectly.

Open asamuta opened this issue 1 year ago • 3 comments

System components timeout issue Call to system functions like tx_event_flags_get, tx_semaphore_get etc. with timeout != TX_WAIT_FOREVER results in thread wake before timeout in some conditions.

Description In my program I am using threadx with the netxduo on with STM32H723 processor. From the high level the program creates server socket and receives commands from the network. There is a command to make a USART transaction and return the response back via the socket. The transaction is just USART send/receive with timeout. There is "DeviceMaster" thread, it makes USART transactions. USART is configured to transmit/receive data using DMA and USART idle interrupt.

DMA TX ISR handle:

...
tx_event_flags_set(&deviceMaster.eventFlags, DM_WRITE_FLAG, TX_OR);
...

USART ISR:

    /* Check for IDLE line interrupt */
    if (LL_USART_IsEnabledIT_IDLE(UART_INSTANCE) && LL_USART_IsActiveFlag_IDLE(UART_INSTANCE)) {
        LL_USART_ClearFlag_IDLE(UART_INSTANCE);                     /* Clear IDLE line flag */
        ...
        if (tx_event_flags_set(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR) != TX_SUCCESS) {
            Error_Handler();
        }
    }

Transaction processing in the thread pseudo code, simplified:

// Checks if previous send completed
tx_event_flags_get(&deviceMaster.eventFlags, DM_WRITE_FLAG, TX_OR_CLEAR, &flags, 10);
sendDataToUsart();
// wait for remote device response with timeout=10
tx_event_flags_get(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR_CLEAR, &flags, 10);
getDataFromUsart()

Problem Sometimes call tx_event_flags_get(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR_CLEAR, &flags, 10); takes 1-2 ticks instead of 10 (ends before timeout). I have investigated the behavior a little bit using debug out and TraceX. What I have fonud is that ethernet ISR happened when the "DeviceMaster" thread suspended wakes not only ethernet-related thread but the "DeviceMaster" as well. Attaching TraceX file.

TraceX good/expected flow Event/Description

1017 - Transaction start marker; start DMA transfer
1018 - check for DM_WRITE_FLAG
1019 - DM_WRITE_FLAG set in USART TX DMA
1020 - Get time to check later (tx_get_time)
1021 - Wait for DM_READ_FLAG flag
1022 -  "DeviceMaster" suspended
...
1024 - DM_READ_FLAG set in the USART RX ISR
1025 - system resumes "DeviceMaster"
1026 - Transaction end marker

TraceX problem flow

1036 - Transaction start marker; start DMA transfer
1037 - check for DM_WRITE_FLAG
1038 - DM_WRITE_FLAG set in USART TX DMA
1039 - Get time to check later (tx_get_time)
1040 - Wait for DM_READ_FLAG flag
1041 - "DeviceMaster" suspended
1042 - "NetX ip instance" has started
...
1047 - set event
1048 - scheduler resumes "NetX ip instance"
1049 - scheduler resumes "DeviceMaster", but it still should wait for the timeout.
1050 - "DeviceMaster" gets console mutex to print timeout error
1051 - "DeviceMaster" puts console mutex back
1052 - "DeviceMaster" gets time to check how many ticks the thread was suspended.
1053 - Custom TraceX event that indicates the error and the number of ticks the thread was suspended = 2 ticks, not 10

Assumptions about root cause I have taken a quick look at the threadx sources and assume that the following logic in tx_timer_expiration_process.c file probably is not correct. this and this

if (current_timer -> tx_timer_internal_remaining_ticks > TX_TIMER_ENTRIES) 
{
...
} else {
 /* Timer did expire.  */
}                    

Not sure, but looks like in case if tx_timer_internal_remaining_ticks < 32 the system considers the timer is expired. But in my case the timer value is just 10...

Expected behavior The system scheduler wakes threads when the timeout happened not before.

Impact Timeouts work incorrectly across the system

TraceX files CustomEvents.zip Ticks to microsecond is 544 in this case.

asamuta avatar Sep 27 '24 23:09 asamuta