pico-feedback icon indicating copy to clipboard operation
pico-feedback copied to clipboard

Bug in UART FIFO with non-blocking background send?

Open flaviut opened this issue 3 years ago • 2 comments

This is following up on my post in the forum at https://forums.raspberrypi.com/viewtopic.php?p=2050416

In short: unless I artificially slow down writing data to the UART FIFO, 15-byte chunks of my outgoing messages simply disappear. This doesn't happen every time, but it does happen every other time or so, depending on the length of the message being sent:

checksum mismatch: buf=7e002c10010000000000000000fffe6d61696e6c6f6f703a203333757320283939352073616d706c65732920 ('~,~mainloop: 33us (995 samples) '), len=44, actualCsum=86, expectedCsum=32
> Packet being parsed doesn't fit allocated buffer.
Consider increasing parser_buffer_size option.
P: mainloop: 33us (996 samples) per execution
P: Temperature: 20 C
checksum mismatch: buf=7e002c10010000000000000000fffe6d61696e6c6f6f703a203333757320283939362073616d706c65732920 ('~,~mainloop: 33us (996 samples) '), len=44, actualCsum=85, expectedCsum=32
> Packet being parsed doesn't fit allocated buffer.
Consider increasing parser_buffer_size option.
P: mainloop: 32us (996 samples) per execution
P: Temperature: 22 C

I configure my UART like this:

    // ignore the actual baud rate, probably off by 5-10Hz or so & we can't
    // warn about it since our logging isn't ready yet
    [[maybe_unused]] auto actualBaud = uart_init(uart, baud);

    gpio_set_function(pins.tx, GPIO_FUNC_UART);
    gpio_set_function(pins.rx, GPIO_FUNC_UART);

    uart_set_fifo_enabled(uart, true);
    uart_set_hw_flow(uart, false, false);
    uart_set_format(uart, 8, 1, UART_PARITY_NONE);

    // override the interrupt thresholds
    // set rx threshold: becomes > 7/8 full, b100
    hw_write_masked(&uart_get_hw(uart)->ifls, 0b100, UART_UARTIFLS_RXIFLSEL_BITS);
    // set tx threshold: becomes < 1/8 full, b000
    hw_write_masked(&uart_get_hw(uart)->ifls, 0b000, UART_UARTIFLS_TXIFLSEL_BITS);

    uartIV.register_callback(instance, irqCallback);
    int irq;
    if (instance == 0) {
        irq = UART0_IRQ;
        irq_set_exclusive_handler(irq, uartIrqHandler0);
    } else {
        irq = UART1_IRQ;
        irq_set_exclusive_handler(irq, uartIrqHandler1);
    }
    // enable the kinds of interrupts we want to handle
    hw_set_bits(&uart_get_hw(uart)->imsc,
                UART_UARTIMSC_FEIM_BITS | UART_UARTIMSC_PEIM_BITS |
                UART_UARTIMSC_BEIM_BITS | UART_UARTICR_OEIC_BITS |
                UART_UARTIMSC_RTIM_BITS |
                UART_UARTIMSC_TXIM_BITS | UART_UARTIMSC_RXIM_BITS);
    irq_set_enabled(irq, true);

The initial send happens by copying all the data into txBuffer, and then calling sendAllNonblocking from the non-ISR context.

The relevant part of my ISR handler is:

__attribute__((flatten)) __attribute__((optimize("O3")))
void RpUart::sendAllNonblocking() {
    defaultCover.assertSync();  // verify isrs are disabled
    volatile uart_hw_t *regs = uart_get_hw(uart);
    while (uart_is_writable(uart) && !txBuffer.empty()) {
        regs->dr = txBuffer.front(); // I'd previously tried uart_putc_raw too
        txBuffer.pop_front();
    }
}

I've done some additional testing with a delay inside the loop long enough that each message gets fully sent out within a single call to sendAllNonblocking():

__attribute__((flatten)) __attribute__((optimize("O3")))
void RpUart::sendAllNonblocking() {
    defaultCover.assertSync();
    volatile uart_hw_t *regs = uart_get_hw(uart);
    while (uart_is_writable(uart) && !txBuffer.empty()) {

        for(volatile int i = 0; i < 800; i++) {
            tight_loop_contents();
        }

        regs->dr = txBuffer.front();
        txBuffer.pop_front();
    }

This works fine, but now interrupts are disabled for a significant period of time.

flaviut avatar Nov 05 '22 18:11 flaviut

I don't think this a bug with the SDK--I'm using very low-level methods generally, not something complicated with lots of places for bugs to hide. I think there's some sort of bug with the actual UART FIFO hardware, but I'm not familiar with how hardware works to say anything conclusively.

flaviut avatar Nov 06 '22 02:11 flaviut

So it turns out that isn't a bug after all--and it is in the SDK.

Looks like by default, the PICO_STDIO_UART option is set, which means that the default stdio UART driver is loaded in.

This is a problem in this case, since I wrote my own custom driver here, and having them both try and output to the UART simultaneously causes the observed data corruption. The fix is straightforward:

# we need to remove the default UART driver since we replace it with our own, and having it
# in place will cause corruption
set(PICO_STDIO_UART 0)

Leaving this open in case you want to track it as a documentation issue, but otherwise, I have no further problems/questions here.

flaviut avatar Nov 06 '22 17:11 flaviut