arduino-esp32 icon indicating copy to clipboard operation
arduino-esp32 copied to clipboard

Issue realted ESP32-S2 CDC to the Serial.setDebugOutput(true);

Open SuGlider opened this issue 3 years ago • 11 comments

I am starting to think that the current cause of the problem (on v2.0.3 stable) is some kind of race/lockup between two or more tasks, all of which want to write to the USB CDC. In my case, this happens due to concurrent use of log_X statements in more than one task. This scenario works correctly when using native serial pins.

Consider the following sketch:

#include <Arduino.h>

void setup()
{
  Serial.begin(115200);
  // setDebugOutput() call commented out for test
  //Serial.setDebugOutput(true);
  delay(2000);
}

void loop()
{
  static uint32_t counter = 1;

  Serial.printf("Using Serial.printf(), this is line %u of output, at msec %u...\r\n", counter, millis());
  //log_i("Using log_i(), this is line %u of output, at msec %u...", counter, millis());
  counter++;

  // No delay at all, blast port at full speed!
}

This is a single task using the serial port. When using USB CDC as the serial port, no combination of use or non-use of setDebugOutput and Serial.printf/log_i manages to lockup the board.

However, my sketches use WiFi, which starts a couple of tasks, each outputting via log_X. Also, AsyncTCPSock, which starts its own task. With this, I have noticed that the startup process, where all of these tasks display debugging strings, is the more likely place where the sketch hangs on output. After this, only the main loop keeps outputting strings, and it runs more or less normally.

Originally posted by @avillacis in https://github.com/espressif/arduino-esp32/issues/6221#issuecomment-1124215183

I have been reviewing the code of USBCDC.cpp , which handles the USB serial port in ESP32-S2. So far I have noticed the following:

  • The setDebugOutput() call installs a function, cdc0_write_char as a character output for debugging with ets_install_putc1.
  • The cdc0_write_char function, in turn, calls USBCDC::write(uint8_t), and through it, USBCDC::write(const uint8_t *, size_t).
  • The USBCDC::write(const uint8_t *, size_t) appears to acknowledge that it might be called from within an ISR context, as it checks with xPortInIsrContext(), and uses xSemaphoreTakeFromISR() or xSemaphoreTake() depending on the result.
  • According to the Espressif documentation, xSemaphoreTakeFromISR() outputs a flag, pxHigherPriorityTaskWoken, that is supposed to inform whether a higher-priority task was awoken as a result of calling the ISR-specific function. And if so, the function is supposed to request a context switch by calling portYIELD_FROM_ISR().
  • HOWEVER... the current implementation of USBCDC::write(const uint8_t *, size_t) collects this flag, only to discard it without taking any action based on its value, and certainly without calling portYIELD_FROM_ISR() anywhere in the function body.

So, what possible consequences might arise from failing to invokeportYIELD_FROM_ISR() as indicated in the documentation?

https://github.com/espressif/arduino-esp32/issues/6221#issuecomment-1125187189

SuGlider avatar May 17 '22 16:05 SuGlider

@avillacis Let's follow the debug CDC issue here.

SuGlider avatar May 17 '22 16:05 SuGlider

At last I have a minimal reproducer sketch that reliably locks up the ESP32S2 board when using USB CDC as debug output:

#include <Arduino.h>

static void test_taskfunc(void * p)
{
    uint32_t tasknum = (uint32_t)p;

    while (true) {
        log_i("This is task instance %u running...", tasknum);
        
        delay(10 * random(1, 10));
    }
}

void setup()
{
    Serial.begin(115200);
    Serial.setDebugOutput(true);
    delay(2000);

    for (auto i = 0; i < 5; i++) {
        TaskHandle_t th;

        // Created tasks all need to have different priorities in order to trigger lockup
        UBaseType_t taskPrio = 1 + 2 + i;

        if (pdPASS == xTaskCreate(test_taskfunc, "test_taskfunc", 4096, (void *)(i + 1), taskPrio, &th)) {
            log_i("Task instance %u created with priority %u", i+1, 3 + i);
        } else {
            log_e("Task instance %u creation FAILED", i+1);
        }
    }
}

void loop()
{
    log_i("This is the main loop running...");

    delay(10 * random(1, 10));
}

Note that the sketch creates multiple tasks with different priorities. If all created tasks share the same priority, the output gets garbled but the sketch keeps running along. However, the sketch as shown above locks up the board as soon as the /dev/ttyACM0 (Linux) character device is opened for monitoring using any terminal program:

[  2758][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 1 running...
[  2764][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 1 created with priority 3
[  2770][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 2 running...
[  2776][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 2 created with priority 4
[  2783][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 3 running...
[  2785][I][test-esp32s2-multiple-tasks.ino:27] setup(): Task instance 3 created with priority 5
[  2791][I][

Also note that task creation itself without monitoring does NOT trigger the lockup. If the device is left alone without opening the character device, all tasks are created and run normally. But if the character device is then opened, the device locks up again:

$ miniterm.py /dev/ttyACM0 115200
--- Miniterm on /dev/ttyACM0  115200,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
[  8361][I][test-esp32s2-multiple-tasks.ino:8] test_taskfunc(): This is task instance 2 running...
[  8368][I][test-esp32s2-multiple-tasks.ino:36] loop(): Thi[  837

avillacis avatar May 18 '22 17:05 avillacis

Also note, the lockup requires the use of the log_X functions. If every log_X is replaced by direct Serial.printf calls, the lockup does not happen, even when monitored.

avillacis avatar May 18 '22 17:05 avillacis

OK, I'll look into it and try to find out what is the issue here.

SuGlider avatar May 18 '22 18:05 SuGlider

Still happening on just-released Arduino-ESP32 2.0.4

avillacis avatar Jul 06 '22 20:07 avillacis

Ditto, Have to throttle USBSerial tx or else serial stops and or board freezes.

Have to disable debugoutput altogether or I get 2 lines and it stops

tablatronix avatar Sep 09 '22 16:09 tablatronix

Still happening on just-released Arduino-ESP32 2.0.5

avillacis avatar Sep 19 '22 21:09 avillacis

I'll investigate it...

SuGlider avatar Sep 20 '22 15:09 SuGlider

@avillacis @tablatronix Please try the Fix on PR #7284

SuGlider avatar Sep 21 '22 10:09 SuGlider

@avillacis @tablatronix Please try the Fix on PR #7284

The lockup I was experiencing is technically "fixed" with the PR. The strings themselves appear truncated, but the rest appears when the program outputs more data using log_X functions.

avillacis avatar Sep 21 '22 17:09 avillacis

Looks much better, no lockup , wifi works, I see all logging as far as I can tell. Not seeing anything truncated, ill try turning on more debugging

tablatronix avatar Sep 21 '22 18:09 tablatronix

BUMP @me-no-dev @SuGlider we need traction on this issue

tablatronix avatar Oct 04 '22 20:10 tablatronix

@tablatronix still into investigation of the cause. @SuGlider is on vacation currently and will get back on it next week.

me-no-dev avatar Oct 05 '22 07:10 me-no-dev

Bumpity bump bump

tablatronix avatar Oct 23 '22 13:10 tablatronix

Hit this bug on a project this weekend. Can confirm https://github.com/espressif/arduino-esp32/pull/7284 fixes things.

KishCom avatar Nov 13 '22 17:11 KishCom

I can confirm it too!

paclema avatar May 19 '23 18:05 paclema

applied for both, v2.X and also v3.0.

VojtechBartoska avatar Dec 20 '23 15:12 VojtechBartoska