Unplug from hub on RP2040 using PIO-USB fails with a broken unplug event

Open harbaum opened this issue 2 months ago • 12 comments

Operating System

Linux

Commit SHA

8f2e3ed4418a08cc13aee4527b7bdd1b8bb1bf55, current master

Board

Pi Pico 1 / RP2040

Firmware

Custom firmware for FPGA Companion https://github.com/MiSTle-Dev/FPGA-Companion

Running inside FreeRTOS, using Pico-PIO-USB, also latest master

This is actually used extensively for the MiSTle retro gaming project, where TinyUSB runs on an RP2040 to handle keyboard, mouse and joysticks in an FPGA retro gaming setup.

What happened ?

Sometimes (with debugging enabled) or very often (without debugging), unplug events on a CH334 hub are missed and the stack goes nuts as it still tries to communicate with the now missing devices.

How to reproduce ?

Use the FPGA Companion on a MiSTle setup and unplug a device.

Debug Log as txt file (LOG/CFG_TUSB_DEBUG=2)

The HUB xfer callback for the unplug event fails:

[2] Claimed EP 0x81
  Queue EP 81 with 64 bytes ... 
OK
[:5] on EP 81 with 0 bytes: FAILED
  HUB xfer callback
[5] Claimed EP 0x81
  Queue EP 81 with 1 bytes ... 
OK

Afterward, the stack goes nuts as it didn't get the unplug event and still tries to communicate with the now missing devices.

If I ignore the test for a good result in https://github.com/hathach/tinyusb/blob/8f2e3ed4418a08cc13aee4527b7bdd1b8bb1bf55/src/host/hub.c#L364 like so

  /*  if (result == XFER_RESULT_SUCCESS) */ {

Then, the broken message still contains enough information to actually handle the unplug event correctly. The status change byte is especially correct:

[:5] on EP 81 with 0 bytes: FAILED
  HUB xfer callback
  Processing failed hub cb, anyways
  Hub Status Change = 0x08
HUB Get Port Status: addr = 5 port = 3
[1:5] Class Request: A3 00 00 00 03 00 04 00 
[:5] on EP 00 with 8 bytes: OK
[:5] on EP 80 with 4 bytes: OK
[1:5] Control data:
  0000:  00 01 01 00                                      |....|
[:5] on EP 00 with 0 bytes: OK

With this change, everything works fine. It is an acceptable workaround for now, but it doesn't sound like a proper solution.
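For reference, the values in the log above can be decoded by hand. This is a minimal sketch, assuming the standard USB 2.0 hub class layout (status change bitmap with bit 0 for the hub and bit N for port N; 4-byte little-endian port status reply). The function and type names are made up for illustration and are not part of TinyUSB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Bit positions from the USB 2.0 hub class definition.
#define PORT_STATUS_CONNECTION (1u << 0)  // wPortStatus: device present
#define PORT_CHANGE_CONNECTION (1u << 0)  // wPortChange: connection changed

// Hypothetical helper: lowest port flagged in a one-byte status change
// bitmap, or 0 if no port bit is set (bit 0 refers to the hub itself).
uint8_t first_changed_port(uint8_t bitmap) {
  for (uint8_t port = 1; port <= 7; port++) {
    if (bitmap & (1u << port)) return port;
  }
  return 0;
}

typedef struct {
  uint16_t status;  // wPortStatus
  uint16_t change;  // wPortChange
} port_status_t;

// Hypothetical helper: decode the 4-byte GET_STATUS (port) reply.
port_status_t parse_port_status(const uint8_t *raw) {
  assert(raw != NULL);
  port_status_t ps;
  ps.status = (uint16_t)(raw[0] | (raw[1] << 8));
  ps.change = (uint16_t)(raw[2] | (raw[3] << 8));
  return ps;
}

// True when the connection state changed and nothing is attached anymore.
bool is_unplug(port_status_t ps) {
  return (ps.change & PORT_CHANGE_CONNECTION) != 0 &&
         (ps.status & PORT_STATUS_CONNECTION) == 0;
}
```

Fed with the logged values, `first_changed_port(0x08)` gives port 3, and the reply `00 01 01 00` decodes to a connection change with no device present, which is consistent with an unplug on port 3.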

How would I debug this further? The hub is soldered on-board, so I cannot simply test a different one.

I'll try to set up a bare-bones system with a different hub. I'll also try to set up a variant that uses the RP2040's native USB vs. the PIO-USB we are currently using (and which we'd like to continue using, as PIO can be used through the pin headers).

Screenshots

No response

I have checked existing issues, discussion and documentation

  • [x] I confirm I have checked existing issues, discussion and documentation.

harbaum avatar Oct 10 '25 20:10 harbaum

There is no way for me to reproduce this. Did the issue occur with a standard Pico + CH334 hub breakout? https://www.adafruit.com/product/5997?srsltid=AfmBOorxxg4urH2TaKiIJxCo9qGPV8zi_PXtvnmkGWHy3I9VnrJ7Jd9f

hathach avatar Oct 11 '25 00:10 hathach

I am aware that it's not easy to reproduce. I am not expecting you to debug this. But you may have some ideas and hints.

I'll try to reproduce this with a minimal setup and also see if different hubs or the rp2040's native port make a difference.

I'll report here about my findings.

harbaum avatar Oct 11 '25 08:10 harbaum

Please also try to reproduce it with a standard Pico + CH334 hub, to rule out a hardware-related issue as well.

hathach avatar Oct 11 '25 08:10 hathach

Some more results:

  • the same happens in a minimal software setup, basically only running tinyusb on freertos
  • the same happens with an off-the-shelf Pi-Pico with an off-the-shelf CH334 hub
  • the same happens with a different hub

The same does not happen if I use the rp2040's native USB. So this may actually be a problem with pio-usb rather than the tinyusb stack itself.

Interestingly, some devices also don't enumerate anymore since I updated from the current pico-sdk to the latest tinyusb and pio-usb.

I'll do some more research.

harbaum avatar Oct 13 '25 09:10 harbaum

Interestingly, the enumeration issues that came with the latest tinyusb/pio-usb only happen with a CPU clock < 156 MHz. At higher CPU speeds, they are gone. However, the hub/unplug issues seem to be unaffected by this.

harbaum avatar Oct 13 '25 10:10 harbaum

This may be related/the same issue:

https://github.com/sekigon-gonnoc/Pico-PIO-USB/issues/149

harbaum avatar Oct 13 '25 12:10 harbaum

This is a PIO timing/race condition. I fixed/improved a similar issue here: https://github.com/sekigon-gonnoc/Pico-PIO-USB/pull/186 . Basically, the CPU does not respond fast enough to meet the USB timing specs, and the peer treats it as a timeout. Please try the stock example host/device_info on bare metal instead of FreeRTOS. I have barely used the RP2040 with FreeRTOS; the last time I tried, I made a PR to pico-sdk, but it got zero feedback.

hathach avatar Oct 13 '25 13:10 hathach

I haven't tested without freertos. But with my minimal test setup it shouldn't be too complex to do.

Unfortunately that won't be a solution for my application. But I'll give it a try, anyways.

harbaum avatar Oct 13 '25 14:10 harbaum

Removing FreeRTOS does not make a difference. Also, the HUB doesn't seem to matter and the device being plugged also doesn't matter.

What matters is whether the device being unplugged is actually being polled. This is the case for all of my HID devices, and thus this happens quite reliably for any HID device I unplug. This does not happen if I don't poll for HID reports.

I'd actually assume the HID Controller example would expose the same problem.

harbaum avatar Oct 14 '25 09:10 harbaum

Which example are you testing with: host/device_info, host/hid_controller, or host/cdc_msc_hid?

hathach avatar Oct 14 '25 10:10 hathach

I am still using my own code, which is now reduced to the bare minimum and has actually less functionality left than any of the official examples.

There are a few things I do see happen. While tinyusb is retrying for the HUB unplug event, it keeps generating 0 byte application callbacks for the HID reports. My code immediately requests to receive the next report, just like the hid_controller demo does at https://github.com/hathach/tinyusb/blob/38255ffc3879e9aef12796ce25b58a2c654ce67d/examples/host/hid_controller/src/hid_app.c#L317

In case of the device already being unplugged this causes fast retries which in turn seem to be part of the problem.

I have yet to understand the HID report part fully. Is tinyusb handling the interrupt request rate itself? Or should the application handle this in the report callback? How is the interrupt rate to be handled in case of USB communication errors?

Also, some devices seem to cause frequent 0 byte callbacks, while others don't.

I just tried to only request another report if the callback was called with a non-zero-length. This means that once the device is unplugged and a 0-byte callback happens, no further report is requested. This indeed seems to solve the problem. But that IMHO means that any problem, even intermittent ones, will completely stop the report polling and thus stop the device from working.
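A middle ground between "always re-request" and "stop on the first empty callback" could be to tolerate a limited run of zero-length callbacks. The following is only a sketch of that idea: the limit of 8 and all names are hypothetical and not TinyUSB settings, and a device that legitimately sends zero-length reports would still be cut off eventually.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical limit chosen for illustration only.
#define MAX_EMPTY_REPORTS 8u

typedef struct {
  uint8_t empty_streak;  // consecutive zero-length callbacks seen so far
} hid_poll_state_t;

// Call once per report callback with the reported length.
// Returns true if the application should request the next report.
bool hid_poll_should_continue(hid_poll_state_t *state, uint16_t len) {
  assert(state != NULL);
  if (len > 0) {
    state->empty_streak = 0;  // a real report resets the streak
    return true;
  }
  if (state->empty_streak < MAX_EMPTY_REPORTS) {
    state->empty_streak++;
  }
  // Keep polling until the streak reaches the limit.
  return state->empty_streak < MAX_EMPTY_REPORTS;
}
```

In an application this would sit inside `tuh_hid_report_received_cb()`, with `tuh_hid_receive_report()` called only when the function returns true, using one state struct per device address and instance.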

It seems to me that the callback currently cannot distinguish between the different problem cases. In case of e.g. a NAK it may definitely make sense to continue to poll at the full rate. But in case of a timeout, it may make sense to reduce the polling rate or even stop polling at all. Another solution may be to stop retries on a downlink device while communication with the hub itself is running a retry? I mean if communication with the hub is in trouble it doesn't sound like a good idea to actually continue to communicate over it?

Is tinyusb flooding the unplugged device too much to keep the communication with the hub working?

harbaum avatar Oct 14 '25 11:10 harbaum

I just found my device that sends zero-length reports under normal circumstances. It's a Competition Pro USB joystick. In the app callback, this behavior is indistinguishable from errors. I will open a separate issue on the topic of HID error replies.

But I also see an issue in the hub error handling here. Missing/lost hub reports are potentially critical. A missed unplug event is one example. Such errors could break the entire application, break file transfers, etc. I'd suggest reacting to hub report failures as if the entire hub were lost, and treating all sub-devices as unplugged. Also, the number of retries could be increased for hubs, as their correct operation is rather critical.
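To illustrate the "treat all sub-devices as unplugged" idea, here is a minimal sketch over a made-up device table. The struct layout and function name are assumptions for illustration; TinyUSB's internal device bookkeeping looks different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical bookkeeping entry, not a TinyUSB type.
typedef struct {
  uint8_t addr;      // device address, 0 = slot unused
  uint8_t hub_addr;  // address of the parent hub, 0 = attached to root port
  bool connected;
} dev_entry_t;

// Marks every device below hub_addr as disconnected, descending into
// nested hubs. Returns the number of devices that were marked.
size_t mark_hub_children_removed(dev_entry_t *table, size_t count,
                                 uint8_t hub_addr) {
  assert(table != NULL);
  size_t removed = 0;
  for (size_t i = 0; i < count; i++) {
    dev_entry_t *d = &table[i];
    if (d->addr != 0 && d->connected && d->hub_addr == hub_addr) {
      d->connected = false;  // clear first so the recursion cannot loop
      removed++;
      removed += mark_hub_children_removed(table, count, d->addr);
    }
  }
  return removed;
}
```

A stack could call something like this once the retries on a hub's status endpoint are exhausted, then run its usual per-device unplug handling for each entry that was marked.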

harbaum avatar Oct 15 '25 14:10 harbaum