tinyusb icon indicating copy to clipboard operation
tinyusb copied to clipboard

`cdcd_xfer_cb` callback uses `p_cdc->epout_buf` after endpoint was released

Open deymo opened this issue 3 years ago • 1 comments

This is a concurrency issue when running in a RTOS environment. When receiving data on a device CDC, in short the issue is that both cdcd_xfer_cb() and tud_cdc_n_read() can call _prep_out_transaction(), but the cdcd_xfer_cb() uses the buffer that _prep_out_transaction() repurposed for the next transfer after the endpoint was released, potentially overriding old data in that buffer with new one before it was consumed and added to the fifo.

The _prep_out_transaction() function starts a new OUT transfer if both there's enough space in the rx_ff fifo to hold a whole OUT xfer and it can successfully claim the ep_out endpoint. Claiming an endpoint is done by checking the value of claimed and busy in usbd_edpt_claim(), however the endpoint is released automatically by the DCD_EVENT_XFER_COMPLETE event in tud_task before calling the driver xfer_cb callback, so at any point while cdcd_xfer_cb is handling the transfer that just finished the endpoint is free (not busy nor claimed) but the callback still needs to copy the data out of the transfer buffer (epout_buf).

Consider the following scenario:

  • Two FreeRTOS tasks: one just running tud_task() in a loop, one reading from the only CDC with tud_cdc_read and a large buffer (512 bytes). This is like the cdc_msc_freertos example but with both tasks at the same priority, this difference is important.
  • Host sends large amount of data all together (>2KB, assuming high-speed, 512 OUT transfers at a time).
  • The first OUT transfer in the cdc endpoint is completed, cdcd_xfer_cb is called from the tud_task task, but before it calls tu_fifo_write_n() to add the bytes to the fifo the task is interrupted and the other task runs.
  • The "cdc" reader task calls tud_cdc_n_read() with a large enough buffer which will empty the fifo and start a new transfer in _prep_out_transaction() because both conditions are met (fifo is empty and endp is not claimed).
  • USB hardware ISR or DMA might at any point interrupt and write the new OUT transfer data to epout_buf.
  • the usb task continues running, calls tu_fifo_write_n() with epout_buf which now may contain data from a newer OUT transfer.

In the example mentioned this situation doesn't occur because the task running tud_task has the highest priority of all the tasks calling TU API, I don't know if this condition is required by tinyusb but I didn't see it documented or perhaps I missed that.

This bug was a lot easier to reproduce before commit 7fc99a9e1104703566945ace1b2e1ed59819ba48 since we were calling tu_fifo_write() n times, instead of tu_fifo_write_n one time, so there was a lot more time for things to go wrong, but the race condition still exists. Perhaps relevant for context is to read #507 and #508, however this bug is about the OUT / RX side.

My proposed fix would be to change the xfer_cb() driver api so that the endpoint is not released before calling xfer_cb() and it is responsibility of this function to release the endpoint after it is done using it. This might require changes in all drivers. Another option is to just fix it for cdc (and others that have a similar arrangement like midi) by adding another boolean to the cdcd_interface_t to prevent _prep_out_transaction from starting a new transfer until cdcd_xfer_cb() is done with the buffer. Other ideas?

deymo avatar Jan 20 '22 13:01 deymo

Thank you for your issue, would you mind making an pr to demonstrate the changes and since you are already at it. I could revise this, but I currently have a huge issue with (lack of) time.

hathach avatar Mar 06 '22 04:03 hathach