Fix: Properly Handle NAK Response in MAX3421E driver
Description
This pull request addresses an issue in the MAX3421 driver where NAK responses from the device were not handled correctly. The MAX3421 has 2x SNDFIFO of 64 bytes. If an OUT transfer receives a NAK response, the current implementation fills the FIFO again and writes to the SNDBC register, which switches to the other FIFO area and causes the earlier FIFO space to become stale. After writing to the HXFR register on the second try, the SIE will send the FIFO filled in the first attempt. From this point, the SIE will always transfer the data in the opposite FIFO space. When the controller receives a second NAK response for another OUT packet, both FIFO spaces will become full, and the SNDBAVIRQ IRQ will never trigger, causing the controller to become completely stuck.
The following changes were made:
-
Endpoint Structure Update:
- Added a
received_nakboolean field to themax3421_ep_tstructure to keep track of whether a NAK was received in the previous attempt.
- Added a
-
Transmission Logic Update:
- Updated the
xact_outfunction to skip re-writing to the FIFO and SNDBC register if a NAK was received previously, thus preventing unnecessary operations and re-transmissions.
- Updated the
-
NAK Handling:
- Modified the
handle_xfer_donefunction to set thereceived_nakflag when a NAK is encountered and clear it on successful transmissions. - Adjusted the retry logic to respect the
received_nakflag, ensuring proper handling of NAKs and avoiding redundant operations.
- Modified the
These changes ensure that NAK responses are properly managed, improving the reliability and efficiency of data transfers.
Testing
- Can be tested by sending a long text (depending on the device, 64-1024 bytes) to a USB CDC serial device.
- Tested with various USB devices to ensure correct handling of NAK responses.
- Verified that data transfers complete successfully without unnecessary re-transmissions.
- Ensured no regressions in existing functionality.
thank you very much for bringing up and fixing the issue. I have done some testing and it does address the case for spamming CDC OUT. However, It also introduce another If my though right. Let say we have a device with 2 bulk out e.g cdc + msc, when scheduling 2 xfer
- CDC got NAK --> flag sset
- driver switch to MSC Out, make a couple of transfer
- back to CDC bulk out, at this point: flag is set, but data is not available (overwritten by MSC out).
I am making some adjustment, hopefully we could fix this for good.
PS: There is also case where there are 3 Out endpoints are queued, the first 2 are NAKed, then hcd move to the 3rd and both fifo is full. In this case we need to flush the fifo, however the max3421e say nothing about flushing SNDFIO, maybe we just write to SNDBC and FIFO igoring SNDBAVIRQ, which hopefully just overwrite fifo ?
It was a pleasure for me to work on this, I was in the process of developing a usb dongle acting as a serial to ble bridge for connecting a wireless pendant to the machine and your library came quite handy.
During my testing with chips on both revision 2 (0x13) and 3 (0x13) I wasn't able to reset the send FIFO. Also the programming guide does not mention anything regarding that's possible. This issue has been discussed in their support forums and some other places. Unfortunately it does not allow you to overwrite it before sending the buffer.
(Misdesign 5 on the following link for example.) https://electronics.stackexchange.com/questions/372236/bugs-in-maxim-chip-max3421e-used-as-usb-host
thank for more information. Regarding flushing fifo, I tried and I think we can just ignore the SNDBAVIRQ to overwrite sndfifo. I have update PR to add sndfifo ownver info which is invalidate on
- a switch to another OUT endpoint
- a successful OUT transfer.
I figured out that we only care about the owner of data in the fifo, we don't really need to pay attention when it is NAKed, or switching and having an successful IN transfer at all. Though when switching between multple OUT endpoint with NAK, we unforutnately need to rewrite FIFO, but it is indeed a correct way to do since host always need to do round-robin management, it cannot repeately retry NAK on an endpoint until done (which has no ETA).
Please try again to see if this work on your setup.
sorry for the late response, I was busy with some other stuff. your version seems to be working on my setup as well. thanks
No problem, we are all busy. Thank you for confirmation