flipperzero-firmware icon indicating copy to clipboard operation
flipperzero-firmware copied to clipboard

USB: Fix file writing hang regression for big files

Open WillyJL opened this issue 1 year ago • 2 comments

What's new

  • Fixes #3452 as it is still an issue
  • TLDR:
    • #3358 fixed crashes when using cat on flipper serial
    • it also included some unrelated USB CDC parameter changes
    • from what I've tested, the crash for cat on serial is fixed even when not including the USB CDC changes
    • this additional unrelated change was only denoted as USB CDC: switch from 2x unidirectional data endpoints to 1x bidirectional
    • from what I can see there are no additional details, no bugs this is attempting to fix, and no improvements this brings
    • there are however some very real issues with this tiny USB CDC changes
    • during USB file writing (like qFlipper or Web update) there is a random chance of USB lockup
    • when this happens, Flipper USB will be fully unresponsive until rebooted
    • qFlipper will say timeout after some seconds, Web will just remain stuck forever
    • the bigger the file being written over USB, the higher chance of lockup
    • it is most prominent when installing some big CFW (6MB+ resources.tar), but I have seen multiple users struggling to update OFW due to this bug
    • in other cases I told people to downgrade to older OFW to avoid this bug, and they could not even downgrade due to having newer OFW currently installed with this bug
    • CFWs have reverted this changes in same way as this PR and have had no issues since. going from CFW to new CFW, or from CFW to OFW
    • but since OFW after 0.99 has this bug, if user has OFW installed they can have issues updating to newer OFW or installing CFW

Verification

  • Install firmware with big resources.tar multiple times and no issues

Checklist (For Reviewer)

  • [ ] PR has description of feature/bug or link to Confluence/Jira task
  • [ ] Description contains actions to verify feature/bugfix
  • [ ] I've built this code, uploaded it to the device and verified feature/bugfix

WillyJL avatar May 01 '24 04:05 WillyJL

https://github.com/flipperdevices/flipperzero-firmware/pull/3358 actually fixes two problems:

  • dual endpoint configuration causes linux to send garbage to the device
  • line buffer overflow in cli caused by garbage sent to the device

So what you doing here is breaking first one. I'd rather see proper fix that is finding root cause of single endpoint cdc crash.

skotopes avatar May 01 '24 04:05 skotopes

I see, thanks for clearing that up. It's the crucial piece of information that wasn't specified anywhere (atleast as far as I could see).

I'll try to find out more but it is really difficult to debug, there are no weird logs or crashes usually, just stops working. It almost feels like random memory corruption.

WillyJL avatar May 01 '24 04:05 WillyJL

Correct fix that properly covers all cases in #3705

skotopes avatar Jun 12 '24 15:06 skotopes