Chrysalis firmware update fails + causes confusion when udev rules are not installed

Open parke opened this issue 2 years ago • 6 comments

If the udev rules are not installed, and you use Chrysalis to (try to) update the firmware on a Keyboardio Atreus, Chrysalis will (incorrectly??) report that there was an error flashing the firmware.

To reproduce:

  1. Boot into an Ubuntu 20.04 installation live session.
  2. Plug in the Atreus keyboard.
  3. Run sudo chmod 666 /dev/ttyACM0
  4. Download Chrysalis version 0.8.4.
  5. Run Chrysalis.
  6. Click the "CONNECT" button to connect to /dev/ttyACM0. (Chrysalis will not ask about udev rules because we already ran chmod.)
  7. Open the hamburger menu and select "Firmware Update".
  8. Hold down the lower left key (ESC by default) and click "Update".
  9. Steps 1 & 2 & 3 of the firmware update will appear to complete successfully.
  10. Chrysalis will then say at the top: "Your computer won't let Chrysalis talk to your keyboard. [snip]", and, in the lower left-hand corner: "Error flashing the firmware", next to the TROUBLESHOOTING and DISMISS buttons.

My goal is to be able to use Chrysalis to update the Atreus firmware without error messages, and without installing udev rules.

parke avatar Aug 02 '21 01:08 parke

> My goal is to be able to use Chrysalis to update the Atreus firmware without error messages, and without installing udev rules.

I'm quite certain that the only way to accomplish that is for the user running Chrysalis to be in the group that owns the ttyACM* devices by default (usually dialout or similar). The reason is that the flashing process involves rebooting the keyboard into bootloader/programmable mode, which (re-)creates the ttyACM device, so the chmod 666 you did earlier is no longer in effect.

The flashing process basically boils down to rebooting the keyboard, which will stay in bootloader/programmable mode if the lower left key is held during boot. Once rebooted, we do the real flashing, and at the end of that, reboot the keyboard again. At that point, with the lower left key not held anymore, it boots into the freshly flashed firmware.

As such, for the flashing to work, Chrysalis needs to be able to access the device after rebooting it. It will be a new device file, not the same as you initially connected to. Similarly, after flashing, it will be a third device file. It may - and often does - share the name, but it gets re-created twice during flashing due to the device reboots.
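A minimal sketch of why the earlier chmod does not survive the reboot, assuming an ordinary file in a temporary directory as a stand-in for the device node (the name `ttyACM0` and the helper `recreate_and_compare` are illustrative only, and no real device is touched). The permission bits belong to the node, not to its name, so a re-created node starts from its defaults again:

```python
import os
import stat
import tempfile


def recreate_and_compare(directory):
    """Return the permission bits of a stand-in node before and after re-creation."""
    # Hypothetical stand-in: an ordinary file named like the device node.
    path = os.path.join(directory, "ttyACM0")

    def create_node():
        # Mimics the node appearing with restrictive default permissions.
        os.close(os.open(path, os.O_CREAT | os.O_WRONLY, 0o600))

    create_node()
    os.chmod(path, 0o666)  # the manual workaround from the report
    before = stat.S_IMODE(os.stat(path).st_mode)

    os.remove(path)  # the keyboard reboots and the node disappears
    create_node()    # a new node appears under the same name
    after = stat.S_IMODE(os.stat(path).st_mode)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = recreate_and_compare(tmp)
        print(f"after chmod: {before:o}, after re-creation: {after:o}")
```

On a real system the defaults for the new node come from the kernel and udev rather than from `os.open`, which is why a udev rule or group membership persists across the reboot while a one-off chmod does not.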

The problem in this case is that the flashing appears to succeed, which it likely did not. Can you go into System Information, click the Create Bundle button, and share the created zip file? It will contain the debug logs, which helps us figure out whether the flash succeeded or not, and if it did not, why Chrysalis would make it appear it did, even temporarily. (It does say it failed flashing in the lower left hand corner, so that part appears to work as expected)

algernon avatar Aug 02 '21 09:08 algernon

Thank you for the detailed explanation of the firmware update process.

IMO: If installing the udev rules is optional (and I definitely prefer it to be optional), then Chrysalis should perform gracefully when the udev rules are not installed. This would include some ability to upgrade the firmware (albeit perhaps with user assistance). Or (less preferably) Chrysalis should refuse to even attempt to upgrade the firmware if the udev rules are not installed.

Fyi, I ran the following (as root) and was able to successfully flash my Atreus.

while true ; do sleep 0.1 ; chmod 666 /dev/ttyACM0 ; done

If you want to improve the current firmware update process and interface, then I believe it might be worth considering a select subset of the following changes:

  1. Detect whether or not the udev rules are installed (or appear to be installed) before rebooting the keyboard.

  2. Insert an additional test reboot at the beginning solely to verify that Chrysalis will have write access to the keyboard after it reboots.

  3. If the keyboard device is not writable after the reboot, prompt the user to manually chmod the device. (How long does the keyboard stay in bootloader mode?)

  4. Change the "Waiting for bootloader" progress step to wait until the device exists AND is writable.

  5. If the device is not writable after the reboot, print a single but more detailed error message, rather than the two disjointed error messages that are printed as of version 0.8.4. Preferably, this error message would be printed on the Firmware Update page. The current behavior, where the entire Firmware Update page disappears, and Chrysalis itself goes back to the "Select a Keyboard" page, is confusing. IMO, the Firmware Update page should remain displayed until the user closes the error notification. So, for example: "Firmware update failed. Please click continue to attempt to reconnect to your keyboard."

If you would like me to elaborate on any of the above options, please let me know.

FYI, I have also discovered the following: If, after the reboot, the keyboard device is not writable, then the keyboard is also non-functional (i.e., I cannot type on it, until I unplug and replug it). Presumably, this is because the keyboard is stuck in bootloader mode. I did not realize this previously as I was working on a system with multiple keyboards, and the Atreus was not the primary keyboard I was using. However, on my current system, the Atreus is the only keyboard attached.

parke avatar Aug 02 '21 16:08 parke

> IMO: If installing the udev rules is optional (and I definitely prefer it to be optional), then Chrysalis should perform gracefully when the udev rules are not installed.

It most definitely is optional, as there are a number of other ways to give Chrysalis access to the device (such as running as root [not recommended, obviously], having the running user in the same group as the device file if it's group-writable, or doing something like you did).

> This would include some ability to upgrade the firmware (albeit perhaps with user assistance).

This would be possible, yes, but it does make the code a lot more complicated, for a fairly rare use case.

> Or (less preferably) Chrysalis should refuse to even attempt to upgrade the firmware if the udev rules are not installed.

If anything, I'd go down this route. Sadly, it's not possible to detect the udev rules in advance with 100% reliability.

>   1. Detect whether or not the udev rules are installed (or appear to be installed) before rebooting the keyboard.

This is not possible without rebooting the keyboard. We can detect if our rules are installed, but if someone has rules that are similar but slightly different in shape, we can't. There are a whole lot of ways to give Chrysalis access to the device with udev, and we can't reliably detect all of them. Not having reliable detection pretty much makes the exercise pointless, as we'd end up adding corner case after corner case.

>   2. Insert an additional test reboot at the beginning solely to verify that Chrysalis will have write access to the keyboard after it reboots.

We might be able to do this, but it's complicated. We can't rely on the device coming back up with the same name, for one. And if it doesn't, it's hard to detect if it's the same keyboard, and not just another keyboard of the same type (I have two Model01s plugged in most times: one I type with, another that I work with), so we'll need a way to make sure that whatever comes back up is the one we rebooted. We can put a nonce in EEPROM, and ask for it after reboot. But the keyboard is unusable until then, and the process can take a good couple of seconds. It also complicates the update instructions about when to hold a particular key.

And on top of it all, the flashing process is already on the fragile and complicated side, and this would make it considerably more so.
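To make the nonce idea concrete, here is a rough sketch under stated assumptions. `FakeKeyboard`, `write_nonce`, and `read_nonce` are hypothetical stand-ins and not Chrysalis or Kaleidoscope APIs, and the 4-byte nonce length is an arbitrary choice for illustration. A random value is stored before the reboot, and afterwards every candidate device is asked for it:

```python
import secrets


class FakeKeyboard:
    """Hypothetical stand-in for a keyboard with persistent storage."""

    def __init__(self, name):
        self.name = name
        self._eeprom = {}

    def write_nonce(self, nonce):
        self._eeprom["nonce"] = nonce

    def read_nonce(self):
        return self._eeprom.get("nonce")


def tag_before_reboot(keyboard, nbytes=4):
    """Store a fresh random nonce on the keyboard and return it."""
    nonce = secrets.token_bytes(nbytes)
    keyboard.write_nonce(nonce)
    return nonce


def find_rebooted(candidates, nonce):
    """Return the candidate that reports the expected nonce, or None."""
    for keyboard in candidates:
        if keyboard.read_nonce() == nonce:
            return keyboard
    return None


if __name__ == "__main__":
    typing_board = FakeKeyboard("Model01 (typing)")
    test_board = FakeKeyboard("Model01 (testing)")
    nonce = tag_before_reboot(test_board)
    match = find_rebooted([typing_board, test_board], nonce)
    print(match.name if match else "no match")
```

The sketch leaves out the hard parts named above: the real keyboard has to be up and answering before it can be queried, and a device sitting in the bootloader would need bootloader support to report anything at all.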

>   3. If the keyboard device is not writable after the reboot, prompt the user to manually chmod the device. (How long does the keyboard stay in bootloader mode?)

Now this is something we can detect during the process, and stop and pause there. That's a whole lot simpler than the pre-detection steps above, and would still give the user time to manually chmod. As far as I can tell (see here), the device stays in bootloader mode for about 30 seconds. That's... quite short for an unprepared user to do anything. Even to read a "Can I sudo chmod the device for you?" dialog, let alone click it and type their password in.

What we can do, and what is more user-friendly, is to detect this case, abort the flashing immediately, and report it, with ample explanation of how to fix it: by installing the udev rule or similar, running Chrysalis as a user in a particular group (if we detect that'd help us), or running chmod in a loop. Probably pointing to a wiki page or documentation after a brief summary would be best.

Thus, instead of "flashing failed", and little information about why, you'd get a much better explanation, and possible ways to fix it. We wouldn't do any of it automatically, but we'd present a friendlier user experience for this particular failure case.
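As a rough illustration of that kind of report, the sketch below builds a list of suggestions from the state of the device node. The function names and the message wording are assumptions made for this example, not what Chrysalis prints. The group hint is offered only when the node is group-readable and group-writable, which corresponds to the "if we detect that'd help us" condition:

```python
import grp
import os
import stat


def suggest_fixes(path, writable, group_rw, group):
    """Return human-readable remedies; an empty list means nothing is wrong."""
    if writable:
        return []
    fixes = ["Install udev rules that grant your user access to the keyboard."]
    if group_rw:
        # Joining the owning group only helps if the group may read and write.
        fixes.append(f"Add your user to the '{group}' group, then log in again.")
    fixes.append(f"As root, re-run 'chmod 666 {path}' in a loop while flashing.")
    return fixes


def inspect_device(path):
    """Collect the facts suggest_fixes() needs from an existing device node."""
    st = os.stat(path)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # the group has no name on this system
    group_rw = bool(st.st_mode & stat.S_IRGRP) and bool(st.st_mode & stat.S_IWGRP)
    writable = os.access(path, os.R_OK | os.W_OK)
    return suggest_fixes(path, writable, group_rw, group)


if __name__ == "__main__":
    for line in suggest_fixes("/dev/ttyACM0", False, True, "dialout"):
        print("-", line)
```

Keeping the decision logic in `suggest_fixes` separate from the filesystem lookup makes it possible to exercise the messages without a keyboard attached.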

>   4. Change the "Waiting for bootloader" progress step to wait until the device exists AND is writable.

We can do this, yup.
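A minimal sketch of such a wait, assuming simple polling. The helper name, the 0.1 second interval, and the default timeout are illustrative choices (the timeout is picked to match the roughly 30 seconds the bootloader is said to wait), not Chrysalis behaviour:

```python
import os
import time


def wait_until_writable(path, timeout=30.0, interval=0.1):
    """Poll until `path` exists and is readable and writable, or time runs out.

    Returns True as soon as the device is usable and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path) and os.access(path, os.R_OK | os.W_OK):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical device path, as used in the report above.
    if wait_until_writable("/dev/ttyACM0", timeout=5.0):
        print("device is ready")
    else:
        print("device missing or not writable")
```

A False result is the point at which flashing could be aborted with a detailed message instead of a generic failure.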

>   5. If the device is not writable after the reboot, print a single but more detailed error message, rather than the two disjointed error messages that are printed as of version 0.8.4. Preferably, this error message would be printed on the Firmware Update page.

I see we pretty much ended up with similar ideas in the end! :D

> The current behavior, where the entire Firmware Update page disappears, and Chrysalis itself goes back to the "Select a Keyboard" page, is confusing. IMO, the Firmware Update page should remain displayed until the user closes the error notification. So, for example: "Firmware update failed. Please click continue to attempt to reconnect to your keyboard."

That's something we can't do with the current Chrysalis architecture. It is confusing, yes. But for some pages to work, the keyboard needs to be connected to, and we can't auto-reconnect after the reboot, not without being able to detect it's the same keyboard. (See the nonce idea above, for the reasons why, and how we might overcome that, eventually.)

We can take the user to a "Firmware update failed" screen, with a more detailed explanation of why, instead of showing the select screen, though. Still a different page than the one they started from, but more appropriate than the select screen + error toast.

> FYI, I have also discovered the following: If, after the reboot, the keyboard device is not writable, then the keyboard is also non-functional (i.e., I cannot type on it, until I unplug and replug it). Presumably, this is because the keyboard is stuck in bootloader mode. I did not realize this previously as I was working on a system with multiple keyboards, and the Atreus was not the primary keyboard I was using. However, on my current system, the Atreus is the only keyboard attached.

The keyboard should fall back to booting the old firmware after about 30 seconds. But yes, it will be stuck in bootloader mode until then, and will not be usable. Since the device is not writable, we can't ask the bootloader to boot the firmware (not sure if we could, even if it were writable), nor can we detach the USB device to force a reset (also unsure if that'd have the desired effect).

algernon avatar Aug 02 '21 17:08 algernon

>   2. Insert an additional test reboot at the beginning solely to verify that Chrysalis will have write access to the keyboard after it reboots.

> We might be able to do this, but it's complicated. We can't rely on the device coming back up with the same name, for one. And if it doesn't, it's hard to detect if it's the same keyboard, and not just another keyboard of the same type (I have two Model01s plugged in most times: one I type with, another that I work with), so we'll need a way to make sure that whatever comes back up is the one we rebooted.

If you have:

  • N keyboards attached, and then
  • the user starts a firmware upgrade, and then
  • keyboard A disappears, and then
  • keyboard B appears (with a different name),

then it seems to me it is safe to proceed under the assumption that B is the rebooted A.

If B is not A, then B will not be in bootloader mode, so no damage should be done.

There may be some cases where a user would plug in a new keyboard during the middle of doing a firmware upgrade, but such cases are indeed fringe, IMO.
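The heuristic described above can be sketched as a comparison of two snapshots of device names. This is only an illustration of the proposal, with a made-up helper name and made-up paths. It deliberately refuses to guess whenever the situation is not exactly "one device vanished, one new device appeared":

```python
def guess_rebooted(before, after, target):
    """Guess which device in `after` is `target` after its reboot.

    `before` and `after` are collections of device names taken before and
    after the reboot. Returns the single newly appeared name, or None when
    the names alone do not settle the question.
    """
    before, after = set(before), set(after)
    if target not in before - after:
        # The target never disappeared (or came back under the same name),
        # so the name comparison cannot tell us anything.
        return None
    appeared = after - before
    if len(appeared) != 1:
        # Nothing came back, or several devices appeared at once.
        return None
    return next(iter(appeared))


if __name__ == "__main__":
    before = ["/dev/ttyACM0", "/dev/ttyACM1"]
    after = ["/dev/ttyACM1", "/dev/ttyACM2"]
    print(guess_rebooted(before, after, "/dev/ttyACM0"))
```

Because the guess rests on names alone, it says nothing about whether the new device is actually in bootloader mode. That still has to be checked separately before anything is written to it.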

> We can put a nonce in EEPROM, and ask for it after reboot. But the keyboard is unusable 'till then, and the process can take a good couple of seconds. It also complicates the update instructions about when to hold a particular key.

IMO, even if the udev rules are installed, the instructions about when, and for how long, to hold the key could be improved. For example, at present: I have no idea how long I need to hold the key down for. Until the keyboard starts shutting down? Until some point in the keyboard coming back up? Until the flash is complete? Let me know if you want me to open a GitHub issue about this ambiguity.

> That's something we can't do with the current Chrysalis architecture. It is confusing, yes. But for some pages to work, the keyboard needs to be connected to, and we can't auto-reconnect after the reboot, not without being able to detect it's the same keyboard. (See the nonce idea above, for the reasons why, and how we might overcome that, eventually.)

I think it would be nice if a keyboard could report its serial number. (Or, failing a serial number, a randomly generated UUID.) And also if a keyboard could report the version of the firmware it is currently running. (I think I asked about reporting firmware versions via email a while back and was told that the firmware does not know/store its version number.)

parke avatar Aug 02 '21 21:08 parke

> If B is not A, then B will not be in bootloader mode, so no damage should be done.

But if A was in bootloader mode to begin with, we can't figure out whether A or B is the one we want, unless we keep track of all devices, at all times, and that's just a huge fragile pain in the backside to say the least (been there, tried it, it was very quickly reverted). The only reliable way I'd be comfortable with is if the keyboard assisted, and would be able to tell Chrysalis that it's the one we want. That needs a bit of cooperation from the bootloader too, and is pretty much out of scope for the Atreus and the Model01.

> For example, at present: I have no idea how long I need to hold the key down for. Until the keyboard starts shutting down? Until some point in the keyboard coming back up? Until the flash is complete?

The current description is a compromise between compactness and exact correctness. Technically, you need to hold the key down until it reboots into bootloader mode. For some keyboards, that's fairly easy to notice: the Model01 will start lighting up LEDs in a red pattern when it starts to flash, which is a safe time to release the key, and the instructions are, thus, worded accordingly. There's no such thing on the Atreus: there's no easy way to notice the keyboard rebooting apart from trying to type on it or watching USB events. There's no indication on the keyboard itself, and the OS usually doesn't make it very obvious either. The closest thing we could say in the description is "hold it until flashing starts".

On the other hand, you did give me an idea: we can add a message to the flashing screen once the reboot is done, and we start flashing! We can then display a message telling the end-user that they can release the key now.

> I think it would be nice if a keyboard could report its serial number.

Some kind of unique id would be nice, yes. It's a tough thing, though. For a number of reasons, it is not practical to burn a serial id into every keyboard as they come out of the factory. So the best we could do is assign an id and store it in EEPROM on first use with Chrysalis. That takes ~4 bytes of EEPROM space though, and EEPROM space is precious when we only have 1k of it.
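For a sense of scale, here is a small sketch of what a 4-byte id would look like and how much of a 1 KiB EEPROM it would take. The random id and the little-endian layout are assumptions for illustration, since the thread does not say how such an id would be generated or stored:

```python
import secrets
import struct

EEPROM_SIZE = 1024  # bytes; the "1k" mentioned above


def encode_id(value):
    """Pack an unsigned 32-bit id into 4 bytes (little-endian, by assumption)."""
    return struct.pack("<I", value)


def decode_id(raw):
    """Inverse of encode_id()."""
    return struct.unpack("<I", raw)[0]


if __name__ == "__main__":
    keyboard_id = secrets.randbits(32)  # assigned on first use, per the idea above
    raw = encode_id(keyboard_id)
    assert decode_id(raw) == keyboard_id
    share = len(raw) / EEPROM_SIZE
    print(f"{len(raw)} bytes is {share:.2%} of the EEPROM")
```

Four bytes out of 1024 is about 0.39% of the total, which sounds small until it competes with everything else that has to be stored there.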

We could do tricks like generating a UUID for every keyboard Chrysalis sees, and when we flash firmware shipped with Chrysalis, modify it to include the UUID in progmem... but that's a tad awkward and fragile, all for little gain, since we'd ideally want the bootloader to be able to report it too, which needs bootloader cooperation, and we don't have enough space in the bootloader to add this, not for the Atreus and the Model01.

> And also if a keyboard could report the version of the firmware it is currently running.

...that's also a bit more complicated than one might think, as the firmware is built of a number of components: Arduino, Kaleidoscope, and the firmware sketch itself, to name a few. We can assign a version number to the firmware built for a particular Chrysalis release, and can report it, and can do the same for the factory firmware too, as for all of those, we can track the Arduino, Kaleidoscope and Sketch versions and tie them to the overall firmware version. For custom firmware, it's a whole different story.

I can reasonably easily make Chrysalis display the firmware version (if available), but I think it's less useful than one might think.
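One way to picture the "overall firmware version" described above is a lookup table that ties a single release label to the versions of its components. Everything in this sketch is hypothetical: the label, the placeholder component values, and the names `FirmwareBuild` and `lookup` are invented for illustration and do not describe real releases:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FirmwareBuild:
    """Component versions that together make up one firmware build."""
    arduino: str
    kaleidoscope: str
    sketch: str


# Placeholder data only; real entries would be recorded at release time.
KNOWN_BUILDS: Dict[str, FirmwareBuild] = {
    "example-release": FirmwareBuild(
        arduino="arduino-placeholder",
        kaleidoscope="kaleidoscope-placeholder",
        sketch="sketch-placeholder",
    ),
}


def lookup(reported: Optional[str]) -> Optional[FirmwareBuild]:
    """Return the known build for a reported label, or None.

    None covers both firmware that reports nothing and custom firmware
    whose label is not in the table.
    """
    if reported is None:
        return None
    return KNOWN_BUILDS.get(reported)


if __name__ == "__main__":
    print(lookup("example-release"))
    print(lookup("my-custom-build"))
```

The None result is the "whole different story" case: for custom firmware there is no table entry to fall back on.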

algernon avatar Aug 02 '21 22:08 algernon

> > If B is not A, then B will not be in bootloader mode, so no damage should be done.

> But if A was in bootloader mode to begin with, we can't figure out whether A or B is the one we want, unless we keep track of all devices, at all times, and that's just a huge fragile pain in the backside to say the least (been there, tried it, it was very quickly reverted). The only reliable way I'd be comfortable with is if the keyboard assisted, and would be able to tell Chrysalis that it's the one we want. That needs a bit of cooperation from the bootloader too, and is pretty much out of scope for the Atreus and the Model01.

You are correct that I consider the case where multiple keyboards are in bootloader mode at the same time to be out of scope. But you would know better than I whether that is a case that merits consideration.

> [Regarding when to release ESC:] On the other hand, you did give me an idea: we can add a message to the flashing screen once the reboot is done, and we start flashing! We can then display a message telling the end-user that they can release the key now.

Indeed.

> I can reasonably easily make Chrysalis display the firmware version (if available), but I think it's less useful than one might think.

Reasons I would want to see the firmware version:

  • To verify that a firmware upgrade succeeded.
  • To know whether or not I need to perform a firmware upgrade.
  • To be able to report the firmware version if I am submitting bug reports about unexpected/errant keyboard behavior.

I know there are multiple different components in the firmware. But from my point of view, the firmware version is the same as the Chrysalis version, as the Chrysalis AppImage files are my only source for firmware upgrades.

parke avatar Aug 02 '21 22:08 parke

Chrysalis now runs in-browser using WebSerial and WebUSB, and this functionality has been rewritten, so I'm closing out this issue as obsolete. Please don't hesitate to open a new issue if https://chrysalis.keyboard.io exhibits the same behavior.

obra avatar Feb 26 '24 21:02 obra