Updating certificates fails

Open sascha-andres opened this issue 1 year ago • 35 comments

Actual Behavior

Rancher not starting

Steps to Reproduce

Start Rancher Desktop

Result

2022-08-19T07:27:00.315Z: Registered distributions: Ubuntu-20.04,rancher-desktop-data,rancher-desktop
2022-08-19T07:27:00.570Z: Registered distributions: Ubuntu-20.04,rancher-desktop-data,rancher-desktop
2022-08-19T07:27:01.006Z: Registered distributions: Ubuntu-20.04,rancher-desktop-data,rancher-desktop
2022-08-19T07:27:01.199Z: Registered distributions: Ubuntu-20.04,rancher-desktop-data,rancher-desktop
2022-08-19T07:27:01.199Z: data distro already registered
2022-08-19T07:27:16.466Z: Installing C:\Users\sascha.andres\AppData\Local\Programs\Rancher Desktop\resources\resources\linux\internal\trivy as /mnt/c/Users/sascha.andres/AppData/Local/Programs/Rancher Desktop/resources/resources/linux/internal/trivy into /usr/local/bin/trivy ...
2022-08-19T07:27:16.486Z: Installing C:\Users\sascha.andres\AppData\Local\Programs\Rancher Desktop\resources\resources\linux\internal\rancher-desktop-guestagent as /mnt/c/Users/sascha.andres/AppData/Local/Programs/Rancher Desktop/resources/resources/linux/internal/rancher-desktop-guestagent into /usr/local/bin//rancher-desktop-guestagent ...
2022-08-19T07:27:17.555Z: WSL: executing: /usr/sbin/update-ca-certificates: Error: wsl.exe exited with code 1

Expected Behavior

Rancher Desktop starting and usable

Additional Information

No response

Rancher Desktop Version

1.5.1

Rancher Desktop K8s Version

unknown

Which container engine are you using?

moby (docker cli)

What operating system are you using?

Windows

Operating System / Build Version

Windows 11

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

No response

Windows User Only

No response

sascha-andres avatar Aug 19 '22 07:08 sascha-andres

I found kind of a workaround: a complete factory reset. Not happy with that though

sascha-andres avatar Aug 22 '22 09:08 sascha-andres

Thanks for filing an issue! Is there any chance you have more details about the failure? It seems that updating CA certificates inside a WSL distro failed, but it is tough to say what the problem is without more info. If you can get the error again, could you please do the following:

  • enable debug mode in the Troubleshooting tab
  • get RD to produce the error
  • upload your logs here (you can get to the folder with the logs via a button in the Troubleshooting tab)

Thanks!

adamkpickering avatar Aug 22 '22 16:08 adamkpickering

Will do so tomorrow (I'm back in the office then)

sascha-andres avatar Aug 22 '22 17:08 sascha-andres

Actually the steps resulted in a completely unusable state:

image

After pressing OK, the app closes. After I stopped WSL completely, I could remove the file and start it again.

Logs attached

logs.zip

sascha-andres avatar Aug 23 '22 11:08 sascha-andres

That's fishy. Were you running RD as an administrator by any chance? It must be run as a regular user otherwise weird stuff starts to happen. We have #1560 in progress for this.

adamkpickering avatar Aug 23 '22 22:08 adamkpickering

@adamkpickering sorry for the late reply, was sick. No, I was not using RD as an administrator. We have no administrative rights here

sascha-andres avatar Aug 25 '22 05:08 sascha-andres

So WSL is installed for you? That EPERM exception makes me wonder if IT has your system super locked down. Though I'm not very knowledgeable about Windows... @mook-as what is your take on this?

adamkpickering avatar Aug 26 '22 16:08 adamkpickering

The EPERM can occur if you somehow managed to start Rancher Desktop twice (because the previous instance has the file open, the new instance can't delete it).

It's unclear why running update-ca-certificates is failing, though; wsl-exec.log shows:

run-parts: /etc/ca-certificates/update.d/certhash: exit status 132

But it's unclear why that's happening. Would you be able to (once the failure has occurred) manually run update-ca-certificates in the rancher-desktop WSL distribution, and dig into the errors there?

mook-as avatar Aug 26 '22 17:08 mook-as

I wanted to add that I am having the same issue as well:

Actual Behavior

Rancher not starting after reboot

Steps to Reproduce

Start Rancher Desktop. Reboot the host machine while rancher is on. Try to start it again.

Result

2022-10-06T15:10:26.652Z: Registered distributions: rancher-desktop,rancher-desktop-data
2022-10-06T15:10:27.441Z: Registered distributions: rancher-desktop,rancher-desktop-data
2022-10-06T15:10:34.509Z: Registered distributions: rancher-desktop,rancher-desktop-data
2022-10-06T15:10:35.128Z: Registered distributions: rancher-desktop,rancher-desktop-data
2022-10-06T15:10:35.129Z: data distro already registered
2022-10-06T15:10:40.745Z: Did not find a valid mount, mounting /mnt/wsl/rancher-desktop/run/data
2022-10-06T15:11:39.074Z: Installing C:\Users\kyle.andrews\AppData\Local\Programs\Rancher Desktop\resources\resources\linux\internal\rancher-desktop-guestagent as /mnt/c/Users/kyle.andrews/AppData/Local/Programs/Rancher Desktop/resources/resources/linux/internal/rancher-desktop-guestagent into /usr/local/bin//rancher-desktop-guestagent ...
2022-10-06T15:11:39.418Z: Installing C:\Users\kyle.andrews\AppData\Local\Programs\Rancher Desktop\resources\resources\linux\internal\trivy as /mnt/c/Users/kyle.andrews/AppData/Local/Programs/Rancher Desktop/resources/resources/linux/internal/trivy into /usr/local/bin/trivy ...
2022-10-06T15:11:48.747Z: WSL: executing: /usr/sbin/update-ca-certificates: Error: wsl.exe exited with code 1

Expected Behavior

Rancher Desktop starting and usable

Additional Information

No response

Rancher Desktop Version

1.5.1

Rancher Desktop K8s Version

unknown

Which container engine are you using?

moby (docker cli) or containerd

What operating system are you using?

Windows

Operating System / Build Version

Windows 10 Enterprise

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

No response

Windows User Only

No response

cyron7 avatar Oct 06 '22 15:10 cyron7

This is the error I get as rancher is starting up: image

cyron7 avatar Oct 06 '22 15:10 cyron7

Recent Log file lines:

2022-10-06T15:39:47.467Z: Running: wsl.exe --distribution rancher-desktop --exec busybox chmod 755 /etc/init.d/dnsmasq-generate
2022-10-06T15:39:47.649Z: Running: wsl.exe --distribution rancher-desktop --exec busybox chmod 644 /etc/conf.d/cri-dockerd
2022-10-06T15:39:47.913Z: Running: wsl.exe --distribution rancher-desktop --exec busybox chmod 644 /etc/conf.d/containerd
2022-10-06T15:39:48.007Z: Running: wsl.exe --distribution rancher-desktop --exec busybox chmod 644 /etc/logrotate.d/k3s
2022-10-06T15:39:48.624Z: Running: wsl.exe --distribution rancher-desktop --exec /sbin/rc-update add host-resolver default
2022-10-06T15:39:49.001Z: WSL: executing: /usr/sbin/update-ca-certificates: Error: wsl.exe exited with code 1
2022-10-06T15:39:49.479Z: Running: wsl.exe --distribution rancher-desktop --exec mkdir -p /etc/cni/net.d
2022-10-06T15:39:49.548Z: Capturing output: wsl.exe --distribution rancher-desktop --exec wslpath -a -u C:\Users\kyle.andrews\AppData\Local\Temp\rd-docker-7MqAJp\docker
2022-10-06T15:39:49.811Z: Running: wsl.exe --distribution rancher-desktop --exec /sbin/rc-update add dnsmasq default

cyron7 avatar Oct 06 '22 15:10 cyron7

wsl-exec.log

cyron7 avatar Oct 06 '22 15:10 cyron7

/etc/ca-certificates/update.d/certhash: exit status 132

# cat /etc/ca-certificates/update.d/certhash
#!/bin/sh
exec /usr/bin/c_rehash /etc/ssl/certs

Alpine has their own c_rehash; it's not clear how that can exit with 132, since it looks like it returns either 2 or 0. It's also possible it's actually (128 + 4), in which case it's SIGILL… which also doesn't make much sense.
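
For reference, the (128 + 4) reading can be checked mechanically. The following is only a sketch, assuming a POSIX-style shell on Linux where signal 4 is SIGILL; describe_status is a hypothetical helper name, not something shipped with Rancher Desktop:

```shell
#!/bin/sh
# Translate an exit status into a readable description.
# By shell convention, a status above 128 means the process was
# terminated by signal number (status - 128).
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # `kill -l N` prints the name of signal number N without the SIG prefix
        echo "signal SIG$(kill -l $((status - 128)))"
    else
        echo "exit code $status"
    fi
}

describe_status 132   # the status reported for certhash in wsl-exec.log
describe_status 2     # an ordinary failure code, for comparison
```

On Linux the first call reports SIGILL, which would fit a crash inside the binary rather than a normal error return from c_rehash.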

Please try running /usr/bin/c_rehash -v /etc/ssl/certs manually and see if it produces more details?

Out of curiosity, what CPU do you have? Not that I really expect a missing instruction there… The other option I see is corruption (either disk image or memory).

mook-as avatar Oct 06 '22 19:10 mook-as

See log for results of command '/usr/bin/c_rehash -v /etc/ssl/certs' run from within 'rancher-desktop'

c_rehash.log

cyron7 avatar Oct 06 '22 19:10 cyron7

CPU: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 2112 Mhz, 4 Core(s), 8 Logical Processor(s)

cyron7 avatar Oct 06 '22 19:10 cyron7

Illegal instruction

Well, that's interesting! c_rehash (for 1.5.1) has a sha256 hash of 3AD730F1AE440CAE63D0C4E5EECFB3A69318E5FABE2E10B16BBCDA81735A8E7C for me. Is yours any different?

That CPU shouldn't be missing any actual instructions, as far as I know…

mook-as avatar Oct 06 '22 21:10 mook-as

@cyron7 Could you run /usr/bin/openssl rehash -v /etc/ssl/certs to see if it fails the same way?

Also, could you attach the output of cat /proc/cpuinfo? Your CPU should have all required features, so I don't really understand how this can happen.

jandubois avatar Oct 06 '22 23:10 jandubois

@jandubois ; I am away from that machine. I should be able to get that output to you in the next 24 hours.

cyron7 avatar Oct 06 '22 23:10 cyron7

@mook-as ; I'm not sure how to check that. I'll try to figure that out and get back to you.

cyron7 avatar Oct 06 '22 23:10 cyron7

I'm not sure how to check that. I'll try to figure that out and get back to you.

Some options:

lima-rancher-desktop:~$ sha256sum /usr/bin/c_rehash
3ad730f1ae440cae63d0c4e5eecfb3a69318e5fabe2e10b16bbcda81735a8e7c  /usr/bin/c_rehash
lima-rancher-desktop:~$ openssl dgst -sha256 /usr/bin/c_rehash
SHA256(/usr/bin/c_rehash)= 3ad730f1ae440cae63d0c4e5eecfb3a69318e5fabe2e10b16bbcda81735a8e7c

@mook-as shows an uppercase hash, so not sure which command he ran. 😄
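
Since the two digests in this thread differ only in letter case, a case-insensitive comparison avoids a false mismatch. This is a minimal sketch; same_digest is a hypothetical helper name, and the two values are the ones quoted above:

```shell
#!/bin/sh
# Return success when two hex digests are equal, ignoring letter case.
same_digest() {
    a=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    b=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$a" = "$b" ]
}

# Uppercase digest as posted earlier, lowercase digest as printed by sha256sum
expected=3AD730F1AE440CAE63D0C4E5EECFB3A69318E5FABE2E10B16BBCDA81735A8E7C
actual=3ad730f1ae440cae63d0c4e5eecfb3a69318e5fabe2e10b16bbcda81735a8e7c

if same_digest "$expected" "$actual"; then
    echo "digests match"
else
    echo "digests differ"
fi
```

Inside the distro, actual could instead be filled from the file itself, for example with `sha256sum /usr/bin/c_rehash | cut -d' ' -f1`.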

jandubois avatar Oct 06 '22 23:10 jandubois

@jandubois ; It looks like that is missing: image

cyron7 avatar Oct 07 '22 12:10 cyron7

@jandubois ; Here is the information from cpuinfo: cpuinfo.log

cyron7 avatar Oct 07 '22 12:10 cyron7

@jandubois and @mook-as ; This was the output from running the rehash: image

Hash: 3ad730f1ae440cae63d0c4e5eecfb3a69318e5fabe2e10b16bbcda81735a8e7c /usr/bin/c_rehash

cyron7 avatar Oct 07 '22 12:10 cyron7

Here is the information from cpuinfo:

Thanks! That all looks as expected (VM settings match your host CPU), and all the features like sse4_2 and avx are there, so not giving any clue...

jandubois avatar Oct 07 '22 16:10 jandubois

Are there any network requirements for Rancher to be able to start up or for the update-ca-certificates to work?

cyron7 avatar Oct 07 '22 18:10 cyron7

Are there any network requirements for Rancher to be able to start up or for the update-ca-certificates to work?

No, it should work fine without network. Of course you will need a network connection to download images, and to fetch the Kubernetes version you want to run, but if they already exist locally, then you can run offline.

jandubois avatar Oct 07 '22 18:10 jandubois

Ok, maybe test just an empty Alpine distro and see if that already fails, or if the issue is with stuff that gets installed later.

Can you run these commands and show if they succeed or fail:

PS C:\Users\Jan> wsl --import testing . '.\AppData\Local\Programs\Rancher Desktop\resources\resources\win32\distro-0.27.tar'
PS C:\Users\Jan> wsl -d testing update-ca-certificates
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
PS C:\Users\Jan> wsl --unregister testing
Unregistering...

The distro-0.27 filename assumes that you are trying Rancher Desktop 1.6.0 now. If you are still on 1.5.1 then the version should be 0.26.

If you get a failure from this test too (and are still on 1.5.1), please uninstall it and install 1.6.0 and try again. I kind of doubt that there are any differences, but the time has come for desperate actions...

jandubois avatar Oct 07 '22 22:10 jandubois

@jandubois ; Sorry I took so long getting back to you. I got the same error. I am using 1.5.1. I will update to 1.6.0 and see if that solves the problem: image

cyron7 avatar Oct 11 '22 15:10 cyron7

@jandubois ; I uninstalled 1.5.1 and installed 1.6.0. The same symptoms happened: I can initially get Rancher to start, but after I shut down and start up my host machine I get the update-ca-certificates error. I tried the command you suggested on 1.6.0 and got the same error with distro 27 as I did with 26: image

cyron7 avatar Oct 11 '22 18:10 cyron7

This shows that the error is not triggered with just the builtin certs from the distro, so I think this means it is related to one of the certs on your host that is being copied into the distro as Rancher Desktop starts up.

@mook-as Do you have any idea how to isolate the cert that may be triggering this?
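
One possible way to narrow this down is to feed each certificate file to openssl on its own and record which ones fail. This is only a sketch under assumptions, not a confirmed procedure: find_failing_certs is a hypothetical helper, the directory to scan is whatever location holds the copied certs, and it assumes the crash also reproduces when openssl parses a single file:

```shell
#!/bin/sh
# Print "<exit status> <path>" for every file in a directory that
# openssl fails to process as an X.509 certificate.
# A status above 128 (for example 132) would indicate a crash by signal.
find_failing_certs() {
    dir=$1
    for cert in "$dir"/*; do
        [ -f "$cert" ] || continue
        status=0
        # Compute the subject hash of the certificate and discard the output;
        # only the exit status matters here.
        openssl x509 -noout -subject_hash -in "$cert" >/dev/null 2>&1 || status=$?
        if [ "$status" -ne 0 ]; then
            echo "$status $cert"
        fi
    done
}
```

Running `find_failing_certs /etc/ssl/certs` inside the rancher-desktop distro after the failure has occurred would list any file that makes openssl exit abnormally. Files that are not single PEM certificates will also show up with an ordinary non-zero status, so the interesting entries are the ones with a status above 128.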

jandubois avatar Oct 13 '22 00:10 jandubois