rustup-init hangs in armv7 docker container running on an arm64 Linux with `reqwest` backend

Open messense opened this issue 3 years ago • 29 comments

Problem

Running `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh` in an armv7 Docker container on an arm64 Linux host hangs.

Steps

Run a Docker container with --platform linux/arm/v7 on an arm64 Linux host, install curl, and run the sh.rustup.rs script:

$ docker run --rm -it --platform linux/arm/v7 ubuntu:22.04 bash
root@668fa2822782:/# apt update && apt install curl -y
root@668fa2822782:/# curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose
info: downloading installer
info: profile set to 'default'
info: default host triple is aarch64-unknown-linux-gnu
verbose: installing toolchain 'stable-aarch64-unknown-linux-gnu'
verbose: toolchain directory: '/root/.rustup/toolchains/stable-aarch64-unknown-linux-gnu'
info: syncing channel updates for 'stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/dqlfqns1_reba5ms_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with reqwest

It hangs forever at the `verbose: downloading with reqwest` step. Setting `RUSTUP_USE_CURL=1` works fine:

root@668fa2822782:/# RUSTUP_USE_CURL=1 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | RUSTUP_USE_CURL=1 sh -s -- -y --verbose
info: downloading installer
info: profile set to 'default'
info: default host triple is aarch64-unknown-linux-gnu
verbose: installing toolchain 'stable-aarch64-unknown-linux-gnu'
verbose: toolchain directory: '/root/.rustup/toolchains/stable-aarch64-unknown-linux-gnu'
info: syncing channel updates for 'stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/0lfud7winasgeiob_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with curl
verbose: deleted temp file: /root/.rustup/tmp/0lfud7winasgeiob_file
verbose: no update hash at: '/root/.rustup/update-hashes/stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/zs3_t8gv_pyprmk6_file.toml
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml'
verbose: downloading with curl
verbose: checksum passed
verbose: creating temp file: /root/.rustup/tmp/zwsok2uoflekbroj_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.asc'
verbose: downloading with curl
verbose: deleted temp file: /root/.rustup/tmp/zwsok2uoflekbroj_file
verbose: Good signature from on https://static.rust-lang.org/dist/channel-rust-stable.toml from:
verbose: from builtin Rust release key
verbose:   RSAEncryptSign/85AB96E6-FA1BE5FE - Rust Language (Tag and Release Signing Key) <[email protected]>
verbose:   Fingerprint: 108F 6620 5EAE B0AA A8DD 5E1C 85AB 96E6 FA1B E5FE
verbose: deleted temp file: /root/.rustup/tmp/zs3_t8gv_pyprmk6_file.toml
info: latest update on 2022-11-03, rust version 1.65.0 (897e37553 2022-11-02)
info: downloading component 'cargo'
verbose: creating Download Directory directory: '/root/.rustup/downloads'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/cargo-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'clippy'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/clippy-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rust-docs'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rust-docs-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rust-std'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rust-std-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rustc'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rustc-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
 55.9 MiB /  79.4 MiB ( 70 %)   0 B/s in  1s ETA: Unknown^C

Possible Solution(s)

No response

Notes

No response

Rustup version

rustup-init 1.25.1 (bb60b1e89 2022-07-12)

Installed toolchains

N/A

messense avatar Dec 09 '22 11:12 messense

Do they select the same sort of connection to the host? e.g. ipv4 or ipv6 ?

rbtcollins avatar Feb 22 '23 21:02 rbtcollins

I chatted with @kinnison and he suggests that the failure is due to reqwest, whose TLS implementation uses per-CPU features, getting the wrong CPU type from your /proc/cpuinfo. Then the actual CPU doesn't handle things and it all just burns up in fire.

A starting point would be to compare your cpuinfo to the one expected for the hardware it is running on (or for the CPU that qemu is emulating, if you have cross-arch docker stuff happening).
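
One way to do that comparison, as a minimal sketch: extract the Features flags from two cpuinfo dumps and print the ones that differ. The helper names cpuinfo_features and cpuinfo_feature_diff are made up for illustration, and the sketch assumes a POSIX shell with awk, diff and mktemp available.

```shell
# Hypothetical helpers, not part of rustup or Docker.

# Print the flags from the first "Features" line of a cpuinfo file,
# one per line, sorted and de-duplicated.
cpuinfo_features() {
    awk -F: '/^Features/ { print $2; exit }' "$1" \
        | tr -s ' \t' '\n' | sed '/^$/d' | sort -u
}

# Print flags that appear in only one of the two files:
# "< flag" is only in $1, "> flag" is only in $2.
# Prints nothing when both files advertise the same set.
cpuinfo_feature_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    cpuinfo_features "$1" > "$a"
    cpuinfo_features "$2" > "$b"
    diff "$a" "$b" | grep '^[<>]' || true  # grep exits 1 when nothing differs
    rm -f "$a" "$b"
}

# Example (needs Docker, run on the arm64 host):
#   docker run --rm --platform linux/arm/v7 ubuntu:22.04 cat /proc/cpuinfo > container-cpuinfo.txt
#   cpuinfo_feature_diff /proc/cpuinfo container-cpuinfo.txt
```

Empty output means the container sees exactly the same feature flags as the host.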

rbtcollins avatar Feb 23 '23 18:02 rbtcollins

Host

$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

processor	: 1
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1
$ lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               ARM
  Model name:            Neoverse-N1
    Model:               1
    Thread(s) per core:  1
    Core(s) per cluster: 2
    Socket(s):           -
    Cluster(s):          1
    Stepping:            r3p1
    BogoMIPS:            50.00
    Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0,1
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Mitigation; CSV2, BHB
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Docker armv7

$ docker run --rm -it --platform linux/arm/v7 ubuntu:22.04 cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

processor	: 1
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

messense avatar Feb 24 '23 03:02 messense

That certainly looks like docker is passing the host's cpuinfo through. This is the issue we had on reqwest - https://github.com/seanmonstar/reqwest/issues/642 - which eventually boiled down to https://github.com/lxc/lxcfs/issues/553, which is what fixed it in that case. I wonder if Docker needs equivalent help.
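
A rough heuristic for spotting this from inside a container, offered as a sketch rather than a definitive test: a 32-bit ARM kernel reports its SIMD unit as neon, while an AArch64 kernel reports asimd, so a 32-bit userland that sees asimd in /proc/cpuinfo is probably reading the 64-bit host's view. The function name cpuinfo_looks_passed_through is hypothetical, and using getconf LONG_BIT to detect the userland word size is an assumption about the container image.

```shell
# Hypothetical helper: succeeds (exit 0) when a 32-bit userland is shown
# AArch64-style feature names in the given cpuinfo file.
#   $1: userland word size in bits (e.g. from `getconf LONG_BIT`)
#   $2: path to a cpuinfo file
cpuinfo_looks_passed_through() {
    [ "$1" = 32 ] &&
        grep -Eq '^Features.*[[:space:]]asimd([[:space:]]|$)' "$2"
}

# Example, run inside the container:
#   if cpuinfo_looks_passed_through "$(getconf LONG_BIT)" /proc/cpuinfo; then
#       echo "warning: 32-bit userland, but cpuinfo lists AArch64 features" >&2
#   fi
```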

kinnison avatar Feb 24 '23 09:02 kinnison

Using RUSTUP_USE_CURL=1 still hangs for me. Any other workarounds?

rochdev avatar Mar 02 '24 23:03 rochdev

It does look like it's able to get a bit further when using RUSTUP_USE_CURL=1, but now it gets stuck at installing cargo:

info: installing component 'cargo'
verbose: creating temp directory: /root/.rustup/tmp/linv1qi09wy09f7v_dir

It stays there until the CI job eventually times out.
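
Until the cause is found, a CI job can at least fail fast instead of waiting for the job-level timeout. The sketch below wraps any command in a deadline using timeout from GNU coreutils, which is assumed to be installed and which exits with status 124 when the deadline expires. run_with_deadline is a hypothetical helper name, and the 600 seconds in the example is an arbitrary choice.

```shell
# Hypothetical helper: run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND, killing it if it is still running after SECONDS.
# Returns the command's own exit status, or 124 if the deadline was hit.
run_with_deadline() {
    secs=$1
    shift
    status=0
    timeout "$secs" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "error: still running after ${secs}s, giving up" >&2
    fi
    return "$status"
}

# Example: download the installer first, then run it under a deadline.
#   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o rustup-init.sh
#   run_with_deadline 600 sh rustup-init.sh -y --verbose
```

This does not fix the hang. It only turns it into a quick, clearly labelled failure.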

rochdev avatar Mar 02 '24 23:03 rochdev

@rochdev Does our new 1.27 version (https://internals.rust-lang.org/t/seeking-beta-testers-for-rustup-1-27-0/20352) work for you?

rami3l avatar Mar 03 '24 04:03 rami3l

@rami3l Not sure if I'm doing this right, but I added RUSTUP_UPDATE_ROOT=https://dev-static.rust-lang.org/rustup as an environment variable before running the command and I am still getting the same issue.

The command in question:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose --default-host armv7-unknown-linux-gnueabihf --default-toolchain nightly --component rust-src

Environment variables set before running the above command:

RUSTUP_USE_CURL = '1'
RUSTUP_TOOLCHAIN = 'nightly'
RUSTUP_UPDATE_ROOT = 'https://dev-static.rust-lang.org/rustup'

rochdev avatar Mar 04 '24 02:03 rochdev

@rochdev Thanks for your report!

The thing here is that we're currently considering the removal of the cURL backend (possibly as early as in 1.28), so it's a must that we minimize the number of functionalities being broken in that change.

I happen to have an ARM64 Mac so I'll probably be able to look into this issue more deeply.

rami3l avatar Mar 04 '24 02:03 rami3l

> The thing here is that we're currently considering the removal of the cURL backend (possibly as early as in 1.28), so it's a must that we minimize the number of functionalities being broken in that change.

Without the cURL backend rustup hangs even sooner, basically even before it starts downloading any dependencies. Using cURL allows it to at least get past the downloads after which it hangs at trying to install cargo.

> I happen to have an ARM64 Mac so I'll probably be able to look into this issue more deeply.

I just tried locally on an M1 Mac and it works properly. The issue seems to be isolated to Linux aarch64 hosts.

rochdev avatar Mar 04 '24 02:03 rochdev

> I just tried locally on an M1 Mac and it works properly. The issue seems to be isolated to Linux aarch64 hosts.

~~I meant to say that docker machine on ARM64 Macs should also count as a Linux aarch64 host. I'll see if I can reproduce this issue over there.~~

Oops, ARMv7 support is not available on ARM64 Macs (https://news.ycombinator.com/item?id=27278019), my bad.

rami3l avatar Mar 04 '24 02:03 rami3l

I tried to disable cURL and use the default reqwest backend instead with the 1.27 beta, and it also didn't change anything compared to before.

Here is the output:

info: downloading installer
info: profile set to 'default'
info: setting default host triple to armv7-unknown-linux-gnueabihf
verbose: creating update-hash directory: '/root/.rustup/update-hashes'
verbose: installing toolchain 'nightly-armv7-unknown-linux-gnueabihf'
verbose: toolchain directory: '/root/.rustup/toolchains/nightly-armv7-unknown-linux-gnueabihf'
info: syncing channel updates for 'nightly-armv7-unknown-linux-gnueabihf'
verbose: creating temp root: /root/.rustup/tmp
verbose: creating temp file: /root/.rustup/tmp/rmn_jarvvywfunwl_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-nightly.toml.sha256'
verbose: downloading with reqwest

rochdev avatar Mar 04 '24 02:03 rochdev

I tried to get the most minimal reproduction that I could, and I ended up with this which reproduces the issue:

FROM arm32v7/ubuntu:16.04

RUN apt-get update && apt-get -y install curl

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose \
  --default-host armv7-unknown-linux-gnueabihf

rochdev avatar Mar 04 '24 21:03 rochdev

For good measure I also tried the same Dockerfile but with Ubuntu 18.04, 20.04 and 22.04 and I see the exact same behaviour across all of them.

rochdev avatar Mar 04 '24 22:03 rochdev

Reported this against reqwest after checking with its maintainer that this is not a known issue:

https://github.com/seanmonstar/reqwest/issues/2157

djc avatar Mar 06 '24 13:03 djc

@djc I'm not sure it's actually an issue with reqwest though. While it hangs on reqwest when reqwest is used, it still hangs when cURL is used, but at the install step after the downloads have completed. So it seems to me like something that either rustup or Rust is doing doesn't work with that setup.

rochdev avatar Mar 06 '24 16:03 rochdev

@rochdev that's fair... would still like to figure out why reqwest fails to download here.

djc avatar Mar 06 '24 16:03 djc

@djc For sure, the more eyes on this the better, but something tells me it might be the same thing that makes both hang 🤔 rustup hangs and reqwest hangs, both of which are in Rust, yet cURL works perfectly every time. Maybe there is an issue with Rust itself, or some other library than reqwest?

rochdev avatar Mar 06 '24 18:03 rochdev

@rochdev to be clear, it could definitely be that there is an issue in rustup still. If you're able to dig in more, that would be great. Maybe try enabling trace-level logging and see if you can pinpoint where the hang is happening?

djc avatar Mar 07 '24 10:03 djc

@djc Can you provide more detailed steps on how to capture the additional information you're looking for? I tried using RUST_LOG=trace but the output is the same.

rochdev avatar Mar 07 '24 16:03 rochdev

Try using RUST_LOG=trace?

djc avatar Mar 07 '24 16:03 djc

@djc Sorry yes that's what I tried, edited.

rochdev avatar Mar 07 '24 16:03 rochdev

@rami3l do you know if release builds are built with otel support built in?

djc avatar Mar 08 '24 08:03 djc

@djc No, I don't believe so, I'm afraid a custom build is required:

https://github.com/rust-lang/rustup/blob/b4b9a2e7ad260ee2158858632f297e1d2f0aaf26/ci/run.bash#L11-L26

rami3l avatar Mar 08 '24 09:03 rami3l

@rochdev would you be able to build your own with --features otel and try it with that?

djc avatar Mar 11 '24 13:03 djc

> @rochdev would you be able to build your own with --features otel and try it with that?

@djc Where do I pass --features? Assume I know nothing about Rust, because I don't know all that much 😅

rochdev avatar Mar 11 '24 15:03 rochdev

You'd have to clone this repo, run cargo build --release --target <something-armv7> --features otel and somehow splice the resulting binary (from target/<something-armv7>/release) into your Docker stuff.

djc avatar Mar 11 '24 15:03 djc

@djc Was there any change recently in the dev version? I can't seem to be able to reproduce even re-running builds that were clearly failing 100% of the time.

rochdev avatar Mar 11 '24 22:03 rochdev

> @djc Was there any change recently in the dev version? I can't seem to be able to reproduce even re-running builds that were clearly failing 100% of the time.

@rochdev We didn't do anything explicit on our side regarding reqwest, at least not that I know of. It could be down to a direct/transitive dependency update though.

Maybe you can use rustup --version to pinpoint for us the exact commit you built Rustup from? For example, I have this output on my machine:

> rustup --version
rustup 1.27.0+1 (46327d7ff 2024-03-11) dirty 1 modification
...

... and according to your report, neither v1.26.0 nor v1.27.0 (beta) is working for you. Is that correct?

rami3l avatar Mar 12 '24 02:03 rami3l

A month later, I still haven't been able to reproduce it, so I don't know what was causing the issue for us. It just started working properly one day.

rochdev avatar Apr 17 '24 03:04 rochdev