Invalid cross-device link (os error 18) when upgrading on a docker OverlayFS
$ rustup update nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
info: latest update on 2017-08-21, rust version 1.21.0-nightly (8c303ed87 2017-08-20)
info: downloading component 'rustc'
info: downloading component 'rust-std'
info: downloading component 'cargo'
info: downloading component 'rust-docs'
info: removing component 'rustc'
info: rolling back changes
error: could not rename component directory from '/root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/etc' to '/root/.rustup/tmp/x5u5mnp0hhtywco8_dir/bk'
info: caused by: Invalid cross-device link (os error 18)
As far as I can tell, std::fs::rename() basically doesn't work on OverlayFS. Similar reports from various languages and projects hitting cross-device link errors on OverlayFS all boil down to using the rename syscall.
I'd like to propose wrapping the std::fs::rename() calls so that, on Linux, the wrapper detects os error 18 and attempts a copy and delete instead. There are periodic reports of similar errors on other platforms; the wrapper could try to handle those cases too if they have a similar error code (or maybe even the same one if this is standard, I'm not sure).
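A minimal sketch of the proposed wrapper, to make the idea concrete. The name `rename_or_copy_file` and the hard-coded `EXDEV` constant (18, the Linux errno shown in the log above) are assumptions for illustration, not rustup's actual code. It handles regular files only; the failing paths in the log are directories, which would need a recursive copy on top of this.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Raw errno for EXDEV ("Invalid cross-device link") on Linux.
/// Assumption: hard-coded here to keep the sketch free of external crates.
const EXDEV: i32 = 18;

/// Hypothetical helper: try the atomic rename first and fall back to
/// copy + delete only when the OS reports EXDEV. Regular files only.
pub fn rename_or_copy_file(from: &Path, to: &Path) -> io::Result<()> {
    match fs::rename(from, to) {
        Ok(()) => Ok(()),
        Err(e) if e.raw_os_error() == Some(EXDEV) => {
            // Not atomic: a crash between these two steps can leave
            // both the source and the destination on disk.
            fs::copy(from, to)?;
            fs::remove_file(from)
        }
        // Any other failure is passed through unchanged.
        Err(e) => Err(e),
    }
}
```

The fallback gives up the atomicity that a plain rename provides, so callers that rely on rename for transactional behaviour would need to account for that.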
Interestingly, there is a bootstrap/update problem: folks who are experiencing this may be unable to update their rustup install, and so may not be able to get the update that fixes the problem once there is a solution. Those folks will need to be advised to reinstall rustup.
If the proposed solution to the problem works for the dev team, I'll attempt to provide a PR within a week of getting the go ahead.
This is relevant because some people use a common Docker image for their CI environments that may not be updated frequently enough for beta/nightly, and have rustup update $desired_env in their script. That is how I found this problem.
Spoke with @nrc and @alexcrichton on IRC and they said this seemed reasonable. I'll put forward an implementation this week.
Heyo! Any news on this one? I encounter this bug regularly when doing builds on dockerized CI. Let me know if there is any more info I can provide.
Looking at the sources, there already exists a wrapper function called utils::rename_file; it's used by components and transaction. Would that be a good candidate here to replace every other call to fs::rename?
For those affected by this bug, see the renaming section of the kernel documentation.
@wraithan fs::rename inside std implies atomicity. For a renaming operation that doesn't fail, we should put it in a separate crate, as copying will likely involve locking.
Heya, thank you to @nrc for taking a look :) (https://internals.rust-lang.org/t/contributing-to-rustup-help-with-code-structure-needed/7193). I'm thinking of trying to tackle this bug. I like writing the replication test first, so I would probably focus on that: try to inject the fault in the test and see what's what. Let me know if someone else wants to look into that as well, we can combine forces :)
Hi, I haven't had much time to finish working on this and the issue is still present for newest rustup. Let me know if anyone would like to pick this one up.
@rustbot label: +O-containers
Hey all! Any news on this? @cyplo @wraithan @workingjubilee I'm building a docker image and get "Invalid cross-device link" in the RUN rustup update nightly instruction of my Dockerfile.
@ishitatsuyuki thanks for the documentation. I see that this problem has to do with "redirect_dir" being disabled. So any idea how to enable it through the Dockerfile?
@CatarinaPedreira If you need to work around the issue, just remove the toolchain and install it again. I think that avoids renaming across an overlayfs boundary.
@ishitatsuyuki Thanks for the quick reply. I'll do that then, thank you :)
Copy+Delete would be exceedingly slow because the rename stuff is used in our transactional filesystem accessing code. If we had to open+open+{read,write,loop}+close+close rather than rename then our toolchain update process would become immensely slow. Perhaps we can detect that particular OS error by attempting a rename on something innocuous first, and if that fails, refuse to update a toolchain on such a filesystem. Though that would prevent the installation of new components/targets too. More thought needed, but in the short term the workaround is to either not include a toolchain in your underlying docker image, or else remove and then install the toolchain in your CI.
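A rough sketch of the "attempt a rename on something innocuous first" idea. The function name `can_rename_between`, the probe file name, and the hard-coded `EXDEV` value are all hypothetical. One caveat to keep in mind: a freshly created scratch file may not exercise the same code path as a directory that already lives on a lower overlayfs layer, so a probe like this may under-detect the failure reported in this issue.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Raw errno for EXDEV on Linux (assumption: hard-coded for the sketch).
const EXDEV: i32 = 18;

/// Hypothetical pre-flight check: create an empty scratch file in `src_dir`
/// and try to rename it into `dst_dir`.
/// Returns Ok(true) if the rename worked, Ok(false) if the OS reported
/// EXDEV, and Err for any other failure.
pub fn can_rename_between(src_dir: &Path, dst_dir: &Path) -> io::Result<bool> {
    let name = format!(".rename-probe-{}", std::process::id());
    let src = src_dir.join(&name);
    let dst = dst_dir.join(&name);
    fs::write(&src, b"")?;
    match fs::rename(&src, &dst) {
        Ok(()) => {
            // The probe succeeded; remove the scratch file again.
            fs::remove_file(&dst)?;
            Ok(true)
        }
        Err(e) => {
            // Best-effort cleanup of the scratch file left in `src_dir`.
            let _ = fs::remove_file(&src);
            if e.raw_os_error() == Some(EXDEV) {
                Ok(false)
            } else {
                Err(e)
            }
        }
    }
}
```

A caller could run this against the toolchain directory and the temp directory before starting a transaction, and refuse to proceed with a clear message when it returns Ok(false).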
Thank you @kinnison !
As of rust 1.63.0 I seem to be encountering this issue again during the clippy stage. Posting the relevant log:
$ CARGO_HOME=/usr/local/cargo rustup update stable
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
info: latest update on 2022-08-11, rust version 1.63.0 (4b91a6ea7 2022-08-08)
info: downloading component 'clippy'
info: downloading component 'cargo'
info: downloading component 'rust-std'
info: downloading component 'rustc'
info: removing previous version of component 'clippy'
info: rolling back changes
error: could not rename component file from '/usr/local/rustup/toolchains/stable-x86_64-unknown-linux-gnu/share/doc/clippy' to '/usr/local/rustup/tmp/1vsy16kvdse0rwk9_dir/bk': Invalid cross-device link (os error 18)
Cleaning up file based variables 00:00
ERROR: Job failed: command terminated with exit code 1
Could this have crept back in somewhere?
No, what is happening is that you are updating a toolchain across docker layers. Either install the correct toolchain in your docker build, or remove and reinstall your toolchains.
I'm experiencing this issue in Fedora Linux. The same logic works nicely in Debian-based systems like Ubuntu, but the error Invalid cross-device link (os error 18) happens in Fedora. The steps to reproduce it are:
- Download Edge from https://packages.microsoft.com/repos/edge/pool/main/m/microsoft-edge-stable/microsoft-edge-stable_123.0.2420.53-1_amd64.deb
- Extract the content of the DEB file
- Try to move the resulting parent folder to a different path using
fs::rename()
@bonigarcia what does your problem have to do with rustup?
@djc I believe this problem happens in fs::rename(). If this is not the right place to discuss it, do you know where I should report it?
The rust-lang/rust issue tracker covers the standard library.
What's the status of this bug? I'm getting the exact same error when using act to run github workflows locally. Specifically it happens when my actions require the stable channel version of the toolchain with dtolnay/rust-toolchain@stable.
I think you probably already have a stable rust toolchain installed, and are trying to install another. This was my problem. I have a matrix of versions (1.82, stable and beta) that I run. Only the stable version had this error, and I used the same image (with a version of rust already installed) on all of them.
So my suggestion @ThrasherLT , try using an image you know doesn't have rust installed. @rbtcollins has it right in https://github.com/rust-lang/rustup/issues/1239#issuecomment-1212541656 methinks.
Also experiencing this error calling rustup component remove rust-docs inside a clean Docker container. Most definitely not a cross-device link. The root cause appears to be a problem renaming hardlinked directories.
MCE
# Dockerfile
FROM debian
ENV PATH=/root/.cargo/bin:$PATH
RUN apt update && apt install -y curl && \
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | \
sh -s -- -y --profile minimal
RUN rustup component add rust-docs
RUN rustup component remove rust-docs
Terminal output
$ docker build .
[+] Building 26.4s (7/7) FINISHED docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 308B 0.0s
=> [internal] load metadata for docker.io/library/debian:latest 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> CACHED [1/4] FROM docker.io/library/debian:latest@sha256:b6507e340c43553136f5078284c8c68d86ec8262b1724dde73c3 0.0s
=> => resolve docker.io/library/debian:latest@sha256:b6507e340c43553136f5078284c8c68d86ec8262b1724dde73c325e8d3d 0.0s
=> [2/4] RUN apt update && apt install -y curl && curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | 20.6s
=> [3/4] RUN rustup component add rust-docs 5.2s
=> ERROR [4/4] RUN rustup component remove rust-docs 0.5s
------
> [4/4] RUN rustup component remove rust-docs:
0.461 info: removing component 'rust-docs'
0.470 info: rolling back changes
0.470 error: could not rename component file from '/root/.rustup/toolchains/stable-aarch64-unknown-linux-gnu/share/doc/rust/html' to '/root/.rustup/tmp/yo8q3q11xlm671mg_dir/bk': Invalid cross-device link (os error 18)
------
Dockerfile:8
--------------------
6 | sh -s -- -y --profile minimal
7 | RUN rustup component add rust-docs
8 | >>> RUN rustup component remove rust-docs
9 |
--------------------
ERROR: failed to build: failed to solve: process "/bin/sh -c rustup component remove rust-docs" did not complete successfully: exit code: 1
- Docker Desktop mac arm 4.43.2 (199162)
- Settings > General > Virtual Machine Options
- [x] Apple Virtualization framework
- [x] Use Rosetta for x86_64/amd64 emulation on Apple Silicon
- [x] VirtioFS
- Host OS: macOS 15.5 (24F74)
@skull-squadron @bonigarcia Thanks for the report! Your observation is pretty relevant in this case... Quoting the overlayFS docs:
When renaming a directory that is on the lower layer or merged (i.e. the directory was not created on the upper layer to start with) overlayfs can handle it in two different ways:
- return EXDEV error: this error is returned by rename(2) when trying to move a file or directory across filesystem boundaries. ...
... however renames are essential to how rustup currently works with updates, so I don't think it would be practical to solve this problem without a non-trivial redesign.