rustup icon indicating copy to clipboard operation
rustup copied to clipboard

Failure removing component. Ok, now what? --force does not help

Open gnzlbg opened this issue 7 years ago • 16 comments

error: failure removing component 'rustc-x86_64-apple-darwin', directory does not exist: '"share/man/man1/rustdoc.1"'

gnzlbg avatar Aug 14 '18 13:08 gnzlbg

I'm also having this problem and it doesn't seem to be obvious what to do next or what's causing the issue

thomashorrobin avatar Oct 21 '18 14:10 thomashorrobin

having done a bit more Googling I think this is the answer to this problem: https://github.com/rust-lang-nursery/rustup.rs/issues/1167#issuecomment-367061388

thomashorrobin avatar Oct 21 '18 14:10 thomashorrobin

I've ran into this, too:

failure removing component 'rustc-x86_64-unknown-linux-gnu', directory does not exist: '"share/doc/rust/COPYRIGHT"'

I think it'd be sensible to treat absence of the directory to delete as a success.

kornelski avatar Jun 02 '19 18:06 kornelski

Looking at the code:

  1. There's a race condition: https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/dist/component/transaction.rs#L267

  2. It'd be better to try to rename first, and handle not-found error. But the io::Error is not returned directly. https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/utils/utils.rs#L656 Is it normal to walk the chain in such cases? Or would it be better to modify the error type to include the io::ErrorKind?

  3. There's no way to do nothing. Would it be better to fake-delete a directory, or add NoOp change? Or change the function to return Option<ChangedItem>? https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/dist/component/transaction.rs#L185

kornelski avatar Jun 02 '19 23:06 kornelski

@rbtcollins On the off chance you might have some of this in your head already -- What do you think? Could this resolve some more of our squiffy installs?

kinnison avatar Jun 03 '19 08:06 kinnison

So, user solution here: I think the thing to do is to remove the toolchain and re-install: gets rid off all the detritus and gets a clean slate. Getting the selected component list and making that easy to re-select might be a convenience if there isn't a nice way already.

In terms of code, handling this specific point case. The code driving this doesn't know what to today with deleting a file that doesn't exist. In particular I don't think that we discriminate in the transaction system between delete-so-that-we-can-upgrade and delete-so-that-we-can-uninstall-a-component and deleteing-a-toolchain.

In the first case we want to leave the system working if interrupted; in the second the same but we never want that particular content back, and in the third there's whole trees we no longer want.

And the underlying problem here is a design problem: neither the transaction system nor the rustup update system have been built to be crash-expectant. Of course rust itself won't crash, but libc can, the OS itself can, and hardware can (e.g. power failures), not to mention of course virus scanners or people can willfully remove content. So for instance we uninstalling a toolchain could be about 5ms in duration: moving the entire tree to a place we don't want, forking a worker to delete it lazily and returning to the user immediately (and gcing that directory on future invocations too in case the worker is interrupted). Failing that we could parallelise the walking of subdirectories and eliminate syscall latencies and - well you see the point: the transaction system is beautiful but not well suited to all that we use it for.

Back to this specific case. (1) This isn't so much a race condition as a violated invariant: the contents of the disk don't match what rustup expects. 3) So how should we teach rustup to expect it. What will have the least side effects. I don't know offhand. remove_file appears to be only used in an uninstall case, and a removed file clearly cannot be rolled-back-from, but its also not a situation we created.

I think perhaps asking these questions may help: should we tell the user? Does the user need to know?

My hunch is that the user doesn't need to know when the overall operation is successful. We've converged on the desired state, and even though thats not the exact model we have today, I think its a very good model for installer software to have. If the overall operation doesn't work, given we have this rollback concept that is going to leave things not working, I think a message to the user would be valuable - but not one per file. Something like one per file to the log file (so verbose notifications - for diagnostics) and one single item at info level saying that there were missing files which mean that the system is incomplete after rollback. Or something like that.

And the remaining specific q's 2) with error_chain the io error is still preserved I believe, just needs to be unwrapped to get at it - whether its usual to walk to it or not I can't say - I'm still new here myself.

Hope this all helps, Rob

rbtcollins avatar Jun 04 '19 08:06 rbtcollins

Oh, I didn't answer (3) after all that - since all the uses of remove_file seem to be uninstall, and I can't see any reason that uninstall would abort when the file they are removing isn't there, I don't think there is any reason to change the signature, but yes by all means return a NothingChanged or some such.

rbtcollins avatar Jun 04 '19 08:06 rbtcollins

Thanks for the answers.

With 1 I wasn't clear. I didn't mean the cause of this bug is a race condition. I mean that every use of path.exists() followed by another disk operation that depends on the (non)existence of the path is a race condition in general, because the path could be created or deleted after exists() but before the operation that follows it. For some software that may even be a security issue. Here that's unlikely, but still eliminating of exists() and checking errors from rename() will make it atomic and faster, since it avoids the extra disk access.

kornelski avatar Jun 04 '19 21:06 kornelski

Oh sure; uhm yes - TOCTTOU bugs are rife :). I've been focusing on install performance, not uninstall (and thus not update) so far - the syscall traces from that were quite tragic :).

rbtcollins avatar Jun 05 '19 04:06 rbtcollins

I think https://github.com/rust-lang/rustup/issues/1167 and https://github.com/rust-lang/rustup/issues/836 can be closed as duplicates of this issue (this one has the most detail)

I have admin rights so I could take care of this myself, but I'm not a regular maintainer here so I don't want to step on anyone's toes :) <3

carols10cents avatar Dec 23 '19 15:12 carols10cents

any solution for this issue?

jawwadturabi avatar Apr 07 '20 19:04 jawwadturabi

No fundamental solution no. In https://github.com/rust-lang/rustup/issues/1480#issuecomment-498570921 Robert said that the workaround is to remove and then reinstall the toolchain, because removal of a toolchain is a plain recursive delete rather than a component-aware transaction.

kinnison avatar Apr 16 '20 09:04 kinnison

Removing toolchain method from #1480 (comment) works for me

jawwadturabi avatar Apr 16 '20 19:04 jawwadturabi

Two easy, short term suggestions related to #2800 :

  • "file does not exist" instead of "directory does not exist" in the error message
  • When an update fails, we can give the user the suggestion to uninstall and reinstall. Giving the user details on how to do that, is very helpful.

Luc108 avatar Jun 21 '21 07:06 Luc108

And more details about #2800:

The file bin/cargo-clippy.exe was indeed missing. No idea how that's possible. Not done anything special or advanced. Only default install and maybe an update.

I uninstalled everything ("rustup self uninstall") and installed the same version again. This time the file was there.

Luc108 avatar Jun 21 '21 07:06 Luc108

I have same issue. I just deleted the .rustup directory, tried again and it worked :)

2giosangmitom avatar Oct 09 '23 11:10 2giosangmitom