rust-clippy icon indicating copy to clipboard operation
rust-clippy copied to clipboard

New lint: potentially unwinding caller-supplied operators in `unsafe` code

Open Kixunil opened this issue 2 years ago • 21 comments

What it does

If a non-unsafe generic function containing unsafe block uses an operator that is implemented by the caller (via core::ops::*) the lint warns that the operator may unwind and suggests catch_unwind or calling the method explicitly to signal the programmer thought about it.

Related: #3915 See also: https://github.com/rust-lang/rfcs/pull/3288#discussion_r922871828

Lint Name

suspicious-operators-in-unsafe

Category

suspicious

Advantage

  • Forces the programmer to think about unwinds
  • Signals to the reviewers "yes, I did think about unwinds" (assuming it's also justified in SAFETY comment)

Drawbacks

It's quite possible there will be a bunch of FPs. Maybe it'll have to be off by default.

Example

impl<T> Foo<T> where T: core::ops::Add {
    pub fn foo(&mut self, val: T) {
        unsafe {
            self.break_invariants();
           // oops: if this unwinds `self` ends up in invalid state
            self.a = self.b + val;
            self.restore_invariants();
        }
    }
}

Could be written as:

impl<T> Foo<T> where T: core::ops::Add {
    pub fn foo(&mut self, val: T) {
        unsafe {
            self.break_invariants();
           // prevent the caller from observing invalid state if `+` unwinds
            let result = catch_unwind(|| self.a = self.b + val);
            self.restore_invariants();
            result.unwrap();
        }
    }
}

or:

impl<T> Foo<T> where T: core::ops::Add {
    pub fn foo(&mut self, val: T) {
        unsafe {
            self.doesnt_break_invariants();
           // safe to call add because the invariants are not broken
            self.a = self.b.add(val).
            self.doesnt_affect_invariants();
        }
    }
}

Kixunil avatar Jul 18 '22 10:07 Kixunil

Would the lint need to consider -Cpanic=abort?

ojeda avatar Jul 18 '22 10:07 ojeda

Libraries typically can't choose their panicking mode (even if a crate had panic=abort, the library part could be used as a dependency in a panic=unwind binary of another workspace), so I think it's fair to expect libraries to always consider unwinding.

kornelski avatar Jul 18 '22 10:07 kornelski

I would expect libraries to appear that only consider -Cpanic=abort environments, and a lint should be mindful of all cases, especially if it could be considered as enabled-by-default as mentioned. There is also non-library code to take into account.

There is value on making developers aware of the issue, even if they are writing -Cpanic=abort code (e.g. it may be that they are writing a library and they intend it to be usable outside that environment).

However, if the lint's message mentions e.g. "the operator may unwind" as suggested, it may be confusing for developers writing -Cpanic=abort code (e.g. they could think something is misconfigured). Thus even if the lint is enabled on -Cpanic=abort, it would be valuable to adjust the message nevertheless to make it clear it cannot actually happen in this particular configuration.

ojeda avatar Jul 18 '22 11:07 ojeda

panic=abort has been available since Rust 1.10, and I haven't seen an abort-only library. I guess your concern is related to Rust in Linux environment? But I hope Rust isn't getting fragmented into two dialects that work by different rules.

The lint could observe #[cfg(panic = "abort")]. Outside of that I do not think that ignoring panics could be considered sound in Rust, because Rust doesn't give guarantee that code will never see an unwind.

kornelski avatar Jul 18 '22 12:07 kornelski

I understand the worry about having two dialects -- it is very valid. However, even if there were no abort-only library today:

  • Rust 1.10 is only ~6 years old and rapidly growing in many places. We may see many more libraries (and use cases) appearing in the next 6 years than in the previous 6.

  • It is impossible to know what private organizations and projects are doing. It wouldn't surprise me if there are already companies with non-trivial codebases written with -Cpanic=abort in mind.

  • There are some supporters of making -Cpanic=abort the default, e.g. in the related RFC: https://github.com/rust-lang/rfcs/pull/3288#issuecomment-1175720584 @CAD97. That could make more libraries to assume no unwinding because e.g. implementation is easier.

  • Currently in Rust for Linux we work with -Cpanic=abort. Moreover, somebody proposed to share code between user/kernel space already too, so some of that code could potentially be written with -Cpanic=abort in mind. It would be nice to have a way to prevent mistakes.

To be clear, I am not trying to advocate for any side here regarding defaults or "what Rust should be". I am raising the -Cpanic=abort point because I think the lint design should take it into account.

because Rust doesn't give guarantee that code will never see an unwind.

What do you mean by this? Do you mean something like "the compiler may still use unwinding under the hood to implement some feature"? Or do you mean something like "dependencies do not control the build profile" or "Rust/Cargo/crates.io do not provide a way to flag a library as -Cpanic=abort-only"

ojeda avatar Jul 18 '22 15:07 ojeda

Discussion about unwind soundness is a bit off topic:

Removal of unwind from Drop is insufficient to remove the risks of unwinding that this lint addresses. Deref can unwind. Index can unwind. All other overloaded operators can unwind. Closures can unwind. I don't think there are any plans to remove all of these other risks.

Current Rust semantics are that unwind may happen, but is not guaranteed to happen. Note that it's different from a guarantee that an unwind will not happen.

Libraries that only support panic=unwind are sound, even if they don't work as intended, because panic=abort will not let unsound code run.

Libraries that only support panic=abort risk being unsound, because panic=unwind may continue to run code and use pointers that are invalid and meant to be stopped by an abort that didn't happen.

What do you mean by this?

I mean definition what is sound in the language (i.e. whether Rust code is correct) is independent from policies of individual projects. "We only use panic=abort setting" doesn't absolve Rust libraries from handling unwinding.

There's one Rust language, without situational exceptions. Rust doesn't relax meaning Send and Sync even if you compile for single-threaded WASM. Rust doesn't relax atomics even if you compile for x86 only. And similarly I think Rust shouldn't relax unwind-safety rules even if you compile for use in the Linux kernel. So "may unwind" is correct, because correct Rust code should be written as if unwinding could happen.

but your original concern can be addressed either by checking the flag or adding help: well, actually, there's panic=abort, so technically yada yada

kornelski avatar Jul 18 '22 16:07 kornelski

There are some supporters of making -Cpanic=abort the default, e.g. in the related RFC: Don't allow unwinding from Drop impls rfcs#3288 (comment) @CAD97.

That's a big stronger than I intended; I more meant that while I'm arguing against -Zpanic-in-drop=abort being mandatory, I would not argue against -Cpanic=abort being the default. It's a (my own) overstatement to say I'm in favor of -Cpanic=abort being the default. If nothing else, it's a breaking change.


So in general I'm in favor of more allow-by-default clippy::restriction lints for controlling unwinds. Warning on operators and silencing by calling the function/method form is clever.

But I also think saying operators unwinding is subtle/invisible the way drop unwinding is subtle/invisible is significantly overstating. Normal use of operators (e.g. i32 + i32) can unwind. Calling an operator unwinding isn't an unusual thing; it is a thing that is common and that libraries (should) know that they have to deal with implicitly (if they know about the unwind safety concept) because operators from std unwind under normal conditions.

The btw-operators-can-unwind lint thus shouldn't be limited to downstream types' operators; it should lint against any operator which is not known guaranteed to not unwind. (This is not an analysis of the function; this is a property of the signature having #[nounwind] or extern "C" or other some marker guaranteeing it.)

CAD97 avatar Jul 18 '22 16:07 CAD97

but your original concern can be addressed either by checking the flag or adding help: well, actually, there's panic=abort, so technically yada yada

Not sure what you exactly mean by "checking the flag", but extra text wouldn't hurt (or a link if it is too long).

As for the rest:

But it is still misleading when compiling under `-Cpanic=abort`. That is why I would say something like "may unwind when compiled under...".

Removal of unwind from Drop is [...]

Not sure why you bring all this up -- nobody has argued otherwise.

Libraries that only support panic=abort risk being unsound, because panic=unwind may continue to run code and use pointers that are invalid and meant to be stopped by an abort that didn't happen.

I mean definition what is sound in the language (i.e. whether Rust code is correct) is independent from policies of individual projects. "We only use panic=abort setting" doesn't absolve Rust libraries from handling unwinding.

Correctness is all about the requirements. If some Rust code is only ever going to be used under a -Cpanic=abort setting, and does what is expected of it, it is correct even if it does not handle unwinding, because that was not a requirement to begin with.

Rust's soundness: if we take the definition from https://doc.rust-lang.org/stable/reference/behavior-considered-undefined.html, then it is unclear how compilation options play with it. If one cannot cause UB with any safe code compiled with the required/documented/intended flags, should we consider it unsound? One could definitely read "any safe client" in the widest sense possible, and call it unsound. But such an interpretation may not be too useful for projects that intentionally disable unwinding and may want to design libraries where unwinding does not play a role. At any rate, the definition could be clarified.

But in any case, even if the definition includes all compilation options, one can still create sound libraries by simply refusing to compile under the wrong setting:

#[cfg(not(panic = "abort"))]
compile_error!("Designed for abort-only environments only. Please read the docs!");

There's one Rust language, without situational exceptions. Rust doesn't relax meaning Send and Sync even if you compile for single-threaded WASM. Rust doesn't relax atomics even if you compile for x86 only. I think Rust shouldn't relax unwind-safety rules even if you compile for use in the Linux kernel.

But nobody is changing the language by refusing to compile, even going with your interpretation of soundness.

So "may unwind" is correct, because correct Rust code should be written as if unwinding could happen.

Sound Rust code, maybe, depending again on how soundness is defined (and even then, you can refuse to compile). But correct, no.

As I mentioned above, I understand the worry about splitting the ecosystem, having libraries that cannot be used everywhere etc. But code is not incorrect just because your environment cannot use it; and if some projects or entities decide it is better for them to write -Cpanic=abort code, nobody can really do anything about that.

ojeda avatar Jul 18 '22 19:07 ojeda

soundness offtopic

I'd like to add to my previous statements: you could rely on panic=abort being set if you marked your functions as unsafe. The unsafe fn is a place where you can impose additional requirements on the callers, and it's callers fault if they don't do whatever you ask them to.

You couldn't do that in regular non-unsafe fn. These functions are supposed to be foolproof, and their soundness rules are not set by you, but by Rust. You'd need to make them sound (not necessarily correct or useful) for panic=unwind too. It could be something crude like #[cfg(not(panic=abort))] { abort(); }, but you couldn't let your code break safety if compiled with a wrong panic setting.

If one cannot cause UB with any safe code compiled with the required/documented/intended flags, should we consider it unsound?

This has been debated when actix-web had functions not marked as unsafe that Rust considered unsound in theory, but were not actually vulnerable in practice in actix-web when used the way author of actix-web intended. The conclusion was that it's still not valid. For safe functions soundness requirements are universal, and can't be changed by the author/project, or ignored because a particular usage context doesn't have negative consequences. There's no opt-out. Safe Rust functions must uphold soundness even if users use them deliberately incorrectly in direct contradiction of author's advice (e.g. Rust's collections guarantee they won't cause UB even if you use them with types that have nonsensical (but non-UB) Eq, Hash or Deref). This is very different from C/C++ philosophy where soundness can be based on actual usage, and the author may refuse to guarantee soundness when the library is used incorrectly.

But nobody is changing the language by refusing to compile

I didn't mean in terms of stopping compilation, because when unsafe is involved you can make unsound things compile anyway. I mean in terms of what is considered sound.

I've mentioned WASM, because it's a case where Rust's soundness rules require more than the platform typically does. Rust wants globals to be thread-safe always, for every project on every platform. However, almost every WASM project is strictly single-threaded. "But my program is always single-threaded" is a fine justification in C, but not a thing in Rust. If a library for WASM only was documented as "for single-threaded use only!" and had a safe API that could — purely in theory and never in practice — made to use Rc or Cell from more than one thread, it would be unsound. Rust requires single-threaded programs to still observe Rust's rules for multi-threaded programs. And similarly, I'm saying that Rust requires panic-aborting programs to still observe Rust's rules for unwinding-allowing programs.

kornelski avatar Jul 18 '22 20:07 kornelski

off-topic

The terminology I have seen used the most is that a library is e.g. "sound under -Cpanic=abort" when relying on certain flags for soundness. A library is sound (by the letter of the law, if not the spirit) if it fails to compile when compiled with flags that would make it unsound.

This is in practice the case for more library crates than you'd expect! A lot of crates will either refuse to compile or sometimes just be sound with pointer_width = "16". This is generally considered acceptable and falls under the umbrella of the "portability lint;" portability between compiler flags (and not just targets) also falls under the domain of the portability lint just the same.

The difficulty of being unsound with some compiler flags shouldn't be understated, though: if you just compile_error! for panic = "unwind", but then a new release of the compiler adds more options (e.g. panic=unwind-native, and panic=unwind uses a Rust-custom unwinding ABI) then you're unsound until you forbid the new option as well.

For a library to truly be properly sound, it needs to be sound under all build options present and future. If your API is not sound under some present or future option, then you need to gate it to show up only for options which you know are sound.

All of this will hopefully be clearer and easier to talk about if/when the portability lint becomes a thing.

Especially if this is a allow-by-default opt-in clippy::restriction lint (which I strongly think it should be), it doesn't really matter if it saying "this operation may unwind" is technically incorrect under -Cpanic=abort, because authors who require -Cpanic=abort will just not turn on the lint. Or if they do, they do because they want to know about the unwinding sites because they're auditing the code to relax that requirement.

CAD97 avatar Jul 18 '22 22:07 CAD97

More "off-topic" @kornelski

I'd like to add to my previous statements: you could rely on panic=abort being set if you marked your functions as unsafe. The unsafe fn is a place where you can impose additional requirements on the callers, and it's callers fault if they don't do whatever you ask them to.

You couldn't do that in regular non-unsafe fn. These functions are supposed to be foolproof, and their soundness rules are not set by you, but by Rust. You'd need to make them sound (not necessarily correct or useful) for panic=unwind too. It could be something crude like #[cfg(not(panic=abort))] { abort(); }, but you couldn't let your code break safety if compiled with a wrong panic setting.

I am well aware what unsafe means, thank you.

In fact, what you suggest here (calling abort()) is a strictly worse solution than what I did in the previous message, since you are moving the issue to runtime.

This has been debated when actix-web had functions not marked as unsafe that Rust considered unsound in theory, but were not actually vulnerable in practice in actix-web when used the way author of actix-web intended. The conclusion was that it's still not valid. For safe functions soundness requirements are universal, and can't be changed by the author/project, or ignored because a particular usage context doesn't have negative consequences. There's no opt-out.

That project allowed to introduce UB with safe code, regardless of target properties.

Unless you are talking about something else, that discussion isn't relevant here.

Safe Rust functions must uphold soundness even if users use them deliberately incorrectly in direct contradiction of author's advice (e.g. Rust's collections guarantee they won't cause UB even if you use them with types that have nonsensical (but non-UB) Eq, Hash or Deref).

I am aware of that. I sent the PR to add that to the Reference :)

This is very different from C/C++ philosophy where soundness can be based on actual usage, and the author may refuse to guarantee soundness when the library is used incorrectly.

I agree many C/C++ developers have differing views on UB, but not all.

In fact, I have been working for a while on showing the value of having a similar soundness concept in the C standard in WG14 and also its UBSG (spare time permitting).

I didn't mean in terms of stopping compilation, because when unsafe is involved you can make unsound things compile anyway. I mean in terms of what is considered sound.

I think you haven't understood my argument, or I don't understand what you are trying to reply to here.

My point was simple: even with the widest "any safe client" interpretation, you can still make sound, abort-only libraries by e.g. refusing to compile. That doesn't break any language rules as far as I am aware of.

I've mentioned WASM, because it's a case where Rust's soundness rules require more than the platform typically does. Rust wants globals to be thread-safe always, for every project on every platform. However, almost every WASM project is strictly single-threaded. "But my program is always single-threaded" is a fine justification in C, but not a thing in Rust.

If you are not able to trigger UB with any safe client, it would be a perfectly fine justification in Rust too.

If a library for WASM only was documented as "for single-threaded use only!" and had a safe API that could — purely in theory and never in practice — made to use Rc or Cell from more than one thread, it would be unsound. Rust requires single-threaded programs to still observe Rust's rules for multi-threaded programs. And similarly, I'm saying that Rust requires panic-aborting programs to still observe Rust's rules for unwinding-allowing programs.

Which rules would be broken by the suggested approaches, exactly?

Let me try to summarize the discussion: in my reading, you apparently claimed abort-only code is (or should be considered) non-existing ("I haven't seen an abort-only library"), unsound ("I do not think that ignoring panics could be considered sound in Rust") and incorrect ("correct Rust code should be written as if unwinding could happen").

And my reply was that abort-only code almost definitely exists (if not today, tomorrow); it can be sound (even with the strictest interpretation of "any safe client", e.g. if you refuse to compile otherwise); and that code may be correct even if it is unsound (fn main() { unreachable_unchecked!(); } may be correct, depending on the requirements).

If you agree that abort-only libraries are fine as long as e.g. they refuse to compile if unwinding is enabled, then there is no disagreement on my part (modulo concerns about an ecosystem split etc.).

ojeda avatar Jul 19 '22 11:07 ojeda

Even more off-topic @CAD97

The terminology I have seen used the most is that a library is e.g. "sound under -Cpanic=abort" when relying on certain flags for soundness. A library is sound (by the letter of the law, if not the spirit) if it fails to compile when compiled with flags that would make it unsound.

Exactly. My only comment was that it could be a good idea for the Reference to add a bit of clarification on what constitutes "any safe client" and "any safe code", in particular around flags and target properties etc.

For instance, for libraries in crates.io, "any safe client" should most likely be understood as anybody using the library, which includes potentially all flag combinations and targets. Thus one needs to refuse compilation (or similar) to be properly sound (and, even if that was not the definition, one should do it anyhow to prevent mistakes).

But someone can also argue then that inside something like an internal kernel library, "any safe client" would be e.g. drivers, and those cannot cause UB even if that library does not include a compile_error!-just-in-case because they cannot be compiled with unwinding enabled to begin with.

To be clear, I am happy with the strict interpretation and calling that also "unsound". I just would like that the Reference establishes it. Maybe a footnote or something?

The difficulty of being unsound with some compiler flags shouldn't be understated, though: if you just compile_error! for panic = "unwind", but then a new release of the compiler adds more options (e.g. panic=unwind-native, and panic=unwind uses a Rust-custom unwinding ABI) then you're unsound until you forbid the new option as well.

Indeed, this reason is exactly why I was careful to use the not(panic = "abort") form in one of the messages above.

it doesn't really matter if it saying "this operation may unwind" is technically incorrect under -Cpanic=abort, because authors who require -Cpanic=abort will just not turn on the lint.

Yeah, seems fair. I just wanted that -Cpanic=abort is considered in the lint design, one way or the other.

I would still tweak it to "this operation may unwind under -Cpanic=unwind" or similar when -Cpanic=abort happens to be enabled, since I guess it wouldn't be hard to add, but I agree it is not a big deal.

ojeda avatar Jul 19 '22 11:07 ojeda

@ojeda additional message will bring noise either way. Maybe not too bad but maybe clippy could detect #[cfg(not(panic="abort"))] compile_error!() (and perhaps also warn if you do it the other way around). The problem with encouraging this practice (and I think this is what @kornelski is trying to say) is that it makes the libraries less useful, potentially forcing people to set panic = abort.

OTOH maybe it's OK, I've seen libraries that deliberately don't support 16 bit systems because they handle network messages larger than 2^16 bytes long. If handling unwinds is completely impractical (internal kenel libraries) forcing them to do it is just worse.

Kixunil avatar Jul 19 '22 12:07 Kixunil

@Kixunil I agree it makes them less useful, of course -- it is a fact that less users would be able to use them. I also acknowledged the ecosystem split concern.

But that does not mean specialized libraries should be shunned (e.g. called "incorrect") just because they are not useful for everybody.

Most libraries will anyway still support -Cpanic=unwind because libraries typically want to be useful for as many users as possible.

But if, somehow, the majority of libraries suddenly started to only provide support for -Cpanic=abort because it is easier for them that way to write unsafe code, or because it gives better codegen or for any other reason, then it is a decision of the ecosystem.

ojeda avatar Jul 19 '22 12:07 ojeda

@Kixunil I don't mean they would be just less useful. I mean they would be unsound.

I mean that every library has to preserve soundness when compiled with panic=unwind. It can be broken useless junk, but it has to do something to be sound. Soundness is an objective property that always applies, not something that is context-dependent and can be opted out by the code author. The same way Rust requires all code to be thread-safe, even if the code doesn't support being run on a platform that has threads. It's valid to call abort() or compile_error!, but it's not valid to do nothing and cause unsafety/UB in a non-supported configuration.

kornelski avatar Jul 19 '22 13:07 kornelski

@kornelski I believe @ojeda claims that compile_error! behind cfg testing for panic = abort is sound which seems to match what you say and what I believe to be true. I guess the difference is that @ojeda suggested checking compiler flag while the correct behavior would be to detect compile_error behind (correct) cfg.

Kixunil avatar Jul 19 '22 13:07 Kixunil

There exist more than a few clippy lints that are only applicable most of the time, so not including this one or adding extra verbosity to the diagnostic on the basis of a possible particular set of circumstances that we don't know even exists in practice seems strange.

eaglgenes101 avatar Jul 21 '22 23:07 eaglgenes101

I guess the difference is that @ojeda suggested checking compiler flag while the correct behavior would be to detect compile_error behind (correct) cfg.

No, I didn't suggest that. In fact, I was the one that suggested the #[cfg(not(panic = "abort"))] compile_error! approach...

ojeda avatar Jul 22 '22 17:07 ojeda

adding extra verbosity to the diagnostic

There is no extra verbosity for the common case with what I suggested.

possible particular set of circumstances that we don't know even exists in practice

unsafe code gets compiled under -Cpanic=abort all the time and there are abort-only projects out there. I would say chances are quite high that people see this if the lint ends up being enabled by default (which is mentioned as a possibility in OP).

seems strange

Being less confusing and more correct does not seem like a "strange" reason to me.

ojeda avatar Jul 22 '22 17:07 ojeda

@ojeda I mean for clippy to detect that pattern as opposed to flag being set. If that's what you had in mind, then fine, just your repeated -Cpanic=abort made me think you want to detect the flag, not the code.

Kixunil avatar Jul 22 '22 18:07 Kixunil

So there is the lint discussion and the soundness discussion (which is why we were all hiding the latter as off-topic).

For the former, yes, my suggestion was about the flag (more precisely, the panic strategy of the compilation session), i.e. if panic=abort, append something like "...under panic=unwind" to the message for clarity purposes.

For the latter, my #[cfg(not(panic = "abort"))] compile_error! suggestion was about showing that abort-only code can be made sound easily (regardless of whether we use the lax or strict interpretation).

ojeda avatar Jul 22 '22 19:07 ojeda