rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Externally implementable traits

Open Amanieu opened this issue 1 year ago • 21 comments

An alternative proposal to #3632.

Rendered

Tracking:

  • https://github.com/rust-lang/rust/issues/125418

Amanieu avatar May 22 '24 21:05 Amanieu

Bikeshed: instead of calling it extern trait, what about extern mod? (Unlike trait, mod doesn't imply an implementing type)

Jules-Bertholet avatar May 22 '24 21:05 Jules-Bertholet

I do not know what the good answer is, but extern trait is not good if you can't use it as a trait bound. we don't need to add confusion to the language, just use a separate name.

Lokathor avatar May 22 '24 21:05 Lokathor

If we do want to stick with trait (for example, because it's what users are familiar with in terms of "specifying/implementing an interface"), I think we should stay consistent with the type-directed nature of Rust traits and return to a design similar to https://github.com/rust-lang/rfcs/pull/2492. (This reminds me of the traits vs first-class modules debate)

Jules-Bertholet avatar May 22 '24 22:05 Jules-Bertholet

Bikeshed: instead of calling it extern trait, what about extern mod? (Unlike trait, mod doesn't imply an implementing type)

I like this a lot more than using "trait." IMO it's much closer to a module since, as the RFC says (emphasis added)

An extern trait may contain functions but not other associated items such as types or constants. Additionally, these functions may not refer to self or Self in their signature. Effectively, these functions follow the same rules as free functions outside an impl block.

I also think it's quite readable if you use impl mod <extern_module_name> like

// core::panic:

extern mod PanicHandler {
    fn panic_handler(_: &PanicInfo) -> !;
}

// user:

impl mod core::panic::PanicHandler {
    fn panic_handler(panic_info: &PanicInfo) -> ! {
        eprintln!("panic: {panic_info:?}");
        loop {}
    }
}

max-niederman avatar May 22 '24 22:05 max-niederman

Also, should crates be allowed to partially impl an extern trait/mod, and let other crates impl the rest? And should there be a way to "weakly" impl such an item (permitting downstream crates to partially or fully override)?

Jules-Bertholet avatar May 22 '24 22:05 Jules-Bertholet

The problem with #2492 is that it's not something that can be resolved at link-time: it introduces circular dependencies in the type system which can't be resolved without deferring all codegen to the root crate.

Also, should crates be allowed to partially impl an extern trait/mod, and let other crates impl the rest? And should there be a way to "weakly" impl such an item (permitting downstream crates to partially or fully override)?

No, the entire point of grouping functions in a single trait is that they must all be provided together. If you want to only partially implement an extern trait then you should provide your own extern trait for the part that your crate doesn't provide and then forward your extern impl to that.

Bikeshed: instead of calling it extern trait, what about extern mod? (Unlike trait, mod doesn't imply an implementing type)

I think a case can be made for either trait or mod, since it's really something that shares some characteristics of both. Note that there are downsides to extern mod as well, such as the syntax for extern unsafe mod or impl mod looking strange compare to how mod is usually used.

Amanieu avatar May 22 '24 22:05 Amanieu

Setting aside the nits that one could pick on this proposal, the main and interesting question here seems to be whether it should be possible to express that the person who implements an extern item must uphold certain obligations (that the compiler cannot check) in order to prevent undefined behavior.

We recently covered a similar case to this in:

  • https://github.com/rust-lang/rfcs/pull/3484

There, we resolved that the person who declares the signatures within an extern block is responsible for those being correct, and that this is a separate obligation from the ones that a caller (or other user) must uphold when calling (or otherwise using) an item not marked safe within an extern.

There is conceptual overlap between that and this RFC (and the alternative proposals). Here, there's a difference between the caller having to uphold certain unchecked obligations when invoking one of these functions and the implementer having to uphold certain unchecked obligations. Given the intended and anticipated use cases for this, I can certainly see how this distinction could matter, and given that we just addressed a similar case in RFC 3484, I could see a lot of reason to be consistent here conceptually.

Due to how central this is to the motivation for this proposal, @Amanieu, I might suggest adding more discussion of this to the RFC.

traviscross avatar May 22 '24 23:05 traviscross

The problem with https://github.com/rust-lang/rfcs/pull/2492 is that it's not something that can be resolved at link-time

Considering this further, I think the analogy to extern {} blocks is the right way to think about this feature. In today's extern {} (tomorrow's unsafe extern {}), you declare a set of interfaces, such that each interface has exactly one implementation. However, the compiler can't check that said implementation exists and is unique and valid, so the declarer of the extern block must unsafely promise those things.

In this hypothetical feature, one also declares a set of interfaces with exactly one outside implementation; however, now it's the compiler's job to check that this implementation exists and has the correct signature. (There may be additional preconditions on top of that, and the interface definer should be able to specify this by requiring interface implementers to use unsafe.)

Other than the burden of unsafe proof, these two features are extremely similar; therefore their syntaxes should arguably look similar also. Perhaps something like this:

//! crate foo
extern impl {
    // Unsafe to call, safe to implement
    unsafe fn frob();
    static FOO: u32;
}

unsafe extern impl {
    // Safe to call, unsafe to implement
    fn brotzle();
}
//! crate bar
extern crate foo;

impl unsafe fn foo::frob() {
    println!("frobbing");
}

impl static foo::FOO: u32 = 42;

unsafe impl fn foo::brotzle() {
    println!("brot");
}

Jules-Bertholet avatar May 23 '24 00:05 Jules-Bertholet

@Jules-Bertholet I agree; I just posted a suggestion for how to unify the proposal in the other RFC with extern blocks: https://github.com/rust-lang/rfcs/pull/3632#issuecomment-2125972702. Looks like we are thinking along the same lines.

tmandry avatar May 23 '24 00:05 tmandry

For this RFC, I would suggest resolving the issues that have been raised by using a concrete type for the impl. That way it's "just" a trait implementation for a regular concrete type.

// core::panic:

pub trait PanicHandler {
    fn panic_handler(_: &PanicInfo) -> !;
}

pub struct GlobalPanicHandler;
extern impl PanicHandler for GlobalPanicHandler;

// user:

impl core::panic::PanicHandler for core::Panic::GlobalPanicHandler {
    fn panic_handler(panic_info: &PanicInfo) -> ! {
        eprintln!("panic: {panic_info:?}");
        loop {}
    }
}

This shouldn't inherit the problems of the other RFC, but should be compatible with it. Perhaps it can even be made strongly forward-compatible with it in the sense that it would be forward-compatible for the above code in core to switch to using extern type in the future, if the GlobalPanicHandler type contained a PhantomExternSized or similar.

And while we're at it, we could just allow extern impl for inherent impls without needing a trait at all.

tmandry avatar May 23 '24 00:05 tmandry

Looking at this, I'm going to reiterate the concern I raised in the meeting: this seems confusingly different from other uses of trait in the language, and I think it would cause more confusion than clarity.

I would love to see a real trait-based solution, but not something that's just using the word "trait" with relatively little in common with traits.

joshtriplett avatar May 25 '24 15:05 joshtriplett

The problem with #2492 is that it's not something that can be resolved at link-time: it introduces circular dependencies in the type system which can't be resolved without deferring all codegen to the root crate.

That doesn't seem like a fatal problem. We already defer generation of generics to the point where they're instantiated. If we had to defer codegen of things that depend on an external type, would that be such a substantial problem?

(Note: I am not proposing that we block other efforts while waiting for such a solution. I'm trying to envision the simplest possible implementation of full external types with trait bounds.)

joshtriplett avatar May 25 '24 15:05 joshtriplett

If we had to defer codegen of things that depend on an external type, would that be such a substantial problem?

Even if such deferral is feasible, it would be nice (for compile times) if we don't have to do it unless it's inherently necessary. If the thing being externally implemented is a plain function, deferring codegen should ideally not be forced by the design of the feature.

Jules-Bertholet avatar May 25 '24 17:05 Jules-Bertholet

Another possibility could be something like:

// core::panic:

pub trait PanicHandler {
    fn panic_handler(&self, _: &PanicInfo) -> !;
}


pub extern static panic_handler: &'static  dyn PanicHandler;

// user:

struct GlobalPanicHandler;
impl core::panic::PanicHandler for GlobalPanicHandler {
    fn panic_handler(&self, panic_info: &PanicInfo) -> ! {
        eprintln!("panic: {panic_info:?}");
        loop {}
    }
}
extern static core::panic::panic_handler = &GlobalPanicHandler;

With some optimization to avoid dynamic dispatch.

tmccombs avatar Jun 02 '24 06:06 tmccombs

Another possibility could be something like:

// core::panic:

pub trait PanicHandler {
    fn panic_handler(&self, _: &PanicInfo) -> !;
}


pub extern static panic_handler: &'static  dyn PanicHandler;

// user:

struct GlobalPanicHandler;
impl core::panic::PanicHandler for GlobalPanicHandler {
    fn panic_handler(&self, panic_info: &PanicInfo) -> ! {
        eprintln!("panic: {panic_info:?}");
        loop {}
    }
}
extern static core::panic::panic_handler = &GlobalPanicHandler;

With some optimization to avoid dynamic dispatch.

This is proposed in RFC-3635.

max-niederman avatar Jun 03 '24 01:06 max-niederman

I've always pictured dependency injection in rust to use traits, as traits are a way to abstract sets of behaviour, so this RFC out of all of them feels the most natural to me.

I've been recently dealing with the displeasure of updating our rustls to use aws-lc-rs instead of ring. A lot of libraries depend on rustls, and many have default features for using ring or aws-lc-rs. The extra annoyance is that rustls will fail at runtime if you use the default implicit provider when both ring and aws-lc-rs features are enabled, which makes it extremely error prone.

This lead me to start thinking about compile-time checked singletons and my first idea was to use traits - then someone linked me to this RFC. Thus I would like to help move this RFC forward and I thought it would be worth while adding some examples based on case studies from third party crates

rustls

rustls uses a CryptoProvider struct, which contains some &'static dyn Traits for the dependency injection, which is stored a static _: OnceLock<Arc<CryptoProvider>>.

The direct translation I imagine would look like this

// ::rustls

// with an **opt-in** feature flag since you might not want a default provider
#[cfg(feature = "default_provider")]
extern trait DefaultCryptoProvider {
    fn crypto_provider() -> Arc<CryptoProvider>;
}

impl ClientConfig {
    #[cfg(feature = "default_provider")]
    pub fn builder() -> ConfigBuilder<Self, WantsVersions> {
        Self::builder_with_provider(DefaultCryptoProvider::crypto_provider())
    }
    
    pub fn builder_with_provider(provider: Arc<CryptoProvider>) -> ConfigBuilder<Self, WantsVersions> {
        todo!()
    }
}

// ::rustls_aws_lc_rs
pub fn default_provider() -> CryptoProvider {
    todo!();
}

// ::rustls_default_aws_lc_rs
extern impl rustls::DefaultCryptoProvider {
    fn crypto_provider() -> Arc<CryptoProvider> {
        lazy(|| Arc::new(::rustls_aws_lc_rs::default_provider())).clone()
    }
}

I could also easily imagine a next iteration where the API makes more use of traits, and uses Arc<dyn CryptoProvider> instead

// ::rustls
pub trait CryptoProvider {
    fn random(&self, output: &mut [u8]) -> Result<(), Error>;
    fn load_key(&self, der: PrivateKeyDer) -> Result<Arc<dyn SigningKey>, Error>;
    fn fips(&self) -> bool;

    // ...
}

#[cfg(feature = "default_provider")]
extern trait DefaultCryptoProvider {
    fn random(output: &mut [u8]) -> Result<(), Error>;
    fn load_key(der: PrivateKeyDer) -> Result<Arc<dyn SigningKey>, Error>;
    fn fips() -> bool;

    // ...
}

#[cfg(feature = "default_provider")]
pub struct DefaultProvider;

#[cfg(feature = "default_provider")]
impl CryptoProvider for DefaultProvider {
    fn random(&self, output: &mut [u8]) -> Result<(), Error> {
        DefaultCryptoProvider::random(output)
    }

    // ...
}

impl ClientConfig {
    #[cfg(feature = "default_provider")]
    pub fn builder() -> ConfigBuilder<Self, WantsVersions> {
        Self::builder_with_provider(lazy(|| Arc::new(DefaultProvider)).clone())
    }
    
    pub fn builder_with_provider(provider: Arc<dyn CryptoProvider>) -> ConfigBuilder<Self, WantsVersions> {
        todo!()
    }
}

// ::rustls_aws_lc_rs
pub struct AwsLcRsProvider {
    fips: bool,
}

impl CryptoProvider for AwsLcRsProvider {
    fn random(&self, output: &mut [u8]) -> Result<(), Error> {
        todo!()
    }

    fn fips(&self) -> bool {
        self.fips
    }

    // ...
}

// ::rustls_default_aws_lc_rs

static FIPS: OnceLock<bool> = OnceLock::new();
fn provider() -> AwsLcRsProvider {
    AwsLcRsProvider {
       fips: cfg!(feature = "fips") && FIPS.get().unwrap_or_default()
    }
}

extern impl DefaultCryptoProvider {
    fn random(output: &mut [u8]) -> Result<(), Error> {
        provider().random(output)
    }

    fn fips(&self) -> bool {
        provider().fips()
    }

    // ...
}
rand

rand supports generic Rng sources, but also provides an OsRng source which uses the getrandom crate, which itself chooses a default based on OS and architecture, or finally falls back to linker hacks for the edge case where there's no source available.

extern trait DefaultTryRngCore {
    fn try_next_u32() -> io::Result<u32>;

    #[inline]
    fn try_next_u64() -> io::Result<u64, Self::Error>;

    #[inline]
    fn try_fill_bytes(dest: &mut [u8]) -> io::Result<(), Self::Error>;
}

pub struct DefaultTryRng;

impl TryRngCore for DefaultTryRng {
    type Error = io::Error;

    #[inline]
    fn try_next_u32(&mut self) -> io::Result<u32> {
        DefaultTryRngCore::try_next_u32()
    }

    #[inline]
    fn try_next_u64(&mut self) -> io::Result<u64> {
        DefaultTryRngCore::try_next_u32()
    }

    #[inline]
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> io::Result<()> {
        DefaultTryRngCore::try_fill_bytes(dest)
    }
}

Developers can then use the getrandom crate to implement DefaultTryRngCore. For developers of embedded devices, they can now avoid the unsafe extern fn "Rust" __getrandom_v03_custom() {} linker hacks and directly implement the safe DefaultTryRngCore for their hardware rng source.

TL;DR This extern trait design feels the closest to "traits as dependency injection", while still being compile time checked with friendly error messages (no failing at runtime like rustls, or producing extremely nasty link-time errors like getrandom)

conradludgate avatar Aug 23 '25 10:08 conradludgate

One other benefit of this method over other forms of external implementations is that it seems to work quite well for platform-specific code: if you define your trait in a crate and then make several target-cfged versions offering implementations, you don't need to worry about duck-typing based upon renaming module paths or reexporting functions, and you can literally just #[cfg] a private module and it can contain the full implementation.

I imagine that libstd itself would benefit a lot from this feature.

One potential modification I would make, however, is to make not having an implementation for a trait a deny-by-default lint instead of a hard error. If the trait is used, it results in a hard error because it can't be used, but if the trait is left unused in the build then it should be fine if it's not defined.

clarfonthey avatar Oct 13 '25 19:10 clarfonthey

If the trait is used, it results in a hard error because it can't be used

how would that work? Would the compiler generate an impl of the trait where all methods panic?

tmccombs avatar Oct 14 '25 06:10 tmccombs

If the trait is used, it results in a hard error because it can't be used

how would that work? Would the compiler generate an impl of the trait where all methods panic?

I presumed it would error at link time. If a function call to the trait is included in the final link objects, the linker will error if the function definition cannot be found. Of course we'd like to have better errors but that at least sets the stage

conradludgate avatar Oct 14 '25 06:10 conradludgate

Yeah, I was imagining it would result in at least a link-time error. Probably there are other ways to do it too, but at least we know that would make sure an error happens.

Relying on the linker for errors is already something that people do today for cheeky optimisation checks, but it's not really ideal.

clarfonthey avatar Oct 14 '25 16:10 clarfonthey

That could result in some interesting behavior where debug builds fail to link, because the include a function that calls the un-implemented trait, but release builds succeed, because that function is optimized out as dead code.

tmccombs avatar Oct 14 '25 17:10 tmccombs