rfcs RFC: Generic member access for dyn Error trait objects

Also, I've mocked up the proposal in this repository - https://github.com/yaahc/nostd-error-poc/

Apr 01 '20 23:04 yaahc

One provide_context implementation supports multiple TypeIds, yes?

pub struct MyError<'a> {
    some_ref: &'a ..,
    /// All the rest are static
    something_any: ..,
    something_error: ..,
    something_read: ..,
}
impl<'a> Error for MyError<'a> {
    ...
    fn provide_context(&self, ty: TypeId) -> Option<&dyn Any> {
        if ty == TypeId::of::<dyn Any>() {
            &self.something_any as &dyn Any
        } else if ty == TypeId::of::<dyn Error>() {
            &self.something_error as &dyn Any
        } else if ty == TypeId::of::<dyn Read>() {
            &self.something_read as &dyn Any
        } else { None }
    }
}

You cannot make the default provide_context return Some(self as &dyn Any) whenever TypeId::of::<T>() == TypeId::of::<Self>() because doing so requires a Self: 'static bound.

Apr 02 '20 20:04 burdges

@burdges this is a general restriction on Any and downcast on Error types, I don't think it's currently possible to downcast types that aren't : 'static

The second design with the a Provider miiight work with this, not sure.

Apr 05 '20 16:04 yaahc

One thing I'm uneasy with reusing the object-provider API is the double lifetime:

fn get_context<'r, 'a>(&'a self, request: Request<'r, 'a>) -> ProvideResult<'r, 'a> {  ...  }

Having one explicit lifetime can still be managed, but having two or more correlated lifetimes becomes mentally challenging.

IMO we shouldn't need to copy the object-provider API entirely.

Firstly, the return type ProvideResult<'r, 'a> is unnecessarily complex in order to support ?. This type actually only has 2 states: fulfilled or not. This is isomorphic to a bool. So why not just make it a bool?

fn get_context<'a>(&'a self, request: Request<'_, 'a>) -> bool {
    request.provide::<Backtrace>(&self.backtrace) ||
        request.provide::<SpanTrace>(&self.span_trace) ||
        request.provide::<dyn Error>(&self.source) ||
        request.provide::<Vec<&'static Location<'static>>>(&self.locations) ||
        request.provide::<[&'static Location<'static>]>(&self.locations)
}

But type safety maybe a concern (e.g. user might have used && instead of ||), so we could absorb that bool into the Request and use the builder pattern

fn get_context<'_, 'a>(&'a self, request: &mut Request<'_, 'a>) {
    request
        .provide::<Backtrace>(&self.backtrace)
        .provide::<SpanTrace>(&self.span_trace)
        .provide::<dyn Error>(&self.source)
        .provide::<Vec<&'static Location<'static>>>(&self.locations)
        .provide::<[&'static Location<'static>]>(&self.locations);
}

In fact, unless you're only want to keep the first T1 in request.provide::<T1>(&t11).provide::<T1>(&t12);, even the boolean check is unnecessary, since LLVM is able to optimize that chain involving lots of if TypeId::of::<T>() == self.type_id() into a switch table.

Secondly, Request<'r, 'a> has two lifetimes essentially because it represents a &'r mut &'a Any. Perhaps we could shave off the outer reference? It is actually possible through an extern type. This further simplifies get_request into a single lifetime

fn get_context<'a>(&'a self, request: &mut Request<'a>) {
    ...
}

The diff is a bit long, I've pushed it to https://github.com/yaahc/nostd-error-poc/compare/yaahc:eaa45d4...kennytm:dbd5f03?w=1 for reference.

May 05 '20 09:05 kennytm

@kennytm There was a specific soundness issue which caused @mystor to add the double lifetime, though I don't remember what it was. Overall I'm definitely in favor of removing the double lifetime if possible but we should check with nika first.

Screenshot from 2020-05-05 07-02-29

May 05 '20 13:05 yaahc

Firstly, the return type ProvideResult<'r, 'a> is unnecessarily complex in order to support ?. This type actually only has 2 states: fulfilled or not. This is isomorphic to a bool. So why not just make it a bool?

entirely in favor of simplifying the result type

In fact, unless you're only want to keep the first T1 in request.provide::<T1>(&t11).provide::<T1>(&t12);, even the boolean check is unnecessary, since LLVM is able to optimize that chain involving lots of if TypeId::of::<T>() == self.type_id() into a switch table.

I do feel it is important to keep the explicit type parameter rather than manually doing the type id comparisons, assuming that's what you mean here. By using the explicit type parameter you get compile errors when they mismatch and it works as a coercion site which makes the API more convenient to use and less error prone overall.

May 05 '20 13:05 yaahc

@yaahc

I do feel it is important to keep the explicit type parameter rather than manually doing the type id comparisons, assuming that's what you mean here.

I agree that it's not a good idea to manually compare TypeId. What I mean is that, after all inlining, the generated code in provide_context() will look like

if request.type_id() == TypeId::of::<T1>() {
    /* code for T1 */
}
if request.type_id() == TypeId::of::<T2>() {
    /* code for T2 */
}
if request.type_id() == TypeId::of::<T3>() {
    /* code for T3 */
}
...

which LLVM will notice that all the type_id is immutable and optimize into

match request.get_type_id() {
    const { TypeId::of::<T1>() } => { /* code for T1 */ }
    const { TypeId::of::<T2>() } => { /* code for T2 */ }
    const { TypeId::of::<T3>() } => { /* code for T3 */ }
    ...
}

and thus the short-circuit property of ? and || are not necessary to produce the optimal code.

May 05 '20 14:05 kennytm

As I mention in that old comment, the &mut approach also requires a second lifetime, but it's hidden on the &mut itself. The original approach in the object-provider crate actually used a similar design.

The main drawback with the bare &mut Request type is unsoundness due to mem::swap. Take, for example, the following code, which can fulfill the outer request with a NonZeroUsize, despite it requesting a &'static u8 (untested, but I believe something similar can be written):

let x: NonZeroUsize = ...;
let bad_pointer = Request::with::<&'static u8>(|ra| {
    Request::with::<NonZeroUsize>(|rb| {
        mem::swap(ra, rb);
    });
    ra.provide::<NonZeroUsize>(&x);
}).unwrap();
// `bad_pointer` is invalid

I believe this unsoundness can be avoided by using Pin, and some of the PhantomData shenanigans to make the 'out lifetime invariant aren't necessary when it's behind a &mut, so it's at least a bit simpler to use.

#[repr(C)]
 pub struct Request<'out> {
    type_id: TypeId,
    _pinned: PhantomPinned,
    _marker: PhantomData<&'out ()>,
}

// ...
fn get_context<'a>(&'a self, request: Pin<&mut Request<'a>>) {
    ...
}

The remaining downside with this simplified Pin-based design is that it doesn't provide any sort of of single-use guarantee, but I think that's an acceptable trade-off for the perceived simplicity, especially as most call-sites will chain .provide method calls.

May 05 '20 14:05 mystor

@yaahc @mystor Yes there is an implicit second lifetime in &mut Request<'a>, but being implicit makes it easier to deal with — you could mostly ignore that lifetime when reading.

I think we could workaround the unsoundness by making Request::with private (pub(crate)), since Request::with is only used in <dyn Error>::context. This ensures there cannot never be two Request instances simultaneously in the same thread.

But if we have to keep Request::with public, I agree that the Pin-based design is the best solution.

(Or we should restart the progress on #1861 so we could add a thin unsized tail to Request, which also properly makes mem::swap not callable.)

May 05 '20 14:05 kennytm

Adding something like core::any::Request, and making it's soundness require that it is only used by Error seems unfortunate to me, so I'd greatly prefer the Pin solution, even if we choose not to initially stabilize a method like Request::with.

Adding a thin unsized tail would also be nice, and this exact sort of use-case is why I was interested in #1861 and #1993. I worry a bit about creating a dependency on solving thin unsized types for this RFC, though, as it has proven to be difficult to solve in a satisfying manner.

May 05 '20 15:05 mystor

I lean towards preferring a design that makes Request generally usable, but I otherwise agree with all of your changes @kennytm, so I'm going to pull in your branch and add Pin and update the RFC to match.

May 05 '20 17:05 yaahc

I actually wrote a branch for a potential 0.2.0 version of the object-provider crate which also uses Pin<&mut T> internally. I think there are a few behaviour differences between the two. https://github.com/mystor/object-provider/tree/pin_variant

edit: Opened a PR https://github.com/yaahc/nostd-error-poc/pull/2

May 05 '20 18:05 mystor

Note that @withoutboats proposed to stabilize Error::backtrace in rust-lang/rust#72981.

Jun 04 '20 15:06 kennytm

Has there been any consideration of core::task::Context in this API? It has a similar need for user-defined types, and (if they were added) would ideally use the same mechanism as Error.

It would seem the major differences are that Context would typically use a &mut reference, and also have a greater need for types that borrow from the Context.

It'd be possible to express with a trait, at a cost to ergonomics:

/// Implemented on a type to enable it to be used with `context`
trait Request<'a>: 'static {
    /// The type returned by `context` calls
    type Out;
}

Would it be possible to add a TypeId that used the same id for types with different lifetimes? The implementation wouldn't need to be public API

Nov 30 '20 10:11 Plecra

Has there been any consideration of core::task::Context in this API? It has a similar need for user-defined types, and (if they were added) would ideally use the same mechanism as Error.

Not particularly, and I'm sorry but I'm not really following you on how Context is related to this API or how it has similar needs for user-defined types. Could you explain what you mean? Is the Context in this case filling the same role as the &dyn std::error::Error type in the generic member access API?

Nov 30 '20 20:11 yaahc

Right. Futures are all tied to a specific reactor that manages their resources (Like sleeping and spawning other futures), and there have been a few proposals for giving them a generic way to communicate with the executor they're running in. The Request type added in this PR functions similarly to some of those proposals.

pub mod hook {
    pub struct Sleep {
        pub dur: Duration,
        marker: PhantomPinned,
    }
    impl Future for Sleep {
        type Output = ();
        fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
            cx.context::<dyn FnMut(Pin<&mut Sleep>) -> Poll<()>>().expect("no sleeping implementation")(self)
        }
    }
}
pub async fn sleep(dur: Duration) {
    hook::Sleep { dur, marker: PhantomPinned }.await
}

It would now be the caller's choice how to manage sleeping futures (allowing them to choose between designs like async-io, tokio, glommio, and easily mock the entire io stack).

There's no room in this PR for a sudden addition to Context, but I bring it up because the current 'static bound would heavily restrict these executor APIs. A typical backend for sleep would be expected to queue the running Future up to run again after Duration, and would need mutable access to some form of task list. Without a lifetime, it will have to use something like Arc<Mutex<Tasks>> instead of simply borrowing from its scope. This would be especially difficult in embedded async.

All this is mostly out of the RFC's scope, and I only mentioned it out of curiosity. Sorry for the confusion 😜

Dec 01 '20 16:12 Plecra

Oo, exciting and good point

There's no room in this PR for a sudden addition to Context, but I bring it up because the current 'static bound would heavily restrict these executor APIs. A typical backend for sleep would be expected to queue the running Future up to run again after Duration, and would need mutable access to some form of task list. Without a lifetime, it will have to use something like Arc<Mutex<Tasks>> instead of simply borrowing from its scope. This would be especially difficult in embedded async.

I see. I'm still a little confused on the reasoning for a 'static bound on being too restrictive. For request that only applies to the type being accessed, but it should still be possible to get a &mut T so long as T: 'static. I'm under the impression that this is unavoidable because you cannot downcast to types with lifetimes in them. I guess, could you go into more detail on a specific example that won't work with the Request proposed and show how your alternative fixes it?

Dec 01 '20 19:12 yaahc

As far as I'm aware, the 'static restriction only applies to TypeId and Any as a consequence of them only using runtime checks. Context has its own lifetime that can be used to ensure types live long enough at compile time (We can write symmetric signatures that check provide_context creates a value with the 'a lifetime, then promise the caller of context the same).

To demonstrate why arbitrary lifetimes would be useful, here's an example. I've excluded a lot of the code that would be involved in a real-world implementation, but I hope this gets the idea across. Playground link

This shows how an executor might build up a record of the events it needs to wait for (in this case, minimum_sleep). To do this, it needs to be able to modify its local state - that is, it needs a &'a mut _. It's certainly possible to use Arcs and locks to achieve this at runtime, but that would come with a performance penalty and platform restrictions.

As for "fixing" this... the trait I mentioned in https://github.com/rust-lang/rfcs/pull/2895#issuecomment-735713486 would allow us to the the TypeId of Self to uniquely identify the type constructor for Out and build types with lifetimes in them. I think it might be possible extend this and avoid needing to add the type annotations like so:

pub trait Identified {
    type Id: 'static;
}
impl<'a, T: ?Sized + 'static> Identified for &'a T {
    type Id = Ref<T>;
}
pub struct Ref<T: ?Sized>(core::marker::PhantomData<*const T>);

I'll try to flesh this out with updated signatures for provide_context and context

Dec 04 '20 20:12 Plecra

I'll try to flesh this out with updated signatures for provide_context and context

hype. Incase it helps, https://github.com/mystor/object-provider/pull/3 is the working copy of the changes in object-provider used for the new API, and is probably the best place to prototype your suggestions.

Dec 04 '20 21:12 yaahc

@Plecra I talked to @mystor about your suggestion and it turns out she had suggested essentially the same design back in august and I'd shot it down because it seemed overly complicated at the time, I was definitely mistaken.

Hows this version look to you? https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=a4c9aa1c309fede3ffaf0357ce6dc233

Dec 05 '20 04:12 yaahc

That's pretty much it, yea! When trying to create the example, I hit the small snag that Component is an unsafe trait. The implementor basically has to ensure transmute<OtherTypesWithThisIdType<'a>, Self<'a>>() is valid. It might be possible to use a derive macro for most cases, but there's also the problem of how this system interacts with variance which honestly goes over my head.

Written out like that, though, I'm surprised by how well it works 😁 I also like that it removes the distinction between value and reference types in the API.

Dec 05 '20 13:12 Plecra

That's pretty much it, yea! When trying to create the example, I hit the small snag that Component is an unsafe trait. The implementor basically has to ensure transmute<OtherTypesWithThisIdType<'a>, Self<'a>>() is valid. It might be possible to use a derive macro for most cases, but there's also the problem of how this system interacts with variance which honestly goes over my head.

Written out like that, though, I'm surprised by how well it works grin I also like that it removes the distinction between value and reference types in the API.

yea, nika originally had this version using an unsafe trait and a MBE macro to define the trait and add an assertion for invariance: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c42239eda4c3270c13397ad70ed4f71e. She went over it for a bit and was fairly confident that the unsafe trait and variance restrictions weren't necessary, but yea, the logic behind it goes way over my head too, for now... >:D

I'm fairly certain that the fact that it is written like this:

struct ResponseImpl<'a, C>
where
    C: for<'c> Component<'c>,
{
    value: Option<<C as Component<'a>>::Type>,
}

impl<'a, C> private::Response<'a> for ResponseImpl<'a, C> where C: for<'c> Component<'c> {}

Instead of like this:

struct ResponseImpl<'a, C>
where
    C: Component<'a>,
{
    value: Option<<C as Component<'a>>::Type>,
}

impl<'a, C> private::Response<'a> for ResponseImpl<'a, C> where C: Component<'a> {}

Is what guarantees that the transmute is valid, but I don't understand how yet.

Dec 05 '20 17:12 yaahc

I've put together a (super rough) simplified version of the core idea behind the proposal in https://github.com/rust-lang/rfcs/pull/2895#issuecomment-739123333 here: https://github.com/mystor/dyno.

The Request API which roughly maps to the existing object_provider crate is in the provider submodule, and is built on top of the core Tag and Tagged traits.

Dec 05 '20 20:12 mystor

Hiya! Seems like this RFC has stalled a little - what's blocking progress at the moment?

Apr 10 '21 11:04 Plecra

Just personal life events, had a surgery + vacation + just started a new job last week. I intend to update this RFC this week time permitting and I have the following changes on my todo list for that update before I'm ready to start pushing the libs team for review and approval.

[x] update downcast code to be based on dyno rather than object-provider
[ ] add an example for how the same mechanism from dyno can be used for executors
[ ] add an example use case based on traits used in rustc's compiler error reporting and formatting framework

For the second one there's a specific struct/trait I recall @anp linking to me a little while back that I felt was basically serving the same role as generic member access, though I can't recall exactly what it was atm, so the first step will be digging up the forgotten context.

Edit: Found the context for the compiler error reporting example:

https://docs.rs/codemap-diagnostic/0.1.1/codemap_diagnostic/struct.Emitter.html
https://docs.rs/codemap-diagnostic/0.1.1/codemap_diagnostic/struct.Diagnostic.html#fields

Specifically how the Diagnostic type is filling the same role as generic member access, where the fields in Diagnostic could instead be exposed through generic member access and then Emitter could just work with dyn Errors instead. I'm not sure this is desirable in this specific case because I feel like compiler errors don't end up needing the same sort of cross library compatibility that other error types do, but hopefully I can construct a useful example from it.

Apr 12 '21 19:04 yaahc

Do the parts regarding error reporting have any interaction with the proposed {:!} error formatting specifier? More generally, how does this RFC fit into the error handling blog post (https://github.com/rust-lang/blog.rust-lang.org/pull/799)? I assume this is one step on the "move Error into core" process?

Apr 12 '21 22:04 bstrie

Do the parts regarding error reporting have any interaction with the proposed {:!} error formatting specifier? More generally, how does this RFC fit into the error handling blog post (rust-lang/blog.rust-lang.org#799)? I assume this is one step on the "move Error into core" process?

Indirectly they could, depending on how we implement it. One idea for the reporting proposal is to have a fmting trait that is associated with {:!} and have a default impl for all types that implement Error, but then can be custom implemented for types that don't (e.g. anyhow::Error / eyre::Report). The default reporting would not exercise generic member access, and would only access the context needed for the common case, error messages, source, and maaaybe backtrace. The custom reporting types on the other hand would be able to hook in all of their application specific logic through their impl of that fmt trait and they could then look up any odd context that is relevant to their usecases.

Apr 12 '21 23:04 yaahc

This RFC feels very complex and I am a bit uncomfortable with adding so much API surface to std just for the Error trait.

I wonder if it is possible to offload some of the complexity of this API to crates.io, similar to how proc_macro only exposes a basic interface (just a token stream) while most of the functionality for building proc macros is located in the syn and quote crates which are not part of the standard library and can evolve independently of it.

Apr 13 '21 02:04 Amanieu

I wonder if it is possible to offload some of the complexity of this API to crates.io, similar to how proc_macro only exposes a basic interface (just a token stream) while most of the functionality for building proc macros is located in the syn and quote crates which are not part of the standard library and can evolve independently of it.

The new API is a fairly fundamental extension to the Any trait. The bulk of the API is to do with enabling dynamic downcasts including lifetimes.

There are some parts of the API that could be trimmed at the cost of ergonomics, but I think the new API should be thought of similarly to the Pin API. It's being added to enable an std API we want, and will likely be useful in implementing other crates in the ecosystem (more safe pinning/downcasts). We'd include the simpler provide_... methods for the same reason we have the Deref implementations on Pin and the debug_struct APIs (and on and on): It makes it easier for users to do the right thing

Apr 13 '21 10:04 Plecra

WRT "update RFC to be based on dyno design", can someone give a brief summary of the advantages of this new design over the old one?

Apr 13 '21 16:04 bstrie

WRT "update RFC to be based on dyno design", can someone give a brief summary of the advantages of this new design over the old one?

I think this reply summarizes it best but the main difference is dyno uses a proxy type that itself only has a phantom data, and so it is 'static and gets a TypeID. That Tag type has a trait with an associated type that is the actual type being erased, where as the old design used the type being erased's TypeID directly, requiring that it is 'static. The new design allows for closures containing mutable references to be requested as shown in this earlier comment, which could be used on a type erased executor with this same generic member access API to request a callback that then modifies state in the executor on the other side of the trait object.

Apr 14 '21 20:04 yaahc

rfcs rfcs copied to clipboard

RFC: Generic member access for dyn Error trait objects

rfcs
rfcs copied to clipboard