Maybe swap lifetime location for `*Kind` enums
rust-marker uses kind enums to represent different node types, like items. These currently hold references with the 'ast lifetime inside them. This might not be ideal for drivers like RA, which may change and remove nodes after a linting pass. It might be better to remove the lifetime. One option for this could be to store the struct directly in the enum and only pass enum references to the API. This still ensures that users can only access immutable references while removing the 'ast lifetime. However, this will affect the size of *Kind enums, as they will have the size of the largest variant and not just a reference.
This is an idea worth exploring before v1.0.0
See
-
ItemKindenum: https://github.com/rust-marker/marker/blob/2a11448d69d8b4a44ccf479ed6c1cedf7abe0143/marker_api/src/ast/item.rs#L70-L85 - simplified example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=226921c32d1c412338efe99a41efa78b
I'm thinking if we could just officially store the MarkerContext in a private thread local and get rid of 'ast lifetimes threaded through basically all the code. Any method that needs the MarkerContext would get it via with_cx. As I understand there are already quite a lot of methods that do this.
I think we may still guarantee safety for the users if we make the thread local a RefCell<Option>, which becomes None once linting finishes, and any further attempts to access it would result in a panic.
This could also remove the need for passing MarkerContext everywhere.
Do you propose to make the thread local accessible from lint crates as well?
I think the <'ast> lifetime would still be required in a lot of places, basically everywhere where Marker stores a reference, or might store them in the future. I'm considering to cache the Spans from the API at one point, so every node inside ast would need a lifetime for that anyways. We could look, if there are some places, where we can remove it though
My main point is that we already have quite a lot of methods that use with_cx internally to obtain a reference to MarkerContext. What if we say that every method that requires a MarkerContext could remove that parameter from its API and instead use with_cx?
Removing the proliferation of 'ast would be cool and simplify the user experience a lot. I'm not sure which blockers there are in particular for this if there are any. You mentioned that you'd like to use the MarkerContext for caching, but I don't think having with_cx() used everywhere would be a problem for this. Maybe anything else?
I'm thinking now of the API that doesn't involve users to even have MarkerContext in public. All methods of MarkerContext could be made into free functions that use with_cx internally.
For example React JS does this. It has a bunch of implicit global state and provides developers with free functions like useState and an implicit global virtual DOM.
Hmmm, one question would be, how we reconstruct the 'ast lifetime. We could transmute all references to 'static and just never dealloc the memory, before the lint crates are unloaded. But I'm not sure, that this is the best idea. Expanding the lifetime to 'static also prevents us from making any lifetime and allocation changes in the future. The other option would be some reference counting or boxes. :thinking:
I guess this would be the biggest blocker :thinking:
how we reconstruct the 'ast lifetime
What do you mean by this? I don't understand what problem this is trying to solve.
Could you point out how getting rid of the 'ast lifetime may lead to some unsoundness?
Anything, stored in a static item, can only use the 'static lifetime. In with_cx the first parameter is used to reconstruct the 'ast lifetime. On Marker's side, we can be sure that everything lives less than 'ast. But if a user gains access, they could pass in a &[] and retrieve a &'static MarkerContext<'static>. Calling cx.ast().expr() would then give them ExprKind<'static>. Which would be unsound, unless we ensure that all data lives until the end of linting.
My understanding is, that we're also limited, when it comes to the usage of smart pointers. For one, they don't have a stable ABI, but we could work around that. But lint crates are compiled separately from the driver. Therefore, it could happen, that they use different allocator. AFAIK, deallocating memory with a different allocator than the allocator that created the pointer, is unsound. So we're restricted to either passing &'lifetime pointers.
Today the static stores a RefCell<&'static MarkerContext>. I wonder if it could store the MarkerContext by value?
The problem is the 'static stored in the generic parameters. RefCell<&'static MarkerContext> is equivalent to RefCell<&'static MarkerContext<'static>> AFAIK. Even if we store MarkerContext by value, it would mean that each access returns a &'static MarkerContext. Or am I wrong?
Yep. Anyway, I think this matter requires experimentation rather than discussions