chalk icon indicating copy to clipboard operation
chalk copied to clipboard

experiment with converting semantic to syntactic equality

Open jackh726 opened this issue 4 years ago • 4 comments

This issue has been assigned to @Areredify via this comment.

jackh726 avatar Mar 27 '20 20:03 jackh726

OK, it took me some time to find, but I finally found a link to the meeting where we discussed this topic in some depth. It was the meeting on 2019.10.16. So, to start, you can read those notes to get a high-level idea of what's being proposed.

I'll paste in the key idea here.

How it works today

Today, given an impl like this

impl<T: Iterator> Foo for T::Item { .. }

we would generate a clause like

forall<T: Iterator> {
    Implemented(<T as Iterator>::Item: Foo)
}

Then, when we do unification, we look for types like <T as Iterator>::Item, which represent unnormalized associated types -- we call those aliases. In unification, if we are ever asked to unify an alias A and some other type T, it always succeeds, and we generate a goal AliasEq(A = T):

https://github.com/rust-lang/chalk/blob/2bd2e039c95368634ba577a379b74be3f900a68d/chalk-solve/src/infer/unify.rs#L221-L237

So what happens is that we dynamically unfold the rule above. i.e., if we are matching against a goal like Implemented(u32: Foo), then we will match successfully and generate a subgoal AliasEq(<T as Iterator>::Item = u32) which has to be proven.

Why is this ungreat?

There are a few problems with this setup. First, as a general rule, I'd like the "solving logic" phase to be as simple as possible -- and our current unification scheme, which doesn't just unify types but also injects goals, makes things more dynamic and less amenable to (say) compilation in the future. In other words, I want the solving phase to be very close to prolog, and right now it's not.

But second, I want to create the invariant that whenever we ask for "all the impls of a trait", we have some idea what the "self type" is. This is helpful to things like rust-analyzer and rustc, since they can index impls by self-type and retrieve them more efficiently. But when we are solving a goal like <T as Iterator>::Item: Foo, they have no idea what that self-type is. Under the scheme I am contemplating, we would first be trying to normalize <T as Iterator>::Item first.

There's a third reason, but I'm not sure how much it applies given the work on the recursive solver. Still, one of the things I was exploring was a different model for region solving, and for that to work, I had to be able to separate out cases where I wanted to first regions that were syntactically equal (so 'a = 'a and nothing else) from semantically equal. In particular, it's possible to have 'a and 'b where 'a: 'b and 'b: 'a. Those two regions are semantically equivalent but syntactically distinct, and our current setup of unification -- which is based on semantic equivalence -- can't handle that.

How I would like it to work

I mentioned that today we unfold the goals dynamically as we unify things. There are two parts to the work here. The first part is that for an impl like the previous example:

impl<T: Iterator> Foo for T::Item { .. }

we would generate a clause like:

forall<T: Iterator, U> {
    Implemented(U: Foo) :- AliasEq(<T as Iterator>::Item = U)
}

The second part is that we would have to unfold goals as well. So something like

Implemented(<T as Iterator>::Item: Foo)

would become

exists<X> {
    AliasEq(<T as Iterator>::Item = X),
    Implemented(X: Foo)
}

In fact, I wrote up the transformation in some detail in a branch. I'll copy and paste that comment below for posterity.

nikomatsakis avatar Apr 15 '20 10:04 nikomatsakis

Here is the comment from my branch, just in case that file gets lost.

The "syntactic equality" lowering converts Rust semantic type equality into syntactic type equality.

Syntactic vs semantic equality

So, what is the difference between syntactic vs semantic equality? Glad you asked!

  • Semantic type equality means that the types are equivalent for Rust programmers.
  • Syntactic type equality means that the types are written using the same tokens. Syntactic equality is a subset of semantic of equality -- that is, if two types are syntactically equal, then they must be semantically equal. But it is also possible for types that are written differently to be equivalent. Examples of the some of the ways that semantic equality can differ from syntactic equality:
  • Lifetime relationships: &'a T and &'b T are equivalent if 'a: 'b and 'b: 'a
  • Higher-ranked types: for<'a, 'b> fn(&'a u32, &'b u32) and for<'a, 'b> fn(&'b u32, &'a u32) are equivalent
  • Associated type normalization: <vec::IntoIter<u32> as Iterator>::Item and u32 are equivalent

The transformation this module performs

This module takes program clauses and goals and converts them so they require only syntactic equality. It does this by introducing sub-goals for cases where semantic equality may diverge.

Example: transforming a program clause with repeated lifetimes

Consider the following trait/impl:

trait SomeTrait<T> { }
impl<'a> SomeTrait<&'a u32> for &'a u32 { }

The equivalent program clause for this impl would be:

forall<'a> {
    Implemented(&'a u32: SomeTrait<&'a u32>)
}

This program clause, however, assumes semantic equality -- that is, it uses 'a in two places, but it isn't really necessary to have the same regions in both places. It suffices to have any two regions which outlive one another. We can convert this program clause into a syntactic clause by introducing a fresh region 'b, along with a Equal subgoals:

forall<'a, 'b> {
    Implemented(&'a u32: SomeTrait<&'b u32>) :-
        Equal('a = 'b)
}

The Equal goal is built-in to the system and implements semantic equality. Equal('a = 'b) is provable if Outlives('a: 'b), Outlives('b: 'a)` is provable.

Example: transforming a program clause with associated types

Consider the following trait/impl:

trait SomeTrait<T> { }
impl<T: Iterator> SomeTrait<T::Item> for T { }

The equivalent program clause for this impl would be:

forall<T> {
    Implemented(T: SomeTrait<<T as Iterator>::Item>) :-
        Implemented(T: Iterator)
}

Again, this assumes semantic equality, as it contains <T as Iterator>::Item, but we expect this rule to apply to any type U where T::Item normalizes to U. We can express this same concept using only syntactic equality like so:

forall<T, U> {
    Implemented(T: SomeTrait<U>) :-
        Implemented(T: Iterator),
        Equal(<T as Iterator>::Item = U)
}

The pattern should look familiar: we just introduce a variable and add a new Equal subgoal to capture semantic equality. Equal on a projection type will ultimately lead to a ProjectionEq subgoal.

Example: transforming a goal with associated types

Continuing the previous example, imagine that we have a goal that we would like to prove, and that goal is expressed with semantic equality. For example, perhaps we are type-checking a call foo::<X> to this function:

fn foo<T>()
where
    T: Iterator,
    T: SomeTrait<T::Item>
{ }

In that case, we have to prove Implemented(X: Iterator) but also:

Implemented(X: SomeTrait<<X as Iterator>::Item>)

This goal assumes semantic equality. We can transform it goal into syntatic equality like so:

exists<U> {
    Implemented(X: SomeTrait<U>),
    Equal(<X as Iterator>::Item = U),
}

Generalizing: the rules to convert from syntactic to semantic equality

The general rules for converting from syntactic to semantic equality are as follows. First, we identify a set of parameters that require semantic equality. For each, we'll also have an associated set of goal(s) that express semantic equality. Let's call them "SemEq" parameters:

  • All lifetimes
  • Function types, owing to their binders
  • Alias types The rule to convert a program clause can then be expressed like so:
  • For a program clause forall<X0..Xx> { DG(P0..Pn) :- Goal0, .., GoalN }:
    • Replace each SemEq parameter Pi with a new variable Y
    • Add a subgoal Equate(Pi, Y) Therefore, if all of the parameters P0..Pn were SemEq parameters, the result would be:
forall<X0..Xx, Y0..Yn> {
    DG(Y0..Yn) :-
        Goal0, .., GoalN,
        Equal(Y0 = P0), ..., Equal(Yn = Pn)
}

The rule SemEq(G) to convert a goal G can be expressed like so:

  • For a non-domain goal, recursively convert the subgoals
    • e.g., G1, G2 becomes SemEq(G1), SemEq(G2)
  • For a domain goal DG(P0..Pn) where some of the parameters P are SemEq parameters:
    • Replace each SemEq parameter Pi with a new variable Y
    • Replace with the goal exists<Y...> { DG(..), Equate(Pi.. = Yi..) } Therefore, if all of the parameters P0..Pn were SemEq parameters, the result would be:
exists<Y0..Yn> {
    DG(Y0..Yn),
    Equal(Y0 = P0), ..., Equal(Yn = Pn)
}

nikomatsakis avatar Apr 15 '20 10:04 nikomatsakis

A few things have changed since I pursued my older syntactic-equality branch:

  • We converted to a system where we count binders and not bound variables. This simplifies some of the transformations, I believe, particularly around goals.
  • We introduced callbacks for folding goals and clauses into the folder. At least in some variants of the work I did, we didn't have that, which made "lowering" the goals/clauses from "semantic equality" to "syntactic equality" very annoying.

I know I had another branch where I actually did complete the "semantic to syntactic" transformation, but only for clauses, and it didn't quite work. I might try to go find it, but perhaps the syntactic-equality branch is enough.

nikomatsakis avatar Apr 15 '20 10:04 nikomatsakis

@rustbot claim

basil-cow avatar Apr 15 '20 13:04 basil-cow