charon icon indicating copy to clipboard operation
charon copied to clipboard

Remove trait indirection when the impl is known

Open Nadrieril opened this issue 1 year ago • 12 comments

Charon handles trait impls as follows: all methods definitions become top-level functions, and the trait impl simply bundles them. E.g. in:

trait Default {
  fn default() -> Self;
}
struct Struct(u32);
impl Default for Struct {
  fn default() { Struct(42) }
}

the impl becomes something like:

fn Struct_default() -> Struct { Struct(42) }
impl Default for Struct {
  default = Struct_default;
}

Whenever we call Struct::default(), this is encoded as a reference to the impl Default for Struct bundle and then to the method in that bundle. This can make definitions more recursive than they need to.

This PR adds a micro-pass that calls the top-level function directly when possible. In our example, we'd replace Struct::default() with a call to Struct_default(). This of course only works when the type is known; for generic types we need to use the trait clauses in scope as usual.

This fixes the problem raised in #159, though not the general case of #159.

Nadrieril avatar May 29 '24 15:05 Nadrieril

I think a diff like this in Eurydice should unblock you:

diff --git a/lib/AstOfLlbc.ml b/lib/AstOfLlbc.ml
index 9c1e373..5347274 100644
--- a/lib/AstOfLlbc.ml
+++ b/lib/AstOfLlbc.ml
@@ -131,8 +131,8 @@ module RustNames = struct
     parse_pattern "core::ops::index::IndexMut<[@], core::ops::range::RangeFrom<usize>>::index_mut", Builtin.slice_subslice_from;

     (* arrays *)
-    parse_pattern "core::ops::index::Index<[@T; @N], core::ops::range::Range<usize>>::index", Builtin.array_to_subslice;
-    parse_pattern "core::ops::index::IndexMut<[@T; @N], core::ops::range::Range<usize>>::index_mut", Builtin.array_to_subslice;
+    parse_pattern "core::array::{core::ops::index::IndexMut<[@T; @N], @I>}::index<u8, core::ops::range::Range<usize>, @>", Builtin.array_to_subslice;
+    parse_pattern "core::array::{core::ops::index::IndexMut<[@T; @N], @I>}::index_mut<u8, core::ops::range::Range<usize>, @>", Builtin.array_to_subslice;
     parse_pattern "core::ops::index::Index<[@T; @N], core::ops::range::RangeTo<usize>>::index", Builtin.array_to_subslice_to;
     parse_pattern "core::ops::index::IndexMut<[@T; @N], core::ops::range::RangeTo<usize>>::index_mut", Builtin.array_to_subslice_to;
     parse_pattern "core::ops::index::Index<[@T; @N], core::ops::range::RangeFrom<usize>>::index", Builtin.array_to_subslice_from;

although other patterns in this list probably need to be updated as well -- do you know how to update those patterns? I don't know how to compute them myself apart from trying to trigger the error

msprotz avatar May 31 '24 18:05 msprotz

(at least it removed the failure from my end, although I do get another, seemingly unrelated error later on)

msprotz avatar May 31 '24 18:05 msprotz

So I went pretty far in trying to update eurydice to deal with this PR. But honestly, I'm pretty stuck so I think it'd be great if you could help me fix these patterns.

Here's the part that confuses me:

-    parse_pattern "SliceIndexShared<'_, @T>", Builtin.slice_index;
-    parse_pattern "SliceIndexMut<'_, @T>", Builtin.slice_index;
-    parse_pattern "core::ops::index::Index<[@T], core::ops::range::Range<usize>>::index", Builtin.slice_subslice;
-    parse_pattern "core::ops::index::IndexMut<[@T], core::ops::range::Range<usize>>::index_mut", Builtin.slice_subslice;
+    parse_pattern "core::ops::index::Index<[@], core::ops::range::Range<usize>>::index", Builtin.slice_index;
+    parse_pattern "core::ops::index::IndexMut<[@], core::ops::range::Range<usize>>::index_mut", Builtin.slice_index;
+    parse_pattern "core::slice::index::{core::ops::index::Index<[@T], @I>}::index<@, core::ops::range::Range<usize>>", Builtin.slice_subslice;
+    parse_pattern "core::slice::index::{core::ops::index::IndexMut<[@T], @I>}::index_mut<@, core::ops::range::Range<usize>>", Builtin.slice_subslice;

This is an attempt to deal with your changes. The branches I'm using are:

libcrux: 401a3477e9f18850b8b2c57eb497a43a16b322dc (on lucas/extract-intrinsics)
charon: 031ddee4ea11aa156f9ba3b8a4e0eab1f1265cde (on fix-203-and-more)
everest/karamel: 5666a5de9c14ca0cd3098befd3070ed34ed2cfcf (on protz_trait_methods)
eurydice: fedbf6b31a0796d3cef5ce84191fe982c21884f8 (on protz_trait_clauses2)

which is the most comprehensive Eurydice test we have right now.

What confuses me is that I don't understand whether I'm doing the right thing. In particular, the new pattern for slice_index seems to mention a Range, even though it should be an index? Why a range instead of an index? Am I misinterpreting this?

I'm also hitting a new pattern: core::ops::index::IndexMut<[u8; 32], core::ops::range::Range<usize>>::index_mut and I don't know what this corresponds to. Is this an array index? But then there's FunId (FAssumed (FArrayIndex ...)) so shouldn't that be the representation instead?

To reproduce, run ./c.sh in libcrux/libcrux-ml-kem, and to debug, cat c.sh, grab the eurydice invocation manually, and add --log Calls.

I'll leave this aside for now and hopefully let's sync up on Monday if you haven't managed to fix this already. Thanks so much!

(Also CC @W95Psp who might be able to help.)

msprotz avatar May 31 '24 19:05 msprotz

I think I'm starting to understand. It now seems like the patterns are no longer canonical and I might encounter either

core::ops::index::IndexMut<[u8; @], core::ops::range::RangeFrom<usize>>::index_mut

or

core::array::{core::ops::index::IndexMut<[@T; @N], @I>}::index_mut<@, core::ops::range::RangeFrom<usize>, @>

I'm not sure if this is intentional, but it looks like I need to duplicate every pattern in my list to account for both forms. Since I don't know whether this is an intentional effect, I'll stop here, and you should be able to fix this quickly by patching my branch and re-running c.sh with every pattern in both forms. Thanks!

msprotz avatar May 31 '24 19:05 msprotz

I think I'm starting to understand. It now seems like the patterns are no longer canonical and I might encounter either

core::ops::index::IndexMut<[u8; @], core::ops::range::RangeFrom<usize>>::index_mut

or

core::array::{core::ops::index::IndexMut<[@T; @N], @I>}::index_mut<@, core::ops::range::RangeFrom<usize>, @>

I'm not sure if this is intentional, but it looks like I need to duplicate every pattern in my list to account for both forms. Since I don't know whether this is an intentional effect, I'll stop here, and you should be able to fix this quickly by patching my branch and re-running c.sh with every pattern in both forms. Thanks!

It looks like the pass which removes trait indirections misses some cases. For the patterns you copy-pasted I see:

  • core::ops::index::IndexMut<[u8; @], core::ops::range::RangeFrom<usize>>::index_mut: (I think) this is a reference to the index_mut method of the instance of trait IndexMut for arrays (this references the trait instance)
  • core::array::{core::ops::index::IndexMut<[@T; @N], @I>}::index_mut<@, core::ops::range::RangeFrom<usize>, @>: this references the top-level function in the impl block which defines the instance of IndexMut for arrays

sonmarcho avatar May 31 '24 20:05 sonmarcho

Hmm, it would be nice if patterns for the impl could also match on the methods

Nadrieril avatar Jun 03 '24 10:06 Nadrieril

I think I get what you are saying. Now, depending on whether the trait instance has been statically resolved, or whether it's referring to a trait clause in scope, I may get different kinds of pattern (an impl or a decl, respectively).

Agree that it would be nice if this could be uniform, although probably not the end of the world if I have to write both.

My main difficulty is that right now I manually run my test suite, notice that there's an assertion failure, look at my debug print above that says "the pattern for this code is XXX", try to guess what XXX is, then adjust my list of patterns in the code, then repeat until it converges.

Is there a better way?

msprotz avatar Jun 03 '24 18:06 msprotz

Hmm, it would be nice if patterns for the impl could also match on the methods

There is actually a simple way of doing that: when matching a pattern with a method, you can also additionally match it with the trait reference (+ the method name).

There is something which annoys me here, though, which is that I don't understand why both patterns appear. If the simplification works, then we should only see the method, not a projection of the method belonging to a specific trait impl.

sonmarcho avatar Jun 04 '24 12:06 sonmarcho

I think I get what you are saying. Now, depending on whether the trait instance has been statically resolved, or whether it's referring to a trait clause in scope, I may get different kinds of pattern (an impl or a decl, respectively).

Agree that it would be nice if this could be uniform, although probably not the end of the world if I have to write both.

My main difficulty is that right now I manually run my test suite, notice that there's an assertion failure, look at my debug print above that says "the pattern for this code is XXX", try to guess what XXX is, then adjust my list of patterns in the code, then repeat until it converges.

Is there a better way?

I've been thinking about it. Maybe it would work if we could automatically generate the patterns (this is what Lucas does)? For instance, we could use attributes to annotate functions/terms?... Something along the lines (not sure if it is really possible):

fn f1<T, N>(x: &[T; N]) -> usize {
  @[eurydice(array_length)]
  x.len()
}

@W95Psp how do you do exactly?

sonmarcho avatar Jun 04 '24 12:06 sonmarcho

I'm not sure to understand exactly: by pattern here you mean something to select a rust name instantiated with a given list of types?

There are two mechanisms in hax:

  • Matching names In the engine of hax, if I need to match on a Rust name, I never do that by looking up the internal strings of a DefId. Instead, I have a special crate that mentions explicitely every name I need to match in the engine. This special crate extracts to (1) an OCaml enum and (2) a match function that checks wether a Rust DefId corresponds to a name (denoted by inhabitant of the OCaml enum). This mechanism is only for matching names, it doesn't help for matching names specialized with specific types.

  • Antiquotation That's an experimental thing I tried in some branch a while back, and it's not used in hax main. Basically, I had a mechanism for writting, in OCaml, something like match e with [@rust_pat? "something::<MATCH_T>(2 + MATCH_X)"] -> ocaml_expr, where matched_T: ty and matched_X: expr are introduced and available in the expression ocaml_expr.

I guess here what would be -nice is the antiquotation thing, right? I ended up having no time to implement it in a nice way for hax :( and that's probably overkill

W95Psp avatar Jun 04 '24 12:06 W95Psp

Alright, I understand what's up, I'm working on a fix

Nadrieril avatar Jun 05 '24 11:06 Nadrieril

This is proving to be a rabbit-hole of a fix. I'll get there eventually but I have to clarify our use of generics some more first.

Nadrieril avatar Jun 17 '24 13:06 Nadrieril

Alright, thanks to https://github.com/AeneasVerif/charon/pull/468 I was able to adapt the name matcher so that it keeps matching methods identically after we inline them.

Nadrieril avatar Nov 27 '24 13:11 Nadrieril

This is awesome, thanks!

sonmarcho avatar Nov 27 '24 14:11 sonmarcho