Performance debug and optimize `replace_type_try_transport`
Of the time spent in the new rewrite rule reification, 82.7% is spent in replace_type_try_transport (236.006s max for a single call), which is basically "match context[@type.try_transport ?base_type ?try_make_transport_base_type_cps ?p ?t ?t], replace with p t" in a loop.
I have asked about speeding this up on Zulip, but at this point I'm inclined to run another timing diff after https://github.com/mit-plv/rewriter/pull/142 and then eat the ~2--3 minute slowdown.
(I'm also out of debugging time, @andres-erbsen if you want to debug, consider applying the patch
diff --git a/src/Rewriter/Passes/NBE.v b/src/Rewriter/Passes/NBE.v
index 429a6125f..784248943 100644
--- a/src/Rewriter/Passes/NBE.v
+++ b/src/Rewriter/Passes/NBE.v
@@ -16,8 +16,14 @@ Module Compilers.
Module Import RewriteRules.
Section __.
- Definition VerifiedRewriterNBE : VerifiedRewriter_with_args true false true nbe_rewrite_rules_proofs.
+ Set Ltac Profiling.
+ Set Printing Depth 1000000.
+ Ltac2 Set Rewriter.Language.PreCommon.Pre.reify_profile_cbn := Init.true.
+ Ltac2 Set Reify.should_debug_profile := fun () => Init.true.
+ Definition VerifiedRewriterNBE : VerifiedRewriter_with_args false false true nbe_rewrite_rules_proofs.
Proof using All. make_rewriter. Defined.
+ Redirect "profile" Show Ltac Profile.
+ Redirect "profile0" Show Ltac Profile CutOff 0.
Definition default_opts := Eval hnf in @default_opts VerifiedRewriterNBE.
Let optsT := Eval hnf in optsT VerifiedRewriterNBE.
diff --git a/src/Rewriter/Rules.v b/src/Rewriter/Rules.v
index 08d572064..1c4d81e04 100644
--- a/src/Rewriter/Rules.v
+++ b/src/Rewriter/Rules.v
@@ -51,7 +51,7 @@ Definition nbe_rewrite_rulesT : list (bool * Prop)
myflatten
[mymap
dont_do_again
- [(forall A B x y, @fst A B (x, y) = x)
+ [(*(forall A B x y, @fst A B (x, y) = x)
; (forall A B x y, @snd A B (x, y) = y)
; (forall P t f, @Thunked.bool_rect P t f true = t tt)
; (forall P t f, @Thunked.bool_rect P t f false = f tt)
@@ -183,7 +183,7 @@ Definition nbe_rewrite_rulesT : list (bool * Prop)
end)
('n)
xs)
- ; typeof! @unfold1_nat_rect_fbb_b
+ ; *)typeof! @unfold1_nat_rect_fbb_b
; typeof! @unfold1_nat_rect_fbb_b_b
; typeof! @unfold1_list_rect_fbb_b
; typeof! @unfold1_list_rect_fbb_b_b
You might also want to throw a Set Printing Implicit. or Set Printing All.. If you do want to debug this, let me know if there's any other information that would be useful to expose debugging-wise. And Ltac2 Set Reify.should_debug_fine_grained := fun () => Init.true. should give access to the argument to and return from replace_type_try_transport, though it'll give a bunch of other stuff as well.)
Originally posted by @JasonGross in https://github.com/mit-plv/fiat-crypto/issues/1778#issuecomment-1868442943