
[Epic] PIR generators

Open effectfully opened this issue 1 year ago • 6 comments

This is an umbrella issue for all things related to PIR generators.

effectfully avatar Oct 02 '24 13:10 effectfully

  • [ ] Remove unrenaming from the PIR generators

QuviQ had trouble using our PIR type checker with their PIR term generator: they had to introduce unrenaming of escaped data type variables. We couldn't figure out a way of avoiding that, so we now have to maintain that disgusting unrenaming in there. Can we do any better? We've had quite a lot of discussion on the Plutus mailing list about the topic (thread for those who have access), but nothing really came out of it. Perhaps we're stuck with this nonsense forever.

  • [ ] Optimize fresh name generation in the PIR generators

Currently generation of a single fresh name requires processing literally every single bit of the current context, which is absurdly inefficient. This operation should be O(1), not O(size_of_the_context).

Originally I thought we could just change the code to use Quote, but some experimentation showed that this isn’t gonna cut it:

  • names are generated all over the place and put into the current context, both inside and outside of the GenTm monad. It's really hard to move all of that to run in Quote, simply because it's trivial to miss a call here and there
  • the current fresh name generation is pretty fancy and doesn't just always pick the next fresh name. It can pick the next 10th fresh name, then go back, then go forward again etc. – all of that between different calls to the function generating fresh names. We want to keep this logic as it allows us to produce more diverse terms

This ticket is for sorting everything out. How beneficial is it to make generation of fresh names properly fast? Is it worth the trouble of having another field in GenEnv and keeping it in sync with the other relevant parts of the context (of which we have plenty)? I assume the latter is the only solution allowing us to replicate the existing logic while keeping the cost low.
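
To make the cost concrete, here is a minimal, self-contained sketch of what an O(1) scheme could look like. All names here (FreshCounter, freshName, freshNameAt) are hypothetical illustrations, not the actual GenEnv/GenTm API: the idea is a dedicated counter threaded through generation, with a jump-ahead helper gesturing at the "pick the next 10th fresh name" style of diversity.

```haskell
-- Hypothetical stand-in for a fresh-name counter field in GenEnv.
-- The real environment holds much more; this only illustrates the cost model.
newtype FreshCounter = FreshCounter Int
  deriving Show

-- O(1): read the counter and bump it; no traversal of the context.
freshName :: FreshCounter -> (String, FreshCounter)
freshName (FreshCounter n) = ("x_" ++ show n, FreshCounter (n + 1))

-- O(1) jump-ahead: hands out the name k steps ahead and moves the counter
-- past it, so every name produced is still fresh.
freshNameAt :: Int -> FreshCounter -> (String, FreshCounter)
freshNameAt k (FreshCounter n) =
  ("x_" ++ show (n + k), FreshCounter (n + k + 1))
```

Both operations are constant-time arithmetic on the counter; keeping that counter in sync with the rest of the context is exactly the bookkeeping cost this ticket asks about.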

  • [ ] Speed up shrinking

Shrinking is sometimes slow because a large number of shrinks are generated. See this comment.

  • [ ] Figure out and document the shrinking order on types in the PIR type generator

Currently the shrinking order on types isn't specified anywhere in the PIR generators, making it extremely hard to:

  1. verify that the shrinker does the right thing and can’t end up being stuck in a loop
  2. change literally anything in the shrinker (because it’s hard to tell what implicit invariant a change might break)

We need to fix that by figuring out the shrinking order, documenting it, and ensuring that the existing code complies with it.
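
As a sketch of what "figuring out the shrinking order" could mean concretely, here is a toy, self-contained example; the Ty datatype and the functions below are invented for illustration, not the real PIR type language. The idea: pick an explicit measure, require every proposed shrink to strictly decrease it, and well-foundedness of < on the naturals then guarantees the shrinker cannot get stuck in a loop.

```haskell
-- Toy type AST; the real PIR type language is much richer.
data Ty = TyUnit | TyFun Ty Ty | TyPair Ty Ty
  deriving (Eq, Show)

-- The measure that defines the shrinking order.
sizeTy :: Ty -> Int
sizeTy TyUnit       = 1
sizeTy (TyFun a b)  = 1 + sizeTy a + sizeTy b
sizeTy (TyPair a b) = 1 + sizeTy a + sizeTy b

-- A shrinker that only proposes immediate subterms.
shrinkTy :: Ty -> [Ty]
shrinkTy TyUnit       = []
shrinkTy (TyFun a b)  = [a, b]
shrinkTy (TyPair a b) = [a, b]

-- The invariant to document and enforce: every candidate is strictly
-- smaller under the measure, so repeated shrinking terminates.
strictlyDecreasing :: Ty -> Bool
strictlyDecreasing t = all (\t' -> sizeTy t' < sizeTy t) (shrinkTy t)
```

Once such a measure is written down, `strictlyDecreasing` is itself a property that can be tested against the real shrinker, which would address both points above.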

  • [ ] Investigate if making fixKind return a list of types is feasible and worth it

Currently fixKind returns only one type, meaning the shrinker always chooses, say, a single particular type variable when fixing the kind of another type variable. It would be nice if it could look at all suitable type variables. This ticket is for investigating whether it's hard to arrange that and whether it's beneficial to do so.
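
A sketch of the proposed signature change, using toy kinds and invented names rather than the actual fixKind: instead of committing to one candidate up front, return every type variable of the wanted kind and let the shrinker enumerate them.

```haskell
import Data.Maybe (listToMaybe)

-- Toy kinds; stand-ins for the real PIR kinds.
data Kind = Star | KArr Kind Kind
  deriving (Eq, Show)

-- Proposed shape: all type variables in the context with the wanted kind.
fixKindAll :: Kind -> [(String, Kind)] -> [String]
fixKindAll k ctx = [v | (v, k') <- ctx, k' == k]

-- The current single-result behaviour is recoverable as the first candidate,
-- so the change would be backwards-compatible in spirit.
fixKindOne :: Kind -> [(String, Kind)] -> Maybe String
fixKindOne k = listToMaybe . fixKindAll k
```

The investigation would then be about whether the real fixKind's logic survives this generalization and whether the extra shrink candidates pay for themselves.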

  • [ ] PIR generators produce operationally trivial terms

I wanted to add a very simple property test using the PIR generators, but was surprised at how slow the test turned out to be. I've done some experiments and it seems like compiling a PIR program (using toUPlc) is 100x less efficient than generating that program with the PIR generators. And this is given that the PIR generators are known to be asymptotically slow (and I'm sure the constants in the complexity function aren't particularly small either), although I'd say that the PIR generators are reasonably fast and aren't worth optimizing currently, given that they produce 10k terms in 4 seconds on my machine.

But that's not the best part. The best part is that compilation with all the default optimizations erases 90% to 99% of the generated term. I.e. we spend all this time generating a term, then we sweat compiling it, only to produce a minuscule and in most cases completely trivial term.

The end result is that, at present, the PIR generators are basically useless if you want to compile the produced term and do something with the result; it's just a waste of resources by any measure I can think of.
