plutus icon indicating copy to clipboard operation
plutus copied to clipboard

PIR Generators

Open MaximilianAlgehed opened this issue 2 years ago • 78 comments

This PR introduces QuickCheck generators for PIR.

Pre-submit checklist:

  • Branch
    • [x] Tests are provided (if possible)
    • [ ] Commit sequence broadly makes sense
    • [ ] Key commits have useful messages
    • [ ] Relevant tickets are mentioned in commit messages
    • [x] Formatting, materialized Nix files, PNG optimization, etc. are updated
  • PR
    • [x] (For external contributions) Corresponding issue exists and is linked in the description
    • [x] Self-reviewed the diff
    • [x] Useful pull request description
    • [x] Reviewer requested

MaximilianAlgehed avatar Apr 27 '22 08:04 MaximilianAlgehed

What's the plan here? Are you planning to tidy this up more, or do you want a review right now?

michaelpj avatar Apr 27 '22 09:04 michaelpj

This needs a bit more tidying up before you need to look at it.

On Wed, 27 Apr 2022, 11:36 Michael Peyton Jones, @.***> wrote:

What's the plan here? Are you planning to tidy this up more, or do you want a review right now?

— Reply to this email directly, view it on GitHub https://github.com/input-output-hk/plutus/pull/4559#issuecomment-1110786794, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACBPPQIMT7RM4A2VTT7LUH3VHEDC7ANCNFSM5UOH3ILA . You are receiving this because you authored the thread.Message ID: @.***>

MaximilianAlgehed avatar Apr 27 '22 09:04 MaximilianAlgehed

@michaelpj this is probably ready for someone to look at now.

MaximilianAlgehed avatar May 10 '22 07:05 MaximilianAlgehed

(I still haven't fixed the commit history, but I'll do that when we are about to merge)

MaximilianAlgehed avatar May 10 '22 07:05 MaximilianAlgehed

I wonder if it's a good idea to resolve https://input-output.atlassian.net/browse/SCP-3909 before implementing something as complex as this?

zliu41 avatar May 10 '22 14:05 zliu41

I wonder if it's a good idea to resolve https://input-output.atlassian.net/browse/SCP-3909 before implementing something as complex as this?

I don't have permission to see that ticket. Whats it about?

Anyway, we needed these generators to test another thing and @michaelpj said he wanted them here if possible. But we can also wait with merging this if there are big changes coming up that we would have to respond to anyway?

MaximilianAlgehed avatar May 10 '22 14:05 MaximilianAlgehed

I don't have permission to see that ticket. Whats it about?

Ah sorry, here's the ticket

Review and propose policy for our use of quickcheck and hedgehog

We’re using both, somewhat inconsistently. I think this is occasionally due to us relying on some property of one or the other, but it’s unclear.

It would be good to review where we’re using them and try to come up with a policy (e.g. “Prefer using Hedgehog, except in circumstances X&Y where Quickcheck is appropriate”), which could go in the style guide.

But we shouldn't have to block this PR if this is something immediately useful.

zliu41 avatar May 10 '22 14:05 zliu41

Yeah, in this instance I think since we're getting a contribution from the Quickcheck experts I'm happy for it to use Quickcheck!

michaelpj avatar May 10 '22 14:05 michaelpj

I see, that makes sense and although we are indeed QuickCheck experts we don't have a dog in that fight. Any property based testing is good property based testing after all!

The thing I would say here is that unless there is no way to include manual shrinking in hedgehog (which I suppose there really ought to be) then the choice shouldn't, on the face of it, make a huge difference.

The point about manual shrinking is key though, because built-in shrinking is probably unlikely to produce good shrinking for this kind of generator and I expect one would have to be very careful about how the generator is structured to make it work. @UlfNorell knows more about integrated shrinking than I do though, maybe he has something to say on the topic?

MaximilianAlgehed avatar May 10 '22 14:05 MaximilianAlgehed

I don't think we really have a dog in the fight either, that's why we've ended up with a mix of both :sweat_smile:

michaelpj avatar May 10 '22 14:05 michaelpj

Haha, thats how all the best technical decisions are made! :)

On Tue, 10 May 2022, 16:49 Michael Peyton Jones, @.***> wrote:

I don't think we really have a dog in the fight either, that's why we've ended up with a mix of both 😅

— Reply to this email directly, view it on GitHub https://github.com/input-output-hk/plutus/pull/4559#issuecomment-1122498819, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACBPPQKQSOVVLO3Z5ABRH6DVJJZQTANCNFSM5UOH3ILA . You are receiving this because you authored the thread.Message ID: @.***>

MaximilianAlgehed avatar May 10 '22 14:05 MaximilianAlgehed

@effectfully and @michaelpj any progress on reviewing this?

MaximilianAlgehed avatar May 20 '22 07:05 MaximilianAlgehed

Sorry, I missed it. I'll review it the next week.

effectfully avatar May 20 '22 10:05 effectfully

@michaelpj I took some time to split things up. You might find it useful to look at the types bits now - inculding dependencies obviously :)

MaximilianAlgehed avatar Jun 01 '22 11:06 MaximilianAlgehed

@effectfully same for you (see above).

MaximilianAlgehed avatar Jun 01 '22 11:06 MaximilianAlgehed

I've looked into the data escaping issue and here are my conclusions (it's probably more info than you need, I'm leaving it here for future reference):

  1. the renamer is irrelevant, because at the renaming stage we don't have the type of the body of a let -- it's inferred later during type checking
  2. we do not have any special renaming in the type checker (I dropped the last occurrence of rename from the type checker a couple of days ago). I.e. we only do the general "if the type of a variable or a built-in function is returned, then rename it to keep names globally unique". There's no way we could avoid doing that
  3. we do not rename a data type when it escapes its scope. That would be a bug, because at that point the data type is represented by a free type variable and we don't rename free variables
  4. the issue you're facing is not related to the process of type checking, it's only about how the term is prepared before type checking starts. We rename the term before feeding it to the type checker. It's a pass that potentially changes all the names in the given program. All the data types are renamed, including those that escape their scopes. There's no way we could avoid doing that, given that our type checking is based on the assumption that the names are globally unique

This is how the function that you use is defined:

inferType
    :: ( AsTypeError e (Term TyName Name uni fun ()) uni fun ann, ToKind uni, HasUniApply uni, AsTypeErrorExt e uni ann, MonadError e m, MonadQuote m
       , GEq uni, Ix fun
       )
    => PirTCConfig uni fun -> Term TyName Name uni fun ann -> m (Normalized (Type TyName uni ()))
inferType config = rename >=> runTypeCheckM config . inferTypeM

First rename, then do the actual work.

I've exposed the internal PlutusIR.TypeCheck.Internal.inferTypeM in this PR (not merged yet due to CI being slow), so that you can plug your code right between the two sides of >=>.

I however understand that it may be not as easy as it sounds. It's absolutely trivial to break global uniqueness, so passing around regular Terms and expecting them to be globally unique just because you renamed something two hundreds lines above is not going to fly. For this reason the same PR adds the PlutusCore.Rename.Renamed wrapper, which allows you to pass renamed things around without accidentally breaking the premise of global uniqueness. The only way to construct a Renamed thing is via PlutusCore.Rename.getRenamed, which performs renaming. You can unwrap from Renamed whenever you feel like it, it's fine to access the underlying value, it's only impossible to construct one manually without going through getRenamed (well, not impossible, the constructor is exposed from PlutusCore.Rename.Internal, but it's best not to use it).

All this means that you'll have to either change a lot of things to receive a Renamed A instead of A (which is probably not too hard) and not add any new names after the initial renaming is done, because that would require another round of renaming changing all the existing names and invalidating the environment that you have, which defeats the whole purpose of the solution.

Or stick to the current unrenaming hack.

I would much prefer the former if you could pull it off, but I'm not sure how feasible that is (might be easy, might be nearly impossible, it's hard for me to tell).

I'm afraid I don't have a better solution for you.

Hope the reasoning is clear, but do feel free to ask for clarifications.

effectfully avatar Jun 01 '22 13:06 effectfully

@effectfully doesn't this solution presuppose we change the generators to only generate terms with globally unique names?

MaximilianAlgehed avatar Jun 02 '22 06:06 MaximilianAlgehed

@effectfully doesn't this solution presuppose we change the generators to only generate terms with globally unique names?

That would of course be completely infeasible and absolutely not worth the effort. The thing is, I don't know yet how generation is structured (investigating what's going on with data escaping was the major task this week and I had a few minor ones and I'm done working this week), so I'll hopefully be able to suggest something the next week. Sorry that my review gets postponed again.

Basically, it depends on when you compute the environments. If you compute them after generating terms, then you can generate a term, rename it and compute the environment from the result. But if you need to compute the environments and to generate terms in some interleaving fashion, then my suggestion is probably not going to work. But even in that case I think my findings may still be useful: in a renamed term all variables are by definition unique, hence you can zip the renamed term with its original version to compute the unrenaming in a general way. I think that would be a bit less hacky compared to looking at the data types.

It's something that we could implement ourselves and keep it as a library function, I think that would be handy for error messages, so that the variable in the error message and the original variable in the parsed program are the same. For Plutus Core it doesn't matter, because it's indecipherable anyway, but Plutus IR programs can actually be read, so might be worth doing for us. I'll talk with Michael about that.

effectfully avatar Jun 02 '22 09:06 effectfully

Yeah, this seems quite suspicious. Is it even correct as it stands? Probably we want the alpha-equivalence versions.

@michaelpj This is "correct enough" for the use cases we have - checking that we don't have any one-step shrink loops. It's possible we could contain these instances to the GeneratorSpec.Terms module? The tricky thing is that the code to check for one-step shrink loops otherwise becomes a bit burdensome to write - you have to implement the whole naive equality check yourself...

MaximilianAlgehed avatar Jun 09 '22 10:06 MaximilianAlgehed

I think we lost a bunch of comments from before in the move. For the first batch of merging I guess we're also going to need the GenTm monad, so I'm particularly keen to work out whether we can simplify that with something like MonadGen.

I tihnk a lot of those comments are in the PR history - just not attached to the particular code any more. If they haven't been resolved up there we can at least match them by looking at the context.

MaximilianAlgehed avatar Jun 09 '22 10:06 MaximilianAlgehed

I'm generally a fan of exposing everything unless there is some really good reason not to. I've already made the change, but re-working it shouldn't be that hard. Best regards Maximilian Algehed

On Thu, 9 Jun 2022 at 13:19, effectfully @.***> wrote:

@.**** commented on this pull request.

In plutus-core/testlib/PlutusCore/Generators/PIR.hs https://github.com/input-output-hk/plutus/pull/4559#discussion_r893380064 :

  •                     | (kb', b) <- shrinkKindAndType (Map.insert x ka ctx) (kb, b)]
    
  •  where KindArrow _ _ kb = k
    
  • TyForall _ x ka b -> [ (k, b) | not $ Set.member x (fvType b) ] ++
  •                     -- (above) If the bound variable doesn't matter we get rid of the binding
    
  •                     [ (k, TyForall () x ka' $ substClosedType x (minimalType ka) b)
    
  •                     | ka' <- shrink ka] ++
    
  •                     -- (above) we can always just shrink the bound variable to a smaller kind
    
  •                     -- and ignore it
    
  •                     [ (k, TyForall () x ka b)
    
  •                     | (_, b) <- shrinkKindAndType (Map.insert x ka ctx) (Star, b)]
    
  •                     -- (above) or we shrink the body
    
  • TyBuiltin{} -> []
  • TyIFix{} -> error "shrinkKindAndType: TyIFix"

+-- CODE REVIEW: does this exist anywhere? +inferKind :: Map TyName (Kind ()) -> Type TyName DefaultUni () -> Maybe (Kind ())

Sure... Just have to expose that module in the cabal file...

I did it for you a few days ago. Don't use right now, though, as I've simplified the interface a little, but the PR is not merged yet. I'll update your PR once mine is merged.

Also do feel free to expose modules in the cabal file. Maybe we should go ahead and just expose all of them.

— Reply to this email directly, view it on GitHub https://github.com/input-output-hk/plutus/pull/4559#discussion_r893380064, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACBPPQJ7U3XBHHKWL6NI3SLVOHHLNANCNFSM5UOH3ILA . You are receiving this because you authored the thread.Message ID: @.***>

MaximilianAlgehed avatar Jun 09 '22 11:06 MaximilianAlgehed

Here's my PR to your PR. It only removes a couple of redundant calls to runQuoteT. Just trying to figure out what the workflow for tweaking your PR should be for me.

effectfully avatar Jun 14 '22 23:06 effectfully

@MaximilianAlgehed I've improved handling of builtins and pushed to your branch. Please review the commit.

effectfully avatar Jun 16 '22 13:06 effectfully

That looks kind of similar to what's in the NEAT generators, or the stuff that Ziyang did, I wonder if we can share them?

michaelpj avatar Jun 16 '22 13:06 michaelpj

That looks kind of similar to what's in the NEAT generators, or the stuff that Ziyang did, I wonder if we can share them?

I think it doesn't make sense to use NEAT for generation of Data, especially within QuickCheck.

Ziyang implemented generation for hedgehog. We might be able to use hedgehog's generators within QuickCheck, but that seems to be worth a separate investigation.

effectfully avatar Jun 16 '22 14:06 effectfully

I think we lost a bunch of comments from before in the move. For the first batch of merging I guess we're also going to need the GenTm monad, so I'm particularly keen to work out whether we can simplify that with something like MonadGen.

Since I have some familiarity with that thing, I'm going to try simplifying things with MonadGen.

effectfully avatar Jun 16 '22 15:06 effectfully

Since I have some familiarity with that thing, I'm going to try simplifying things with MonadGen.

Done. @MaximilianAlgehed please review this commit.

effectfully avatar Jun 22 '22 00:06 effectfully

Let me have a look at this in the start of next week :)

On Wed, 22 Jun 2022, 15:46 effectfully, @.***> wrote:

@.**** commented on this pull request.

In plutus-core/testlib/PlutusCore/Generators/PIR.hs https://github.com/input-output-hk/plutus/pull/4559#discussion_r903765454 :

+onSize f = local $ \ env -> env { geSize = f (geSize env) }

+-- | Default to the first generator if the size is zero (or negative), +-- use the second generator otherwise. +ifSizeZero :: GenTm a -> GenTm a -> GenTm a +ifSizeZero ifZ nonZ = do

  • n <- asks geSize
  • if n <= 0 then ifZ else nonZ

+-- | Locally set the size in a generator +withSize :: Int -> GenTm a -> GenTm a +withSize = onSize . const + +-- | Split the size between two generators in the ratio specified by +-- the first two arguments. +sizeSplit_ :: Int -> Int -> GenTm a -> GenTm b -> (a -> b -> c) -> GenTm c

Resolving this conversation, but @MaximilianAlgehed https://github.com/MaximilianAlgehed do feel free to revert the commit if you don't like it.

— Reply to this email directly, view it on GitHub https://github.com/input-output-hk/plutus/pull/4559#discussion_r903765454, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACBPPQOXGIODPB6ICV6XZ3TVQMKKRANCNFSM5UOH3ILA . You are receiving this because you were mentioned.Message ID: @.***>

MaximilianAlgehed avatar Jun 22 '22 15:06 MaximilianAlgehed

Let me have a look at this in the start of next week :)

Sure, no rush.

Please also review this commit that fixes handling of TyIFix in the type generator.

effectfully avatar Jun 23 '22 16:06 effectfully

I've run the tests ~500 times debugging an issue (more on that later) and got this once:

Failure report
    prop_genTypeCorrect:                                FAIL (0.40s)
      *** Failed! Falsified (after 483 tests):
      ty,tm = ( all (a7 :: * -> *). (all a23. unit) -> (all (a20 :: * -> * -> *). unit)
      , let data d8 | m2 where
              c9
              : (unit -> integer -> (\a8 -> bool) integer) -> (all (a2 :: * -> *). a2 (all a2. integer)) -> (all a1. all a7. (\a19 -> a7) integer) -> (\a17 -> integer) (integer -> (\a10 -> bool) bool) -> (all (a6 :: * -> *). all a5. unit -> unit) -> (bool -> (\a8 -> bool) (bool -> bool)) -> (bool -> integer -> (\a1 -> a1) unit) -> (\a18 -> integer) (unit -> (all a8. unit)) -> d8
              c13
              : (all a8. (\a8 -> unit) (all a9. unit -> a8 -> (all a4. unit -> bool -> integer))) -> ((integer -> unit) -> bool -> (\a8 -> a8) ((\a7 -> integer) ((\a8 -> integer) (bool -> integer -> integer -> integer)))) -> (all a8. (bool -> integer) -> unit) -> d8
              c1 : d8
        in
        let data d16 | m18 where
              c5
              : (integer -> bool -> d8 -> (all a5. (\a22 -> d8) d8)) -> (all a2. all a9. d8) -> (bool -> (all a5. all a20. integer)) -> (\a12 -> d8) (all a9. (\(a13 :: * -> *) -> a9) ((\a10 -> \a9 -> bool) (bool -> bool))) -> (d8 -> (all a14. (\a10 -> a14) (unit -> d8))) -> (bool -> (all (a23 :: * -> *). a23 (a23 (integer -> d8)))) -> (integer -> unit -> unit) -> (all a11. all a26. (\a32 -> a11) (all a21. integer -> d8)) -> d16
              c24
              : ((all a4. bool) -> (bool -> bool) -> integer -> (\a17 -> unit) ((\a25 -> unit) (bool -> (all a25. all a2. integer -> bool)))) -> (unit -> (unit -> d8) -> unit) -> d16
        in
        let ~f3 : (integer -> d16) -> d16
              = letrec !f3 : (\a17 -> unit) d16
                         = m2
                             c1 {unit} (\(x23 : unit -> integer -> bool) (x10
                                         : all (a2 :: * -> *). a2 (all a2. integer))
                                         (x33 : all a1. all a7. a7) (x11 : integer)
                                         (x35
                                         : all (a6 :: * -> *). all a5. unit -> unit)
                                         (x14 : bool -> bool) (x20
                                         : bool -> integer -> unit) (x38 : integer) ->
                                          ()) (\(x22 : all a8. unit) (x28
                                                : (integer -> unit) -> bool -> integer)
                                                (x33
                                                : all a8. (bool -> integer) -> unit) ->
                                                 ()) f3
                       ~f31 : (\a26 -> d16) d16 = f31
                in \(x23 : integer -> d16) ->
                     c24
                       (\(x31 : all a4. bool) (x28 : bool -> bool) (x7 : integer) ->
                          ()) (\(x21 : unit) (x12 : unit -> d8) -> ())
        in (\(x10 : all (a6 :: * -> *). bool) ->
              /\a7 :: * -> * ->
                let ~f29
                      : bool -> (all (a32 :: * -> *). all a10. integer -> a32 (a32 unit))
                      = letrec ~f7 : d16 = f7
                               ~f30 : d16 = f30
                        in (\(x14 : d16) ->
                              let ~f12 : integer = 0
                                  !f39 : d16 = f30
                                  ~f41 : d16 = x14
                              in
                              let !f26 : unit = ()
                              in error {bool -> (all (a32 :: * -> *). all a10. integer -> a32 (a32 unit))})
                             ((\(x6 : bool) ->
                                 c24
                                   (\(x19 : all a4. bool) (x33 : bool -> bool) (x43
                                     : integer) ->
                                      ()) (\(x23 : unit) (x1 : unit -> d8) -> ()))
                                (x10 {list}))
                in \(x2 : all a23. unit) ->
                     /\a6 :: * -> * -> * ->
                       let ~f4 : d16 -> bool -> a7 (all a18. integer)
                             = error {d16 -> bool -> a7 (all a18. integer)}
                       in
                       let ~f38 : all a23. bool = /\a20 -> error {bool} in x2 {unit})
             ((\(x29 : d16 -> d8) ->
                 let ~f33 : bool -> d8
                       = let !f21 : d8 = c1
                             !f7 : unit = ()
                             !f33 : unit = ()
                         in \(x26 : bool) ->
                              c13
                                (/\a8 -> ()) (\(x29 : integer -> unit) (x23 : bool) ->
                                                0) (/\a8 ->
                                                      \(x28 : bool -> integer) -> ())
                     !f19 : all a14. d8 = /\a28 -> let ~f23 : d8 = c1 in c1
                     !f6 : (\a40 -> unit) d8
                       = let ~f14 : d8 = c1
                             !f23 : d8 = c1
                         in ()
                 in /\a6 :: * -> * -> True)
                (\(x15 : d16) -> c1)) )
      Use --quickcheck-replay=7154 to reproduce.
      Use -p '/prop_genTypeCorrect/' to rerun this test only.

This is in the 74c4b46b1bf4b7ab7b5f7ba548254ec8c07f756d commit, i.e. without any of my changes.

Seems like a bug somewhere in the generation machinery?

BTW, this *** Failed! Falsified with no additional info makes for a horrible error message. Could you tweak the machinery not to throw away error messages? Very hard to debug stuff when all you get is "something's gone wrong somewhere" (I spent half an hour debugging a different problem and trying to figure out what was failing and where just to realize it was a catch-all clause in the type unifier).

And while we're here, why is there no instance like

instance Testable (Either String ()) where
    property = property . \case
        Left err -> failed { reason = err }
        Right () -> succeeded

in QuickCheck? It's like the most important one, those Bools and Maybes are not helpful at all.

effectfully avatar Jun 23 '22 17:06 effectfully