plutus icon indicating copy to clipboard operation
plutus copied to clipboard

Why DefaultUni is so complicated?

Open phadej opened this issue 2 years ago • 16 comments

Why cannot pair and list be represented as

DefaultUniList :: DefaultUni a -> DefaultUni [a]
DefaultUniPair :: DefaultUni a -> DefaultUni b -> DefaultUni (a, b)

Is DefaultUni (Esc a) used anywhere (where?) at any other kind than Type? In ValueOf it is at type Type, and above simplification would be massive.

EDIT: The problem is that I have no idea how to exhaustively match on DefaultUni. Indeed the plutus-core itself has (somewhat ugly):

    bring _ (f `DefaultUniApply` _ `DefaultUniApply` _ `DefaultUniApply` _) _ =
        noMoreTypeFunctions f

phadej avatar Jul 21 '22 18:07 phadej

cc @effectfully

michaelpj avatar Jul 22 '22 09:07 michaelpj

Why cannot pair and list be represented as

See this Note for how we arrived at the current representation.

Is DefaultUni (Esc a) used anywhere (where?) at any other kind than Type?

Yes, in the Plutus types. The Note has an abstract example.

EDIT: The problem is that I have no idea how to exhaustively match on DefaultUni. Indeed the plutus-core itself has (somewhat ugly):

Upvote this proposal :)

effectfully avatar Jul 22 '22 14:07 effectfully

Ok.

I still think the design is too clever for too little (I don't really believe in any non DefaultUni instantiations), but this is not my call.

phadej avatar Jul 22 '22 15:07 phadej

I don't really believe in any non DefaultUni instantiations

We were told that Plutus was going to be used on other blockchains. But even if that never happens, we still need DefaultUni as a separate entity. Consider a function like sum :: [Integer] -> Integer: in order to apply it, you need a way of extracting a [Integer] from the AST. You could have

data Constant
    = ConstantInteger Integer
    | ConstantBool Bool
    | ConstantList [Constant]

in the AST but then extracting a [Integer] from that is an O(n) operation in the length of the list, plus you practically allow heterogeneous lists, which is inappropriate for a reasonably typed language. You could do

data Constant
    = ConstantInteger Integer
    | ConstantBool Bool
    | ConstantListInteger [Integer]
    | ConstantListBool [Bool]

but that's just bad language design imo, especially when there are more types, both monomorphic and polymorphic. Like imagine having ConstantListPairDataData [(Data, Data)] (we use that currently) as one of the many constructors -- pure abomination.

So instead you'd do

data Constant
    = ConstantInteger Integer
    | ConstantBool Bool
    | forall a. ConstantList (DefaultUni (Esc a)) [a]

which is compact and allows for O(1) extraction (in the length of the list, not the depth of the type tag) of [Integer] or [Bool] or [[[Integer]]] or whatever.

So DefaultUni makes sense even without customizable universes.

And when it comes to your version of DefaultUni vs the current one, we've discussed that to death internally. We want polymorphic built-in functions and hence we need a way to apply a built-in type to a target language type variable. Allowing target language type variables to appear in the universe of built-in types is a nightmare to work with as explained in the Note that I referenced earlier, while our current solution only makes you worry about built-in types when you directly do something with them (the only exception is type normalization as far as I remember).

One thing we could do differently is separating the type-level universe from the term-level one. For the type level we care about partial applications and don't care about intrinsic typing, while for the term level it's the opposite and the current alignment just encompasses both the requirements in a single data type. We could separate the data type into two different ones, but we didn't try that and the current alignment works well anyway for us to bother duplicating the universe.

As for the noMoreTypeFunctions nonsense, we have two calls to it in the whole codebase. Plus it only exists, because pattern synonyms are not as good as they could be. Otherwise you would be able to directly and exhaustively match on DefaultUniList and DefaultUniPair without ever needing to reference DefaultUniApply when dealing with term-level stuff.

effectfully avatar Jul 22 '22 16:07 effectfully

data Constant
    = ConstantInteger Integer
    | ConstantBool Bool
    | ConstantList [Constant]

no. Use

data Constant where
    MkConstant :: IsConstantType a -> a -> Constant
    
 data IsConstantType a where
    IsUnit       :: IsConstantType ()
    IsInteger    :: IsConstantType Integer
    IsByteString :: IsConstantType ByteString
    IsText       :: IsConstantType Text
    IsBool       :: IsConstantType Bool
    IsData       :: IsConstantType Data
    IsList       :: IsConstantType a -> IsConstantType [a]
    IsPair       :: IsConstantType a -> IsConstantType b -> IsConstantType (a, b)

we need a way to apply a built-in type to a target language type variable

I'm not sure what you exactly mean by that, by AFAIU, it could be witnessed by IsConstantType a -> IsCosntantType (f a) value for built-in type (constructor) f. Note, f is not really polymorphic over all Haskell types, only IsConstantTypes, so it is not a restriction.

phadej avatar Jul 22 '22 20:07 phadej

I wasn't sure if you suggested that we could get rid of DefaultUni or merely hardcode it, so I commented on both. Your IsConstantType approach is of the latter kind and that is addressed by the Note and this part of me comment:

And when it comes to your version of DefaultUni vs the current one, we've discussed that to death internally. We want polymorphic built-in functions and hence we need a way to apply a built-in type to a target language type variable. Allowing target language type variables to appear in the universe of built-in types is a nightmare to work with as explained in the Note that I referenced earlier, while our current solution only makes you worry about built-in types when you directly do something with them (the only exception is type normalization as far as I remember).

One thing we could do differently is separating the type-level universe from the term-level one. For the type level we care about partial applications and don't care about intrinsic typing, while for the term level it's the opposite and the current alignment just encompasses both the requirements in a single data type. We could separate the data type into two different ones, but we didn't try that and the current alignment works well anyway for us to bother duplicating the universe.

So for Untyped Plutus Core your approach would work perfectly, but we have Typed Plutus Core where we need to be able to reference a built-in type in the types AST and we need to be able to apply a built-in type to an arbitrary Plutus type (not Haskell type) to give a type to a built-in Plutus function like

headPlc : all a. list a -> a

Note that here list is a built-in type that gets applied to a Plutus type variable. It's not possible to sneak a Plutus type variable into your IsConstantType, let alone an arbitrary Plutus type (supporting type variables alone would make it impossible to implement type substitution and so would be nonsensical).

We could have your IsConstantType at the term-level and mirror it at the type level in a way that allows us to express what we want there, but duplicating the universe is kinda unfortunate and we never tried it.

Also

data Constant
    = ConstantInteger Integer
    | ConstantBool Bool
    | forall a. ConstantList (DefaultUni (Esc a)) [a]

is better, because you don't want to waste space storing a type tag for Integer, Bool etc and you don't want to waste time checking those for equality (which due to the presence of recursive types is a recursive operation and so is non-trivial to inline and optimize). We've actually tried this representation, but it gave us negligible speedup, so we decided it wasn't worth it. We might try it again now that we have a way better alignment w.r.t. inlining.

Another option would be to have one universe for TPLC and one universe for UPLC (as opposed to one universe for the type level and one universe for the term level). However that would only make things more complicated, because we'd have to keep the current fancy universe for TPLC and also add another simpler one for UPLC.

tl;dr: it's Typed Plutus Core that is making things complicated. If we only had Untyped Plutus Core, we'd use a simpler universe similar to yours.

effectfully avatar Jul 23 '22 13:07 effectfully

but we have Typed Plutus Core

TBH, I don't care about typed plutus core. The code put on the blockchain is untyped, and I only care about types in representation of untyped one.

IMO, untyped plutus core should be in separate package to begin with, as typed plutus core as well as PIR are an implementation detail to get UPLC from Haskell, and if these IRs need something complicated, let that complexity be hidden in their implementation and not leak to UPLC.

phadej avatar Jul 23 '22 13:07 phadej

I've raised these questions internally, thank you for the input.

My personal view (does not reflect the view of the company) is that it's a matter of priority. I can spend my time solving plentiful tasks whose objectives are clear to me or I can spend my time duplicating universes and writing boilerplate that increases maintenance costs for some 3rd party to write fewer DefaultUniApplys. When a 3rd party creates a PR providing us with useful functionality (like e.g. this one), I'm happy to spend as much time there as needed to properly adopt the code. When it's just "hey, we want you to simplify those internals for us", I don't really care right now, if it's a non-trivial amount of effort and especially if it increases maintenance costs for the whole project.

effectfully avatar Jul 24 '22 03:07 effectfully

IMO, untyped plutus core should be in separate package to begin with, as typed plutus core as well as PIR are an implementation detail to get UPLC from Haskell, and if these IRs need something complicated, let that complexity be hidden in their implementation and not leak to UPLC.

I would like to move in the direction of separating out UPLC, but it's a large and painful refactoring that is never top priority.

michaelpj avatar Aug 08 '22 09:08 michaelpj

My previous comment was mostly about changing abstractions being rather expensive for us. Refactoring the module structure is not a big deal and I've been doing that routinely. However:

I would like to move in the direction of separating out UPLC, but it's a large and painful refactoring that is never top priority.

this increases development costs for us. We need all of the builtins machinery for TPLC, including evaluation bits (because we have the CK machine), so it has to stay there (or somewhere upstream). But we need to have access to, say, CekValue to have proper inlining of UPLC builtins evaluation bits and so tweaking builtins and then looking at how that affected the generated Core will require crossing package boundaries, which tends to be painfully slow (and was the specific reason why we ended up moving mkMachineParameters to plutus-core). Given that there are ideas on changing the builtins machinery in major ways, it's probably not the best time to increase the costs of doing that.

effectfully avatar Aug 08 '22 14:08 effectfully

I think TPLC would depend on UPLC, so most of the common stuff would exist in the UPLC package/component anyway.

michaelpj avatar Aug 08 '22 14:08 michaelpj

There's now a plausible proposal on how to fix the problem discussed in this thread (tracked in our (private) Jira as PLT-810), so I'm reopening this issue.

effectfully avatar Sep 13 '22 18:09 effectfully

This is being worked on. The action that is needed from the team is figuring out if we want to split the universe into a type-level one and a term-level one in order for DefaultUni to get simplified as proposed in the original message. Here's a branch with experiments.

effectfully avatar Jan 31 '23 22:01 effectfully

A month later, there was a bit of progress on this particular issue, but not much. We're working towards a better builtins machinery in general and the scope of this work is huge, so it may take a lot of time for us to get to actually fixing this issue. It's probably not a huge deal, since the issue is relatively low priority (not a bug, not something a regular user would have to workaround etc), just wanted to give you an update.

effectfully avatar Mar 05 '23 03:03 effectfully

Progress report: we've basically settled on separating the type-level universe from the term-level one and I've already started working on making it a reality, but I'm constantly getting distracted by other stuff and it'll take a while for this issue to get resolved.

effectfully avatar Jun 16 '23 14:06 effectfully

Very little tangible progress has been made on this one in the past several months, unfortunately.

effectfully avatar Nov 22 '23 11:11 effectfully